Using Data Analytics for Audit
Three Time Saving Ideas for Auditors
by Mike Blakely
The need for more efficient and effective analytical procedures has become increasingly apparent. This due to factors such as the economy, perceived lack of experienced auditors, etc.
The focus this month is on three techniques for ways perform analytical procedures more efficiently:
- Where readily feasible, produce charts automatically to help
highlight patterns in the data being tested. This is illustrated in
a brief discussion of the Madoff Ponzi scheme.
- Use tools that can work with the data natively. To illustrate, a
free tool for use with Excel data is discussed, including the 19 or
so procedures built in.
- Reduce or eliminate needless work. A little known technique called "sequential sampling" practically eliminates the need for computing sample size and virtually assures that over-sampling will not occur. Yet the results produced form a statistically valid basis for conclusions.
- The Madoff Ponzi Scheme
- A quick way to perform various tests on data in Excel work sheets
- Sequential random sampling
The Madoff Ponzi Scheme
The Ponzi scheme perpetrated by Bernard L. Madoff Investment Securities LLC reported stock trades and investments for years, but the scheme went undetected for a long time. One of the questions almost everyone is asking is "Couldn't this scheme have been detected earlier?". The answer is a possible yes. Since the investment firm was supposedly trading securities, then the investment returns should have followed Benford's Law.
Benford's Law is a mathematical algorithm often used for detecting made up numbers. In a nutshell, Benford's Law looks at the first digit of a reported amount, then tallies up the count of all these first digits in a population so that the actual tally can be compared against what a theoretical tally should be.
Try this exercise which uses Benford's Law. Start by looking at the annual returns for the S&P 500 over the last fifty years or so. This can be done by taking the data gathered at Indiana University Northwest and running a test using Benford's Law for large stock returns. For each year look at the reported return. For example, if the return was 7.24%, then the first digit is 7, so a tally would be made for the digit 7. Similarly, if the return were 12.54%, the first digit would be one. With Benford's Law, the first digit "1" would be expected to occur more than "2", which would occur more than the digit "3", etc.
If the results were tabulated and charted, you could see a comparison of the actual counts obtained versus what would be expected as shown below.

The chart shows that there appears to be a good correlation between actual and expected. This tends to confirm that stock market returns will conform with Benford's Law. Statistically, this is also confirmed by the low Chi Square value of 1.6607, which indicates there is a strong correlation between the two sets of data points.
You will also get similar results if you look at the returns for small stocks for the same period:

The correlation is not as good, but the statistical measure (Chi Square) is still relatively low.
But if you were to then look at the returns for the Madoff investment funds you would very likely see something much different.

It's clear that there is no correlation and the high Chi Square value confirms this. This could be a red flag that the reported investment returns were not what would be expected.
There is more information about Benford's law.
A quick way to perform various tests on data in Excel
Performing common audit tasks and analyzing data containing on Excel worksheets can now be as easy as 1,2 3.... This is done by 1) opening an Excel worksheet, 2) selecting a range and copying the data to the clipboard and pasting the data into the Audit Commander and 3) analyzing it with the Audit Commander. The results can be then either be viewed or pasted back into Excel!
The Audit Commander Excel Sheet analyzer program is completely free. To illustrate its possible use, consider an Excel data sheet shown below and containing fixed asset record data (highlighted area is to be analyzed):

With Excel open, also open a blank Audit Commander form, as shown below:

Then select the area in the worksheet to test (i.e. columns, ranges or specific cells) and copy it to the windows clipboard using either 1) the Excel menu item "edit|copy", 2) the keyboard shortcut "Control-C" or 3) right mouse click and select "Copy". Then click inside the Audit Commander text box form and "paste" the data either by clicking the "paste" button" or else using the keyboard shortcut "Control-V". At this point the form will display the Excel data which has been "pasted" into the form (not all of which can be seen).
The tests which can be performed on excel worksheets are summarized below. More information on each of the tests is available as shown in the sidebar links. (It started as a dozen, then grew)
1. Statistics - obtain basic statistics for the
numeric values selected. Uses include sample planning, analytic review,
reasonableness testing, etc.
2. Round Numbers - summarize the number and type of
round numbers - generally used to to check for estimates or unusual
amounts among purchases orders, invoices, etc.
3. Trend Line - determine a trend line using "best fit"
for a series of data points. Used for checking trends over time, unusual
data points, etc.
4. Stratification - classify numeric amounts into
bucket ranges. Generally used for reasonableness testing, sample
planning, etc.
5. Linear regression - test the relationship between to
variables. Used to identify unusual amounts in situations where there
should be a relationship between two amounts, e.g. sales and cost of
goods sold.
6. Weekday classification - tallies the counts of each
date in the range selected by day of the week. used for checking for
unusual situations or simply to see where transactions fall.
7. Holiday checking - checks for dates in the
transaction range falling on federal holidays. used to spot potentially
unusual transactions.
8. Benford's Law - used to test amounts using Benford's
law. Generally associated with testing for "made up" numbers.
9. Duplicates - a test for any duplicate values can be
performed. If any are found, a report shows the values and counts for
all such items identified.
10. Summary - numeric values can be summarized by a
specified value. It is not necessary to "pre-sort" the data. The summary
report includes not only sub-totals, but counts, minimum and maximum
values, average, etc.
11. TopN - for numeric values, a report can be prepared
of the largest 5, 10, 20 etc. items. The auditor specifies the number of
items to be obtained. It is not necessary to "pre-sort" the data.
12. BottomN - for numeric values, a report can be
prepared of the smallest 5, 10, 20 etc. items. The auditor specifies the
number of items to be obtained. It is not necessary to "pre-sort" the
data.
13. Ageing - Classify amounts based upon a date column. Amounts are placed into "buckets" of 30 days.
14. Tiny Difference - identify values whose difference from that of a specified value is "tiny". Used in searches for anomalies.
15. Fuzzy Matching - sometimes you're trying to match a value using "wild cards". This technique uses the "regular expression" function built into Excel.
16. Day of Week - allows extraction of data based upon a date column falling on one or more of the specified days of the week.
17. Time of Day - allows for data extraction of transactions falling within specified hours of the day. Useful in looking for anomalies.
18. Pareto Analysis - Uses the classic "80/20" rule to classify and check data. Useful in isolating the data where the focus should be.
19. Gaps - Checking for missing document (or other) numbers which should be complete and in numerical order. Useful for fraud investigations.
More information is available...
Sequential Sampling - Banana Aphids and other beasts
Has this ever happened to you? You need to perform a random sample to perform an audit test. You spend time trying to determine the optimal sample size. Various software packages make different determinations as to the required sample size. But you go ahead and pull the sample size recommended, only to find out at the end that the sample size was too large. Although you accomplished your audit objective, you and your staff may have spent too much time.
There is another approach which you might consider - it's called sequential sampling. From a statistical perspective it's no different than what you've been doing - you're going to get the same sampling results which will provide a valid basis for statistical estimates. But the difference in the approach is that you don't bother with calculating required sample sizes. And the approach practically guarantees that you won't over sample.
In a nutshell, here's how the approach works. You take a small arbitrary size sample using auditor judgment. This sample size might be 15, 20, 30 etc. You then evaluate the sample results which will likely indicate that your sample size is too small to achieve your sampling objectives. You then select another small sample whose size is determined by auditor judgment. You repeat the process until you achieve your sampling objective, i.e. desired precision and confidence.
There's been several articles written on the subject. This approach lends itself to both attribute and variable sampling, unrestricted and stratified. There's a brief article on the subject with some graphics provided by entomologists at the University of Hawaii who were investigating banana aphid infestations in Hawaii (banana aphids are the bane of the banana crop, one of the main agricultural crops of Hawaii).
Compliance testing often relies on attribute sampling when a test is to be based upon a random sample. If the entire population is expected to be consistent, then a sample will be drawn from the entire population, i.e. an unrestricted sample. This is in contrast to situations where there is some variation expected to occur, in which case a stratified sample may be drawn.
If the supporting documents for data being audited are contained in a central location, e.g. no travel or other logistics are involved, then stop and go sampling may be a more efficient and effective method for random sampling for the following reasons:
- There is no need to compute a required sample size,
- There is no need to perform a preliminary analysis of the population attributes such as expected error rate, and
- There is little or no risk in "over sampling", i.e. testing more samples than required and therefore spending excess audit time doing the testing.
Stop and Go sampling is a statistically valid process which involves the following steps:
- Assign a random number to each item in the population (e.g. using "Mersenne Twister" or other statistically valid random number generator)
- Sort the population by assigned random number, either ascending or descending
- Select the first 10 - 20 items (auditor judgment as to number), test them and put the results into an Excel spreadsheet.
- Run a "stop and go" sample report and review the results (see example below)
- If the resulting sample precision is too large, then select another group of transactions by sorted assigned random number (auditor judgment as to number)
- Test the samples and record the results in the same Excel spreadsheet.
- Run another "stop and go" sample an review the results.
- Repeat steps 5 through 7 until satisfactory results have been obtained.
The report from the Stop and Go Sample will show the intermediate results, sample statistics as well as calculate the estimate of the population at four confidence levels - 80%, 90%, 95% and 98%. The results will also be charted for easy review. The charts show the upper and lower bounds, as well as the point estimate for each calculation.
The procedure for determining the results from attribute sampling consists of these steps:
- Specify the data source (Excel, Access, text file)
- Specify the name of the workbook or file
- Select the work sheet or table which contains the data
- Specify the names of the columns containing the sample indicator and the sample results
- Select the "Attribute Sample - Unrestricted" command from the drop down list
- Click the "OK" button
- Review the report under the "Report" tab
Output is in three formats:
- Electronic work paper (see example)
- Chart (below)
- Text report
An example of the chart output is shown below (attribute test for signature on documents as tested in 25 samples):

There is more information available on sequential sampling.
The opinions, beliefs and viewpoints expressed by the various authors and forum participants on this web site do not necessarily reflect the opinions, beliefs and viewpoints of AuditNet®
