There is a striking mathematical pattern known as “Benford’s Law”. The story starts in the year 1085, when William the Conqueror ordered a survey to find out how many people lived in his land, what they owned, what taxes they paid, and how much they earned. The results were recorded in the Domesday Book. The conclusion drawn from this “Final Tally” was that numbers beginning with one appear most frequently, then those beginning with two, and so on. The same pattern has since been confirmed in many different kinds of data.
This pattern was first noticed by a Canadian-born American astronomer named Simon Newcomb. In 1881, Newcomb published a brief note in the American Journal of Mathematics describing it. He said that he reached this conclusion by observing fraying books of logarithm tables: the early pages, which cover numbers beginning with 1, were far more worn than the later ones. His frequencies for the leading digit are as follows: 1. 30.1%, 2. 17.6%, 3. 12.5%, 4. 9.7%, 5. 7.9%, 6. 6.7%, 7. 5.8%, 8. 5.1%, 9. 4.6%. You can see here that the frequency decreases as the digit gets higher. Zero is absent because the leading digit of a number is, by definition, never zero. The data show that a leading one is about six and a half times as likely as a leading nine. Newcomb did not rigorously prove why this happens but rather wrote about it with curiosity. The pattern drew wider attention in 1938, when Frank Benford, a physicist at the General Electric Company in New York, rediscovered it. Benford stated that he was unaware of Newcomb’s paper.
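The percentages listed above agree with the usual statement of Benford’s Law, P(d) = log10(1 + 1/d) for a leading digit d. The text does not spell this formula out, so the following is only a minimal Python sketch for checking the numbers; the function name `first_digit_probability` is illustrative.

```python
import math

def first_digit_probability(d: int) -> float:
    """Benford probability that the leading digit equals d (1 through 9)."""
    if not 1 <= d <= 9:
        raise ValueError("a leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Print the table of leading-digit frequencies as percentages.
for d in range(1, 10):
    print(f"{d}: {first_digit_probability(d):.1%}")

# Compare the most and least common leading digits.
ratio = first_digit_probability(1) / first_digit_probability(9)
print(f"A leading 1 is about {ratio:.1f} times as likely as a leading 9")
```

The nine probabilities sum to 1, which is another way of seeing that zero has no place in a table of leading digits.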
Unlike Newcomb, Benford did not stop at fraying books of logarithm tables but looked into everything he could get his hands on. His sources included US city population tables, street addresses, atomic weights of elements, tables of the areas of rivers, and many more, which he compared with the expected distribution. Benford said that this phenomenon must be evidence of a universal law, and named it the Law of Anomalous Numbers. This name did not stick; the pattern came to be called “Benford’s Law” instead. In the book, the author describes a financial investigator named Darrell D. Dorrell, who uses Benford’s Law to spot corrupt data and uncover financial fraud. (See the Ben Affleck film The Accountant for more on this.)
Many people have applied Benford’s Law in various areas. Scott de Marchi and James T. Hamilton of Duke University used it to examine reported levels of lead and nitric acid emissions. Walter Mebane, a political scientist at the University of Michigan, used Benford’s Law as evidence for his suggestion that the Iranian presidential election of 2009 was rigged. Many scientists also use Benford’s Law as a diagnostic tool, for example with earthquake measurements.
The law also covers digits after the first, where zero can appear. For the second digit of a number, the frequencies are as follows: 0. 12%, 1. 11.4%, 2. 10.9%, 3. 10.4%, 4. 10%, 5. 9.7%, 6. 9.3%, 7. 9%, 8. 8.8%, 9. 8.5%. There is a simple way to see why the first-digit pattern arises. If you count from 1 to 20, you will see it. More than half of the numbers start with a 1, since 1 and 10 to 19 all do. If we continue counting upwards, we will always have passed at least as many numbers that begin with a 1 as begin with a 2. Before we reach the twenties, the two hundreds, or the two thousands, we must first have counted through the tens, the hundreds, or the thousands. (The same argument applies to every following digit.)
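The counting argument can be checked directly. The sketch below is only an illustration, with helper names (`leading_digit`, `leading_digit_counts`) chosen for this example: it tallies leading digits from 1 to 20 and then confirms that, while counting upwards, numbers starting with 1 never fall behind numbers starting with 2.

```python
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

def leading_digit_counts(limit: int) -> Counter:
    """Count how many of the integers 1..limit start with each digit."""
    return Counter(leading_digit(n) for n in range(1, limit + 1))

counts = leading_digit_counts(20)
print(counts[1], "of the numbers from 1 to 20 start with 1")

# Running tally: at every step of the count, compare how many numbers
# so far began with 1 against how many began with 2.
ones = twos = 0
always_ahead = True
for n in range(1, 10_000):
    d = leading_digit(n)
    ones += d == 1
    twos += d == 2
    always_ahead = always_ahead and ones >= twos
print("1 never falls behind 2:", always_ahead)
```

Counting integers this way only shows why low digits lead the tally; it does not by itself reproduce the exact percentages of Benford’s Law.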
A very important property of Benford’s Law is that the pattern of leading digits is independent of the units of measure. If a financial data set follows Benford’s Law in US dollars, then it will also do so in another currency, such as pounds sterling. This property is called scale invariance. A simple way to look at the scale invariance of Benford’s Law is to see how numbers behave when doubled. If a number begins with a 1, its double will start with either a 2 or a 3. If a number begins with a 2, its double will start with either a 4 or a 5, and so on, until we reach 5. From there on, every number that begins with 5 through 9 has a double that starts with a 1. Here is a data table to illustrate scale invariance:
| First digit of x | First digit of 2x | Benford’s percentage (%) |
| --- | --- | --- |
| 1 | 2 or 3 | 30.1 |
| 2 | 4 or 5 | 17.6 |
| 3 | 6 or 7 | 12.5 |
| 4 | 8 or 9 | 9.7 |
| 5 | 1 | 7.9 |
| 6 | 1 | 6.7 |
| 7 | 1 | 5.8 |
| 8 | 1 | 5.1 |
| 9 | 1 | 4.6 |
If you add up the percentages for the digits 5-9 (7.9, 6.7, 5.8, 5.1, 4.6), you get 30.1, which is exactly Benford’s percentage for a leading 1 in the data table. This is what scale invariance requires: any number that begins with 5-9 will begin with 1 once it is doubled, so after doubling, the share of numbers that start with 1 is still 30.1%. If we multiplied by any other number instead of 2, such as pi, the data would still follow Benford’s Law.
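The claim about doubling and about multiplying by pi can be tried numerically. The following is a rough sketch under one stated assumption: the sample values are built as 10 raised to evenly spaced exponents between 0 and 1, which is a standard way to obtain numbers whose leading digits follow Benford’s Law. The names `first_digit`, `digit_shares`, and `benford_sample` are illustrative.

```python
import math
from collections import Counter

def first_digit(x: float) -> int:
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])

def digit_shares(values) -> dict:
    """Share of values that start with each digit 1..9."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Assumption: 10**u with u evenly spread over [0, 1) gives a Benford-like sample.
N = 100_000
benford_sample = [10 ** ((i + 0.5) / N) for i in range(N)]

# Rescale the whole sample and tabulate the leading digits each time.
for factor in (1, 2, math.pi):
    shares = digit_shares(factor * x for x in benford_sample)
    row = " ".join(f"{shares[d]:.1%}" for d in range(1, 10))
    print(f"times {factor:.3f}: {row}")
```

Each printed row should be close to the percentages in the table, whichever factor is used.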
For many decades, Benford’s Law was regarded as a gimmick for magic shows or a quirk of data. This all changed in the 1990s, when Ted Hill, a professor at Georgia Tech, set out to find a theoretical explanation of exactly why and how Benford’s Law works. Dr. Hill made a game out of his findings. In the game, Hill picks a number, and someone else picks a number (1-9). Then both numbers are multiplied together. If the product starts with 1, 2, or 3, Ted wins. If it starts with 4-9, Ted loses. The game seems to be weighted in favor of Ted’s opponent, who wins on six leading digits against Ted’s three. But according to Benford’s Law, this is not true. The product will start with 1 about 30.1% of the time, with 2 about 17.6% of the time, and with 3 about 12.5% of the time. These percentages add up to 60.2%, so Ted wins more often than he loses. This game illustrates the fundamental principle of Benford’s Law: the distribution of leading digits is not uniform, but follows a predictable pattern that is independent of the scale or units of measurement. This property of scale invariance underlies the wide applicability of Benford’s Law in fields such as finance, economics, and data analysis.
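The odds in Hill’s game follow from adding up the leading-digit probabilities. The short sketch below assumes that the leading digit of the product follows Benford’s Law, as the text states, and uses the formula log10(1 + 1/d) for each digit; the names `benford`, `ted_wins`, and `opponent_wins` are illustrative.

```python
import math

def benford(d: int) -> float:
    """Benford probability of leading digit d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Ted wins when the product starts with 1, 2, or 3.
ted_wins = sum(benford(d) for d in (1, 2, 3))
# The opponent wins on the remaining six digits, 4 through 9.
opponent_wins = sum(benford(d) for d in range(4, 10))

print(f"Ted wins: {ted_wins:.1%}")
print(f"Opponent wins: {opponent_wins:.1%}")
```

Although the opponent holds twice as many digits, those six digits together account for less than 40% of the outcomes under this assumption.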