Check out our paper in Scientific Reports published on the 6th November. Click to Open Access, so no fear of breaking copyright by reading it ! What a surprise. We or rather a grant paid for it. What we argue here is that if you examine a system of objects which follows some sort of skewed distribution that can be approximated by a power law, then if this system is incomplete in some way, then there is no way a power law can ever fit. For example, if you have a system of cities that excludes some that are clearly part of the system in terms of its evolution, a power law cannot result. We show that first by taking an income distribution which is generated from a pure power law, then break it into two and exclude the largest objects. Those left follow an entirely different distribution from the complete distribution. It is no longer scaling and no longer rank size in the Zipfian sense.
This means that if you have a system of cities that excludes some, a power law is also an incomplete representation of it. Pretty obvious really but surprising how little this issue has been discussed. A good example is the city size distribution in the United States which excludes Canadian cities and in this case, we get something near a power law but not quite. Now to correct for this phenomenon, we show how you can sample from the distribution to base an approximation on a subset of values which mirrors the true distribution. Of course to do this, you have to assume the distribution is a power law and we never know whether or not this is the case. Catch 22 as ever. Nevertheless, our argument is completely general and it casts doubt on much of the work on fitting power laws to rank size and related distributions that has been carried out over the last 50 or more years. Food for thought at least. The idea that the system is incomplete in some way is much more general than power laws per se as it suggests that we need to pay attention to the system and figure out in advance how complete it is likely to be, or rather how complete the data we have is. We don’t really do this too much in our field. We should. Read On.