On 01/06/2019 14:52, Morris de Oryx wrote:
[...]
For example, imagine an address table with 100M US street addresses with two-character state abbreviations. So, say there are around 60 values in there (the USPS is the mail system for a variety of US territories, possessions and friends in the Pacific). Okay, so what's the best index type for state abbreviation? For the sake of argument, assume a normal distribution, so something like FM (Federated States of Micronesia) is on a tail end and CA or NY are a whole lot more common.

[...]

I'd expect the distribution of values to be closer to a power law than to a normal distribution; at the very least, a few states would account for the bulk of the lookups. But this is my gut feel, not based on any scientific analysis!
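
To give a feel for what a power law would imply here, a rough sketch follows. It assumes a Zipf distribution with exponent 1 over 60 state codes; the exponent, the count of 60, and the function names are illustrative assumptions, not anything measured from real address data.

```python
# Illustrative only: share of rows per state code under an assumed
# Zipf (power-law) distribution, where the k-th most common code has
# weight 1 / k**s. Nothing here is fitted to real address data.

def zipf_shares(n, s=1.0):
    """Return the fraction of rows for ranks 1..n, most common first."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def top_share(shares, k):
    """Fraction of all rows covered by the k most common values."""
    return sum(shares[:k])

if __name__ == "__main__":
    shares = zipf_shares(60)  # ~60 state/territory codes, per the example
    for k in (1, 5, 10):
        print(f"top {k:2d} codes cover {top_share(shares, k):.1%} of rows")
    print(f"rarest code covers {shares[-1]:.2%} of rows")
```

Under these assumptions a handful of codes cover roughly half the table while the rarest ones each match well under one percent of rows, which is the kind of skew that matters when deciding whether an index on the column is selective enough to be used.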


Cheers,
Gavin


