On 29/12/15 10:23, Dimitry Sibiryakov wrote: > 24.12.2015 18:29, Alex Peshkoff wrote: >> If we have one chain with 12 objects, three empty chains and 9999 chains >> with 1, 2 or 3 objects - it's not bad. >> I.e. existing stat-s is not enough for analysis. > > I've got following data from Igor Valchenko: > >> Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8 >> Hash lengths distribution: >> 0 : 13761 >> 1 : 21925 >> 2 : 16666 >> 3 : 8675 >> 4 : 3219 >> 5 : 955 >> 6 : 253 >> 7 : 53 >> 8 : 14 > > What would you say? > We have in 1st place 21925 + 16666 + 8675 + 3219 + 955 + 253 + 53 + 14 = 51760 * 0 = 0
We have in 2nd place 16666 + 8675 + 3219 + 955 + 253 + 53 + 14 = 29835 * 1 = 29835 In 3rd place 8675 + 3219 + 955 + 253 + 53 + 14 = 13169 * 2 = 26338 In 4th place 3219 + 955 + 253 + 53 + 14 = 4494 * 3 = 13482 In 5th place 955 + 253 + 53 + 14 = 1275 * 4 = 5100 In 6th place 253 + 53 + 14 = 320 * 5 = 1600 In 7th place 53 + 14 = 67 * 6 = 402 In 8th place 14 = 14 * 7 = 98 So we have 100,934 entries. To access each entry once from scratch, we will need to follow 76,855 "next" links. So that gives us a score of 0.76. A perfect score is (100,934 - 65521) / 100,934 ~= 0.35 So to score 0.76 isn't bad. We can't compare that with my Pick score because I would have used double the number of buckets, but if we keep the same number of buckets and items, this figure will give us an idea of how good the hash is. Note however, that if the characteristics of the data being hashed changes, so will the distribution, and hence the score ... this is always tricky territory to get right. Cheers, Wol ------------------------------------------------------------------------------ Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel