On 29/12/15 10:23, Dimitry Sibiryakov wrote:
> 24.12.2015 18:29, Alex Peshkoff wrote:
>> If we have one chain with 12 objects, three empty chains and 9999 chains
>> with 1, 2 or 3 objects - it's not bad.
>> I.e. existing stat-s is not enough for analysis.
> 
>    I've got following data from Igor Valchenko:
> 
>> Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8
>> Hash lengths distribution:
>>      0 : 13761
>>      1 : 21925
>>      2 : 16666
>>      3 : 8675
>>      4 : 3219
>>      5 : 955
>>      6 : 253
>>      7 : 53
>>      8 : 14
> 
>    What would you say?
> 
We have in 1st place
21925 + 16666 + 8675 + 3219 + 955 + 253 + 53 + 14 = 51760 * 0 = 0

We have in 2nd place
16666 + 8675 + 3219 + 955 + 253 + 53 + 14 = 29835 * 1 = 29835

In 3rd place
8675 + 3219 + 955 + 253 + 53 + 14 = 13169 * 2 = 26338

In 4th place
3219 + 955 + 253 + 53 + 14 = 4494 * 3 = 13482

In 5th place
955 + 253 + 53 + 14 = 1275 * 4 = 5100

In 6th place
253 + 53 + 14 = 320 * 5 = 1600

In 7th place
53 + 14 = 67 * 6 = 402

In 8th place
14 = 14 * 7 = 98

So we have 100,934 entries. To access each entry once from scratch, we
will need to follow 76,855 "next" links. So that gives us a score of 0.76.

A perfect score is (100,934 - 65521) / 100,934 ~= 0.35

So to score 0.76 isn't bad.

We can't compare that with my Pick score because I would have used
double the number of buckets, but if we keep the same number of buckets
and items, this figure will give us an idea of how good the hash is.
Note however, that if the characteristics of the data being hashed
changes, so will the distribution, and hence the score ... this is
always tricky territory to get right.

Cheers,
Wol


------------------------------------------------------------------------------
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to