Lower values (closer to 100%) are better:
A random hash should distribute the keys equally to all buckets, so it would need the minimal number of comparisons. Real world distributions will have some buckets fuller than others. Since the number of comparisons (for visiting each element exactly once) is quadratic with the number of elements in a bucket (linear search), the additional comparisons for the longer chains will not be offset by the savings of comparisons from shorter chains. But don't worry about it too much, as you can't really do anything about it anyways. The only reason for Devel::Peek to print this metric is to allow Perl5 Porters to play with different implementations of the hashing function. Cheers, -Jan From: [email protected] [mailto:[email protected]] On Behalf Of [email protected] Sent: Tuesday, January 19, 2010 1:19 PM To: [email protected] Subject: Devel::Peek -- hash "Quality" query Dear gurus, >From the perldoc for Devel::Peek: The "quality" of a hash is defined as the total number of comparisons needed to access every element once, relative to the expected number needed for a random hash. The value can go over 100%. The total number of comparisons is equal to the sum of the squares of the number of entries in each bucket. For a random hash of "n" keys into "k" buckets, the expected value is: n + n(n-1)/2k I'm working on some code documentation and have gotten to this point. The above is vaguely worded. Or maybe I should just say that it doesn't explicitly state what I'm looking for... So, do I want a higher percentage for hash "quality," or a lower percentage? That is, which direction of change signals improvement? TIA, Deane Rothenmaier Programmer/Analyst Walgreens Corp. 224-542-5150 You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln
_______________________________________________ ActivePerl mailing list [email protected] To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
