On May 23 2012, at 16:58 , Ulf Zibis wrote:

> Hi,
> 
> What about making this approach a little bit more general?
> See: Bug 6812862 - provide customizable hash() algorithm in HashMap for speed 
> tuning

It is possible for JDK 8. One option is to provide the hash() method as a 
virtual extension method on Map allowing implementations to override it. 

>      + all later comments.
> Then you additionally could save:
>     if ((0 != h) && (k instanceof String))
> 
> Looking at the codes of many charsets, the main variance seems to be in the 
> lower 8 bits of a character, especially if the strings belong to the same 
> language. So if we would compose the initial 32-bit values from 4 chars then 
> the murmur3 algorithm could perform almost twice faster.

Interesting idea. The description specifically avoids defining the behaviour so 
implementations can do this kind of optimization. For 7u6, unless there are 
technical problems, the proposed. implementation is sufficient. The important 
message though is that it can be changed later.

> 
> If you alter all hash maps in JDK to use a new hash value, which noteworthy 
> use cases remain to use the legacy hashcode()?

Compatibility (which is of paramount importance). Part of the difficulty is 
that the String.hashCode() calculation method is part of the specification. 
Changing the specification is possible but probably not practical.

> Do we really need 2 hash fields in String?

Performance without caching the hash code result is unacceptable. (I've tried 
it).

> In project coin, we have set in stone to use compile time hashes for 
> Strings_in_switch extension. So it never can't profit from the murmur3 
> optimization. IMO: what a pity!
> (Prominent people have said, it will never make sense to change the String's 
> hash algorithm.)
> See: http://markmail.org/message/ig3nzmfinfuvgbwz
>      http://markmail.org/message/h3nlhhae5qlmf37a

Yes, there other other reasons as well.

Mike

> 
> 
> Am 23.05.2012 21:03, schrieb Mike Duigou:
>> 
>>> Also, this change
>>> 
>>> -        return h ^ (h>>>  7) ^ (h>>>  4);
>>> +        h ^= (h>>>  7) ^ (h>>>  4);
>>> +
>>> +        return h;
>>> 
>>> will make the compiler generates an additional iload/istore pair.
>>> While the Jitted code will be the same, it may bother the inlining 
>>> heuristic.
> Wouldn' t
>     return (h ^= (h>>>  7) ^ (h>>>  4));
> have the same effect ?
> 
> Anyway, please add a comment for later readers.
> 
> -Ulf
> 

Reply via email to