2006/7/26, Alex Blewitt <[EMAIL PROTECTED]>:
On 26/07/06, Denis Kishenko <[EMAIL PROTECTED]> wrote:
> Frequently hash code is a sum of several other hashes. Such implementation
> kill hash idea, it's terrible.

I agree that simple addition is not a good thing. However, exclusive
or can work just as well, as can a simple multiplication of numbers by
primes.

> and many others... also I have found such "algorithms" =)>
>     private long* *token;
>
>      public int hashCode() {
>         return (int) token;
>     }

This isn't an incorrect hash function, and if there's only one token,
it doesn't really matter that much. Even if the token itself isn't
widely distributed, chances are the bottom few bits are going to be
distributed fairly evenly, and often it's the bottom part of a
hashcode that's considered rather than the top part. So, it's not
optimal, but shifting some bits around doesn't make that much
difference.

> The most of hashCode() functions are implemented in different ways. It's not
> problem, but it looks like scrappy, there is no single style. And finally we
> have class *org.appache.harmony.misc.HashCode *for this aim!

Interesting, but I wonder what the impact is of using an additional
class to calculate the hash from each time is going to be. Hopefully a
decent JIT will remove much of the problem, but bear in mind that the
hashCode may get called repeatedly during a single operation with a
collection (and in the case that it's already in the collection, may
be called by every other operation on that collection too. You might
find that even a simple (say) sort of a list would easily overwhelm
the nursery generation of a generational GC.

Of course, it's difficult to say without measuring it and knowing for sure :-)

> But only several classes are using it. I suggest integrate HashCode in
> all hashCode() implementations (about 200 files), I can do this. Anybody
> else can improve HashCode work.
>
> Any comments?

There are other approaches that could be used instead of this. For
example, the value of the hashCode could be cached in a private
variable and then invalidated when any setValue() methods are called.
I agree with your concern and suggested solution.

One could even imagine a superclass performing this caching work:

public abstract class HashCode {
 private boolean valid = false;
 private int hash;
 public final int hashCode() {
   if (!valid) {
     hash = recalculateHashCode();
     valid = true;
   }
   return hash;
 }
 protected abstract int recalculateHash();
 protected void invalidate() { valid = false; }
}

Of course, you could use the HashCode object to calculate the hash value :-)

Such an approach won't work when the class hierarchy cannot be
changed, of course.
That's exactly our case in public API... But we can use it in our
private API (org.apache.harmony.*)

Lastly, an easier approach is to use a tool (such as Eclipse) to
generate the implementation of hashCode() automatically based on the
non-static, non-final variables of a class. This one sounds like (a)
the easiest, and (b) all-round better performant than any of the
above.
Good idea!

SY, Alexey


Alex.

---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
Alexey A. Petrenko
Intel Middleware Products Division

---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to