Dave Korn wrote:
On 19/03/2010 22:07, Robert Dewar wrote:

You miss my point, doing a mod with 256 is an AWFUL hash algorithm
since it only uses the low order 8 bits!

  This statement is only true contingent on a number of significant
assumptions you haven't stated - assumptions which can easily be violated.

I did make the mistake of assuming that since JHK talked about efficiency of the mod operation he was assuming mod as part of
the hashing algorithm, rather than just a way of narrowing the
range of the result of an otherwise already pseudo-randomized hash.

I think you will find that people on this mailing list know all about
hash tables

  I think you should get down from that high horse before you come down with
an embarrassing bump.

So this does not get around the possibility of a bad luck worst case.

  Perfect hashing does exactly that.  That's why it's "popular for hashing
keywords for compilers", and indeed why it "ought to be popular for optimizing
switch statements":

I still doubt that it is worth while in practice, it will be interesting to see whether actual measurements show otherwise. The experience with
the BCPL compiler was that in practice though clever, this approach was
not actually helpful in real programs.

I can't see perfect hashing EVER being useful for hashing keywords for
compilers, since in practice given good error detection you don't know
whether you are looking for a keyword or an identifier in any context.
So almost always you just incorporate the hash entries for keywords into
the main identifier hash table. Are there really compilers which work
any other way? Given that you have to treat nicely the case of programmers using keywords as identifiers accidentally, and misspelling
keywords accidentally:

     1. pakage X is
        |
        >>> incorrect spelling of keyword "package"

     2.    type Package is range 1 .. 10;
                |
        >>> reserved word "package" cannot be used as identifier

     3. end X;

I really don't see any other approach

(hmmm, now I have to file a bug report about the inconsistent
terminology of keyword vs reserved word :-))



http://burtleburtle.net/bob/hash/perfect.html

    cheers,
      DaveK

Reply via email to