Kenneth Marshall <[EMAIL PROTECTED]> writes:
> On Thu, Sep 04, 2008 at 02:01:18PM -0400, Tom Lane wrote:
>> I think what the hash index patch really needs is some performance
>> testing.  I'm willing to take responsibility for the code being okay
>> or not, but I haven't got any production-grade hardware to do realistic
>> performance tests on.  I'd like to see a few more scenarios tested than
>> the one provided so far: in particular, wide vs narrow index keys and
>> good vs bad key distributions.
> Do you mean good vs. bad key distributions with respect to hash value
> collisions?

Right, I'm just concerned about how badly it degrades with hash value
collisions.  Those will result in wasted trips to the heap if we omit
the original key from the index.  AFAICS that's the only downside of
doing so; but we should have an idea of how bad it could get before
committing to doing this.

> Currently, since a trip to the heap is required to validate any tuple
> even if the exact value is contained in the index, it does not seem
> like it would be a win to store the value in both places.

The point is that currently you *don't* need a trip to the heap if the
key doesn't match, even if it has the same hash code.

> I think that increasing the number of
> hash bits stored would provide more bang-for-the-buck than storing
> the exact value.

Maybe, but that would require an extremely invasive patch that breaks
existing user-defined datatypes.  You can't just magically get more hash
bits from someplace, you need datatype-specific hash functions that will
provide more than 32 bits.  There's going to have to be a LOT of
evidence in support of the value of doing that before I'll buy in.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
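
For anyone who wants a feel for the collision concern before running
real benchmarks, here is a rough toy model in Python.  It is only a
sketch and not PostgreSQL code: the index is a plain dict, CRC32 stands
in for a datatype's 32-bit hash function, bad_hash is a deliberately
poor hash invented for illustration, and all function names are
hypothetical.  It counts the "wasted" heap visits, i.e. the extra heap
fetches a lookup makes when the index holds only the hash code instead
of the original key.

```python
import zlib
from collections import defaultdict


def good_hash(key: str) -> int:
    # Stand-in 32-bit hash (CRC32); an assumption, not PostgreSQL's hash function.
    return zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF


def bad_hash(key: str) -> int:
    # Deliberately poor distribution: only 16 distinct hash codes.
    return good_hash(key) % 16


def build_index(keys, hash_fn):
    """Map each hash code to the list of (key, heap_pointer) entries sharing it."""
    index = defaultdict(list)
    for heap_ptr, key in enumerate(keys):
        index[hash_fn(key)].append((key, heap_ptr))
    return index


def heap_visits(index, hash_fn, probe, key_in_index):
    """Count the heap fetches needed to look up `probe`."""
    candidates = index.get(hash_fn(probe), [])
    if key_in_index:
        # Key stored in the index: mismatched keys are rejected without a heap fetch.
        return sum(1 for key, _ in candidates if key == probe)
    # Only the hash code is stored: every entry with that code needs a heap fetch.
    return len(candidates)


def wasted_visits(keys, hash_fn):
    """Extra heap fetches, summed over one lookup per key, when the key is omitted."""
    index = build_index(keys, hash_fn)
    return sum(
        heap_visits(index, hash_fn, k, key_in_index=False)
        - heap_visits(index, hash_fn, k, key_in_index=True)
        for k in keys
    )


if __name__ == "__main__":
    keys = [f"key{i:04d}" for i in range(1000)]
    print("wasted heap visits, good hash:", wasted_visits(keys, good_hash))
    print("wasted heap visits, bad hash: ", wasted_visits(keys, bad_hash))
```

The model ignores page layout, caching, and tuple visibility, so it
says nothing about absolute cost; it only shows that the penalty for
omitting the key grows with the number of entries sharing a hash code.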