Re: [HACKERS] Why hash indexes suck

pgsql Sun, 06 Jun 2004 06:53:36 -0700

> Sailesh Krishnamurthy <[EMAIL PROTECTED]> writes:
>> This is probably a crazy idea, but is it possible to organize the data
>> in a page of a hash bucket as a binary tree ?
>
> Only if you want to require a hash opclass to supply ordering operators,
> which sort of defeats the purpose I think.  Hash is only supposed to
> need equality not ordering.
>


A btree is frequently used within the buckets of a hash table, expecially
if you expect to have a large number of items in each bucket.

If PostgreSQL could create a hash table index which is a single top level
hash table with each hash bucket being a btree index. You can eliminate a
number of btree searches by hashing, and then fall into btree performance
after the first hash lookup. The administrator should be able to gather
statistics about the population of the hash buckets and rehash if
performance begins to behave like a btree or the data is not distributed
evenly. Given proper selection of the initial number of buckets, a hash
table could blow btree out of the water. Given a poor selection of the
number of buckets, i.e. 1, a hash will behave no worse than a btree.

Also, it would be helpful to be able to specify a hash function during the
create or rehash, given a specific class of data, extraction of the random
elements can be more efficient and/or effective given knowledge of the
data. Think something like "bar codes," there are portions of the data
which are ususally the same and portions of the data which are usually
different. Focusing on the portions of the data which tend to be different
will generally provide a more evenly distributed hash.









---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

Re: [HACKERS] Why hash indexes suck

Reply via email to