Shreya Bhargava wrote:
1. Populate the table with 80 million tuples.
2. Create a hash index on the table.
3. Clear both the Linux cache and the psql buffers.
   (exiting psql and restarting it cleared the psql buffers;
    to clear the Linux cache, we used the drop_caches command)
4. Start psql.
5. Select on an integer in the range of values in the table.
   (all test numbers were big ones, like 98934599)
6. Record the time.
7. Exit psql.
8. Drop caches (as described above).
9. Repeat steps 4-8 for different numbers.
10. Drop the hash index.
11. Create a btree index and repeat steps 3-9.
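
In SQL terms, I assume that amounts to roughly the following sketch
(the table, column and index names here are made up):

    CREATE TABLE test_tbl (id integer);
    INSERT INTO test_tbl SELECT generate_series(1, 80000000);

    CREATE INDEX test_tbl_hash ON test_tbl USING hash (id);

    -- in each fresh psql session:
    \timing
    SELECT * FROM test_tbl WHERE id = 98934599;

    -- and for the second round:
    DROP INDEX test_tbl_hash;
    CREATE INDEX test_tbl_btree ON test_tbl USING btree (id);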

It seems you're mostly measuring the overhead of starting a backend, populating the relcache, etc.

Restarting psql doesn't clear the postgres shared buffer cache. Or did you mean that you restarted postgres?

Anyway, I don't think it's interesting to test with cleared caches. Surely the metapage and first 1-2 levels of the b-tree would stay cached all the time in real life.
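
Back-of-the-envelope, assuming a few hundred index entries per 8 kB page
(the query at the end is only a sketch; it assumes the pageinspect
contrib module and a made-up index name):

    -- 80M leaf entries / ~350 entries per page = ~230,000 leaf pages
    -- 230,000 / ~350 = ~660 pages on the next level up
    -- 660 / ~350 = ~2 pages directly below the root
    -- i.e. a roughly 4-level tree, where everything above the leaf level
    -- is only a few MB and would normally stay cached.
    SELECT level FROM bt_metap('test_tbl_btree');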

From the results obtained, the average of all the hash probes is 141.8 ms and the average for btree is 168.5 ms, a difference of about 27 ms. The standard deviations are about 23 ms, so this is a statistically significant difference.

I don't trust those numbers much, but in any case I don't think that edge is big enough to justify the existence of hash indexes.

If you're looking for a use case where a hash index is faster, I'd suggest using a data type with an expensive comparison function, such as long multi-byte strings in UTF-8 encoding.
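
For example, something along these lines (a made-up table, and a crude
way of generating long strings):

    CREATE TABLE docs (title text);
    INSERT INTO docs
        SELECT repeat('long multi-byte string äöå ', 20) || g
        FROM generate_series(1, 1000000) AS g;

    CREATE INDEX docs_hash  ON docs USING hash  (title);
    -- vs.
    CREATE INDEX docs_btree ON docs USING btree (title);

    SELECT * FROM docs
     WHERE title = repeat('long multi-byte string äöå ', 20) || '123456';

With a non-C locale, every btree comparison on keys like that goes
through strcoll(), whereas the hash index only hashes the string and
does a simple equality check at the end.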

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
