Shreya Bhargava wrote:
1. Populate the table with 80 million tuples.
2. Create a hash index on the table.
3. Clear both the Linux cache and the psql buffers.
   (exiting psql and restarting it cleared the psql buffers;
    to clear the Linux cache, we used the drop_caches command)
4. Start psql.
5. Select on an integer in the range of values in the table.
   (all test numbers were big ones, like 98934599)
6. Record the time.
7. Exit psql.
8. Drop caches (as described above).
9. Repeat steps 4-8 for different numbers.
10. Drop the hash index.
11. Create a btree index and repeat steps 3-9.
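
In SQL terms, I assume that amounts to roughly the following sketch
(the table, column and index names here are made up):

    CREATE TABLE test_tbl (id integer);
    INSERT INTO test_tbl SELECT generate_series(1, 80000000);

    CREATE INDEX test_tbl_hash ON test_tbl USING hash (id);

    -- in each fresh psql session:
    \timing
    SELECT * FROM test_tbl WHERE id = 98934599;

    -- and for the second round:
    DROP INDEX test_tbl_hash;
    CREATE INDEX test_tbl_btree ON test_tbl USING btree (id);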

It seems you're mostly measuring the overhead of starting a backend, populating the relcache, etc.

Restarting psql doesn't clear the postgres shared buffer cache. Or did you mean that you restarted postgres?

Anyway, I don't think it's interesting to test with cleared caches. Surely the metapage and first 1-2 levels of the b-tree would stay cached all the time in real life.
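
Back-of-the-envelope, assuming a few hundred index entries per 8 kB page
(the query at the end is only a sketch; it assumes the pageinspect
contrib module and a made-up index name):

    -- 80M leaf entries / ~350 entries per page = ~230,000 leaf pages
    -- 230,000 / ~350 = ~660 pages on the next level up
    -- 660 / ~350 = ~2 pages directly below the root
    -- i.e. a roughly 4-level tree, where everything above the leaf level
    -- is only a few MB and would normally stay cached.
    SELECT level FROM bt_metap('test_tbl_btree');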

From the results obtained, the average of all the hash probes is 141.8 ms and the average for btree is 168.5 ms, a difference of about 27 ms. The standard deviations are about 23 ms, so this is a statistically significant difference.

I don't trust those numbers much, but in any case I don't think that edge is big enough to justify the existence of hash indexes.

If you're looking for a use case where a hash index is faster, I'd suggest using a data type with an expensive comparison function, such as long multi-byte strings in UTF-8 encoding.
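
For example, something along these lines (a made-up table, and a crude
way of generating long strings):

    CREATE TABLE docs (title text);
    INSERT INTO docs
        SELECT repeat('long multi-byte string äöå ', 20) || g
        FROM generate_series(1, 1000000) AS g;

    CREATE INDEX docs_hash  ON docs USING hash  (title);
    -- vs.
    CREATE INDEX docs_btree ON docs USING btree (title);

    SELECT * FROM docs
     WHERE title = repeat('long multi-byte string äöå ', 20) || '123456';

With a non-C locale, every btree comparison on keys like that goes
through strcoll(), whereas the hash index only hashes the string and
does a simple equality check at the end.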

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
