Dear PostgreSQL Hackers:

After following the hackers mailing list for quite a while,
I am going to start investigating what will need to be done
to improve hash index performance. Below are the pieces of
this project that I am currently considering:

1. Characterize the current hash index implementation against
   the BTree index, with a focus on space utilization and
   lookup performance against a collection of test data. This
   will give a baseline performance test to evaluate the impact
   of changes. I initially do not plan to bench the hash creation
   process since my initial focus will be on lookup performance.

2. Evaluate the performance of different hash index implementations
   and/or changes to the current implementation. My current plan is
   to keep the implementation as simple as possible and still provide
   the desired performance. Several hash index suggestions deal with
   changing the layout of the keys on a page to improve lookup
   performance, including reducing the bucket size to a fraction of
   a page or only storing the hash value on the page, instead of
   the index value itself. My goal in this phase is to produce one
   or more versions with better performance than the current BTree.
   
3. Look at build time and concurrency issues with the addition of
   some additional tests to the test bed. (1)

4. Repeat as needed.

This is the rough plan. Does anyone see anything critical that
is missing at this point? Please send me any suggestions for test
data and various performance test ideas, since I will be working
on that first.

Regards,
Ken Marshall 

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Reply via email to