Kenneth,
I just pushed the revised patch (v2!). It now samples the parent
relation to estimate the number of tuples rather than performing a
complete scan. In my tests the estimate is accurate, and when it errs
it errs on the larger side, which is fine.
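Roughly, the estimation works like the sketch below (standalone and
simplified; the sample size, per-page counts, and relation size are
illustrative stand-ins, not the values the patch actually uses):

    /* Sketch of sampling-based tuple estimation: read a fixed number
     * of pages, average the tuples per page, and scale by the total
     * number of pages in the relation. */
    #include <stdio.h>

    #define SAMPLE_PAGES 30   /* illustrative sample size */

    static double
    estimate_tuples(const int *counts, int nsampled, unsigned total_pages)
    {
        double sum = 0.0;
        int    i;

        for (i = 0; i < nsampled; i++)
            sum += counts[i];

        /* Average density times relation size.  Overestimating only
         * wastes a little space in the initial bucket allocation, so
         * erring on the larger side is acceptable. */
        return (sum / nsampled) * total_pages;
    }

    int
    main(void)
    {
        int counts[SAMPLE_PAGES];
        int i;

        for (i = 0; i < SAMPLE_PAGES; i++)
            counts[i] = 60 + (i % 5);   /* stand-in per-page counts */

        printf("estimated tuples: %.0f\n",
               estimate_tuples(counts, SAMPLE_PAGES, 100000));
        return 0;
    }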
Tom,
That is great. I am looking forward to your patch. Given the
issues you needed to address, I think it would be reasonable to
add a few more user settings for the hash index.
Fill factor is too coarse a knob. The other settings I have been
considering are:
maxtuples - Not really a maximum, but a target value to use when
setting up the initial buckets. This would let you set it before a
data load and sidestep the "split-n-copy" trauma that your new
hash build process is designed to avoid.
If I understand you correctly, we already do this with our current
build process; there should not be any splits of the index if we
estimated the tuple count correctly. What gets you is collisions:
lots of overflow pages appear when many distinct keys map to the same
bucket, or when you have lots of duplicate keys. Because the overall
tuple count doesn't exceed the fill factor, no splits occur, yet the
lengthy bucket chains cost lots of IOs. You touch on this below.
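The build-time sizing then amounts to something like this (a
simplified standalone sketch; the per-page entry count and the
example numbers are my stand-ins, not the actual constants in the
code):

    #include <stdio.h>

    /* Size the initial hash so the estimated tuples fit within the
     * fill factor, rounding the bucket count up to a power of two as
     * the hash code requires. */
    static unsigned
    initial_buckets(double est_tuples, int entries_per_page,
                    double fillfactor)
    {
        double   needed = est_tuples / (entries_per_page * fillfactor);
        unsigned nbuckets = 1;

        while (nbuckets < needed)
            nbuckets <<= 1;            /* round up to a power of two */
        return nbuckets;
    }

    int
    main(void)
    {
        /* e.g. ~6 million tuples, ~290 entries per 8K page, 75% full */
        printf("initial buckets: %u\n",
               initial_buckets(6.0e6, 290, 0.75));
        return 0;
    }

Given an accurate (or slightly high) tuple estimate, the index starts
large enough that no splits are needed during the build.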
multiplicity - Try to capture use cases that would otherwise require
many overflow pages. In particular, if we discard the normal index
page layout we can skip the space overhead of the page pointer
and generate a more compact index. Then you could use a few
more hash bits to look up the index entry within the page; the
number of bits would be determined by this factor. 8 bits would
give you 256 sub-pieces that could each hold about 3 entries using
the current 4-byte hash, or 2 entries using an 8-byte hash.
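As a back-of-the-envelope check on those numbers (assuming an 8K
page and a 6-byte heap tuple pointer per entry; both assumptions
are mine):

    #include <stdio.h>

    /* Split an 8K page into 2^bits sub-pieces and count how many
     * (hash, heap-pointer) entries fit in each one. */
    int
    main(void)
    {
        const int page_size = 8192;
        const int heap_ptr  = 6;   /* assumed heap pointer size */
        const int bits      = 8;   /* extra hash bits used in-page */
        const int pieces    = 1 << bits;           /* 256 sub-pieces */
        const int piece_sz  = page_size / pieces;  /* 32 bytes each */

        printf("4-byte hash: %d entries per sub-piece\n",
               piece_sz / (4 + heap_ptr));         /* 32 / 10 -> 3 */
        printf("8-byte hash: %d entries per sub-piece\n",
               piece_sz / (8 + heap_ptr));         /* 32 / 14 -> 2 */
        return 0;
    }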
What do you think?
Yes, this is a good direction. If we can increase the number of
buckets and reduce the bucket size (either physically or virtually)
to allow more direct access without creating a huge index on disk,
that would be ideal. But then, if you do have collisions, overflows
occur more frequently. I spoke with Neil Conway yesterday at the PG
conference here in Portland, and he piqued my interest in examining
his hash code more closely to see what he has already done in this
area.
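To put a rough number on that tradeoff: under a uniform hash the
per-bucket occupancy is approximately Poisson, so at the same overall
load a smaller bucket overflows far more often. This is only a model,
not a measurement of the actual code:

    #include <stdio.h>
    #include <math.h>

    /* P(overflow) = P(N > cap) where N ~ Poisson(load * cap). */
    static double
    overflow_prob(int cap, double load)
    {
        double lambda = load * cap;   /* expected entries per bucket */
        double term = exp(-lambda);   /* P(N = 0) */
        double cdf = term;
        int    k;

        for (k = 1; k <= cap; k++)
        {
            term *= lambda / k;       /* P(N = k) from P(N = k-1) */
            cdf += term;
        }
        return 1.0 - cdf;
    }

    int
    main(void)
    {
        /* At 75% load, big buckets almost never overflow, while
         * 3-entry sub-pieces overflow roughly 19% of the time. */
        printf("cap 200: %.4f\n", overflow_prob(200, 0.75));
        printf("cap   3: %.4f\n", overflow_prob(3, 0.75));
        return 0;
    }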
-Tom
Cheers,
Ken
On Wed, Oct 17, 2007 at 03:31:58PM -0700, Tom Raney wrote:
Kenneth,
Great!
Yes, we did update the code to use the estimate. I will post the
patch with this update. We saw only a very small difference in index
build time, but you may see a larger one when the base relation has
many columns.
With 1 billion tuples, you should start to see the hash index
outperform the btree for some equality probes, I would imagine. With
a 90% fill factor, the btree would require 4 levels to index that
many tuples; if the top two levels were in memory, 3 IOs would be
needed. I don't think PG supports index-only scans, so it takes the
extra IO to probe the base relation. The hash should take at most 2
IOs, maybe fewer (or more, depending on how many overflow buckets
there are). It might be interesting to fiddle with the fill factors
of each index - hash pages (buckets) default to 75% full.
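The level count is just fanout arithmetic; a sketch, assuming an 8K
page and roughly 20 bytes per index entry (the entry size is my
assumption):

    #include <stdio.h>

    /* Add btree levels until the index can address all the tuples. */
    int
    main(void)
    {
        const double page_size  = 8192.0;
        const double fillfactor = 0.90;
        const double entry_size = 20.0;   /* assumed bytes per entry */
        const double tuples     = 1e9;

        double fanout   = page_size * fillfactor / entry_size; /* ~368 */
        double capacity = fanout;    /* one-level tree */
        int    levels   = 1;

        while (capacity < tuples)
        {
            capacity *= fanout;
            levels++;
        }
        /* 4 levels; with the top two cached, that is 2 index reads
         * plus 1 heap read = 3 IOs per probe. */
        printf("levels: %d (fanout ~%.0f)\n", levels, fanout);
        return 0;
    }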
-Tom
Tom,
I am getting ready to stitch in an updated, simplified version
of Neil Conway's hash-only hash index patch. Did you have a
chance to update your sizing function to use the planner-like
estimate instead of a full table scan? I would like to be able
to use that when my test tables start to have 10^9 entries.
If you have not had a chance, I will try to add it myself.
Regards,
Ken