On 13/03/2019 03:28, Peter Geoghegan wrote:
> On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>> I made a copy of _bt_binsrch(), called _bt_binsrch_insert(). It does the
>> binary search like _bt_binsrch() does, but the bounds caching is only
>> done in _bt_binsrch_insert(). It seems clearer to have separate functions
>> for them now, even though there's some duplication.

>>     /*
>>      * Do the insertion. First move right to find the correct page to
>>      * insert to, if necessary. If we're inserting to a non-unique index,
>>      * _bt_search() already did this when it checked if a move to the
>>      * right was required for the leaf page.  Insertion scankey's scantid
>>      * would have been filled out at the time. On a unique index, the
>>      * current buffer is the first buffer containing duplicates, however,
>>      * so we may need to move right to the correct location for this
>>      * tuple.
>>      */
>>     if (checkingunique || itup_key->heapkeyspace)
>>         _bt_findinsertpage(rel, &insertstate, stack, heapRel);
>>
>>     newitemoff = _bt_binsrch_insert(rel, &insertstate);

>> The attached new version simplifies this, IMHO. The bounds and the
>> current buffer go together in the same struct, so it's easier to keep
>> track of whether the bounds are valid or not.

> Now that you have a full understanding of how the negative infinity
> sentinel values work, and how page deletion's leaf page search and
> !heapkeyspace indexes need to be considered, I think that we should
> come back to this _bt_binsrch()/_bt_findsplitloc() stuff. My sense is
> that you now have a full understanding of all the subtleties of the
> patch, including those that affect unique index insertion. That will
> make it much easier to talk about these unresolved questions.

> My current sense is that it isn't useful to store the current buffer
> alongside the binary search bounds/hint. It'll hardly ever need to be
> invalidated, because we'll hardly ever have to move right within
> _bt_findsplitloc() when doing unique index insertion (as I said
> before, the regression tests *never* have to do this according to
> gcov).

It doesn't matter how often it happens; the code still needs to deal with it. So let's try to make it as readable as possible.
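
Concretely, even the rare path has to do the bookkeeping. Something along these lines is what I mean (a hypothetical sketch only; the condition and helper name here are made up, the real logic is in the attached patch):

    /*
     * Rare case: the new tuple belongs on a page further to the right.
     * Stepping right makes both the cached bounds and the buffer in the
     * insert state stale, so they must be reset here even if this path is
     * almost never taken in practice.
     */
    if (must_move_right)        /* illustrative condition, not the patch's */
    {
        /* hypothetical helper: release the current page, lock its right sibling */
        _bt_step_right_for_insert(rel, insertstate, stack);

        /* the cached bounds referred to the old page */
        insertstate->bounds_valid = false;
    }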

> We're talking about a very specific set of conditions here, so
> I'd like something that's lightweight and specialized. I agree that
> the savebinsrch/restorebinsrch fields are a bit ugly, though. I can't
> think of anything that's better offhand. Perhaps you can suggest
> something that is both lightweight and an improvement on
> savebinsrch/restorebinsrch.

Well, IMHO holding the buffer and the bounds in the new struct is cleaner than the savebinsrch/restorebinsrch flags. That's exactly why I suggested it. I don't know what else to suggest. I haven't done any benchmarking, but I doubt there's any measurable difference.
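
To make that concrete, the shape I have in mind is roughly the following (a simplified sketch, not the exact code in the attached patch; some fields are omitted and the names are only illustrative):

    typedef struct BTInsertStateData
    {
        IndexTuple      itup;       /* tuple we're inserting */
        BTScanInsert    itup_key;   /* insertion scankey */

        /* buffer containing the leaf page we're currently looking at */
        Buffer          buf;

        /*
         * Bounds from the last binary search of that page.  Only meaningful
         * as long as 'buf' is unchanged; anything that moves to another page
         * must clear bounds_valid.
         */
        bool            bounds_valid;
        OffsetNumber    low;
        OffsetNumber    stricthigh;
    } BTInsertStateData;

With that, the savebinsrch/restorebinsrch flags aren't needed: callers just test and clear bounds_valid.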

> I'm of the opinion that having a separate _bt_binsrch_insert() does
> not make anything clearer. Actually, I think that saving the bounds
> within the original _bt_binsrch() makes the design of that function
> clearer, not less clear. It's all quite confusing at the moment, given
> the rightmost/!leaf/page empty special cases. Seeing how the bounds
> are reused (or not reused) outside of _bt_binsrch() helps with that.

Ok. I think having some code duplication is better than one function that tries to do many things, but I'm not wedded to that.
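
To illustrate what the duplication amounts to: the insert variant is essentially the same binary search loop, it only starts from, and saves back, the bounds kept in the insert state. Roughly like this (a simplified sketch that leaves out the empty-page and !heapkeyspace details, so not the exact code in the attached patch):

    OffsetNumber
    _bt_binsrch_insert(Relation rel, BTInsertStateData *insertstate)
    {
        BTScanInsert    key = insertstate->itup_key;
        Page            page = BufferGetPage(insertstate->buf);
        BTPageOpaque    opaque = (BTPageOpaque) PageGetSpecialPointer(page);
        OffsetNumber    low,
                        high,
                        stricthigh;

        if (!insertstate->bounds_valid)
        {
            /* fresh search over the whole leaf page */
            low = P_FIRSTDATAKEY(opaque);
            high = PageGetMaxOffsetNumber(page) + 1;
        }
        else
        {
            /* restore the result of the previous search of this same page */
            low = insertstate->low;
            high = insertstate->stricthigh;
        }
        stricthigh = high;

        /* invariant: items before 'low' are < key, items at/after 'high' are >= key */
        while (high > low)
        {
            OffsetNumber    mid = low + ((high - low) / 2);
            int32           result = _bt_compare(rel, key, page, mid);

            if (result > 0)
                low = mid + 1;
            else
            {
                high = mid;
                if (result != 0)
                    stricthigh = high;  /* item at 'mid' is strictly > key */
            }
        }

        /* cache the bounds so a later call for the same page can reuse them */
        insertstate->low = low;
        insertstate->stricthigh = stricthigh;
        insertstate->bounds_valid = true;

        return low;             /* first offset whose item is >= the scankey */
    }

That's maybe twenty lines of shared logic with _bt_binsrch(), which seems like an acceptable price for keeping the insertion-only bookkeeping out of the plain search path.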

- Heikki
