On Mon, May 6, 2019 at 4:11 PM Peter Geoghegan <p...@bowt.ie> wrote: > I am tempted to move the call to _bt_truncate() out of _bt_split() > entirely on HEAD, possibly relocating it to > nbtsplitloc.c/_bt_findsplitloc(). That way, there is a clearer > separation between how split points are chosen, suffix truncation, and > the mechanical process of executing a legal page split.
I decided against that -- better to make it clear how truncation deals with space overhead within _bt_split(). Besides, the resource management around sharing a maybe-palloc()'d high key across module boundaries seems complicated, and best avoided. Attached draft patch for HEAD fixes the bug by organizing _bt_split() into clear phases. _bt_split() now works as follows, which is a little different: * An initial phase that is entirely concerned with the left page temp buffer itself -- initializes its special area. * Suffix truncation to get left page's new high key, and then add it to left page. * A phase that is mostly concerned with initializing the right page special area, but also finishes off one or two details about the left page that needed to be delayed. This is also where the "shadow critical section" begins. Note also that this is where _bt_vacuum_cycleid() is called, because its contract actually *requires* that caller has a buffer lock on both pages at once. This should not be changed on the grounds that _bt_vacuum_cycleid() might fail (nor for any other reason). * Add new high key to right page if needed. (No change, other than the fact that it happens later now.) * Add other items to both leftpage and rightpage. Critical section that copies leftpage into origpage buffer. (No changes here.) I suppose I'm biased, but I prefer the new approach anyway. Adding the left high key first, and then the right high key seems simpler and more logical. It emphasizes the similarities and differences between leftpage and rightpage. Furthermore, this approach fixes the theoretical risk of leaving behind a minimally-initialized nbtree page that has existed since 2010. We don't allocated *any* memory after the point that a new rightpage buffer is acquired. I suppose that this will need to be backpatched. Thoughts? -- Peter Geoghegan
v1-0001-Don-t-leave-behind-junk-nbtree-pages-during-split.patch
Description: Binary data