Hi Matthias,

Thanks for the response and detailed explanation.
> Could you expand on this "treated as" a bit more? Do you mean that
> once the horizon has passed, the next time maintenance comes around
> this page will be deleted like a normal empty page would during
> vacuum? Or is it immediately considered dead?

Regarding your question about how the page is "treated as" a normal
deletion: Because a BTP_MERGED_AWAY page is already unlinked from its
parent, it has essentially already completed the first stage of
deletion. Once the MergeXID horizon has safely passed, the page
transitions to HALF_DEAD and is handled exactly as described in the
nbtree README for the second stage of deletion:

"In the second-stage, the half-dead leaf page is unlinked from its
siblings. We first lock the left sibling (if any) of the target, the
target page itself, and its right sibling (there must be one) in that
order. Then we update the side-links in the siblings, and mark the
target page deleted."

To safely track this state transition, I need to store the MergeXID
and the blkno of the BTP_MERGED_AWAY page. As you pointed out
previously, adding these to the B-tree page header reduces available
space and risks backward incompatibility with max tuple sizes. Given
the constraints you mentioned, is modifying the header completely off
the table, or could we safely introduce this through a new index
version?

Also, I wanted to share my current implementation for forward scans
that were positioned between L and R before the merge. Since the
forward scan already read L, here is how I handle it:

When the scan encounters the BTP_MERGED page (R), it calls
_bt_readpage. After unlocking R, but before returning, it steps back
to read L (BTP_MERGED_AWAY). It saves L's tuples in a list inside
BTScanOpaqueData, compares them against the data just read from R
(so->currPos.items), and removes any duplicates.

Best regards,
Salma El-Sayed
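
P.S. For concreteness, the duplicate-removal step above could look
roughly like the sketch below. It is only an illustration, not the
patch code: SketchTid and the sketch_* function names are hypothetical
stand-ins rather than the real BTScanPosItem / BTScanOpaqueData
fields, and matching entries by heap TID is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap TID (block number + offset). */
typedef struct SketchTid
{
    uint32_t    block;
    uint16_t    offset;
} SketchTid;

static bool
sketch_tid_equal(SketchTid a, SketchTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/*
 * Drop from r_items every entry whose TID also appears in l_items
 * (the tuples saved from the merged-away page L).  The surviving
 * entries keep their relative order; the new item count is returned.
 *
 * This is a simple O(n_l * n_r) comparison, chosen for clarity only.
 */
static size_t
sketch_remove_duplicates(SketchTid *r_items, size_t n_r,
                         const SketchTid *l_items, size_t n_l)
{
    size_t      kept = 0;

    assert(r_items != NULL || n_r == 0);
    assert(l_items != NULL || n_l == 0);

    for (size_t i = 0; i < n_r; i++)
    {
        bool        seen = false;

        for (size_t j = 0; j < n_l; j++)
        {
            if (sketch_tid_equal(r_items[i], l_items[j]))
            {
                seen = true;
                break;
            }
        }

        /* Compact in place: keep only items not already seen on L. */
        if (!seen)
            r_items[kept++] = r_items[i];
    }

    return kept;
}
```

The returned count would play the role of the new number of valid
entries in the scan position's item array; how that maps onto
firstItem/lastItem is left out of the sketch.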
