Commit 8fa30f906be reduced the elevel of a number of "can't happen" errors from PANIC to ERROR. These were all critical-section-adjacent errors involved in nbtree page splits, and nbtree page deletion. It also established the following convention within _bt_split(), which allowed Tom to keep the length of the critical section just as short as it had always been:
/* * origpage is the original page to be split. leftpage is a temporary * buffer that receives the left-sibling data, which will be copied back * into origpage on success. rightpage is the new page that receives the * right-sibling data. If we fail before reaching the critical section, * origpage hasn't been modified and leftpage is only workspace. In * principle we shouldn't need to worry about rightpage either, because it * hasn't been linked into the btree page structure; but to avoid leaving * possibly-confusing junk behind, we are careful to rewrite rightpage as * zeroes before throwing any error. */ The INCLUDE indexes work looks like it subtly broke this, because it allocated memory after the initialization of the right page -- allocating memory can always fail. On the other hand, even when 8fa30f906be went in back in 2010 this "rule" was arguably broken, because we were already calling PageGetTempPage() after the right sibling page is initialized, which palloc()s a full BLCKSZ, which is far more than truncation is every likely to allocate. On the other other hand, it seems to me that the PageGetTempPage() thing might have been okay, because it happens before the high key is inserted on the new right buffer page. The same cannot be said for the way we generate a new high key for the left/old page via suffix truncation, which happens to occur after the right buffer page is first modified by inserted its high key (the original/left page's original high key). I think that there may be a risk that VACUUM's page deletion code will get confused by finding an errant right sibling page from a failed page split when there is a high key. If so, that would be a risk that was introduced in Postgres 11, and made much more likely in practice in Postgres 12. (I haven't got as far as doing an analysis of the risks to page deletion, though. The "fastpath" rightmost page insertion optimization that was also added to Postgres 11 seems like it also might need to be considered here.) -- Peter Geoghegan