On Wed, Jun 10, 2026 at 5:16 PM Nikolay Samokhvalov <[email protected]> wrote: > On Tue, Jun 9, 2026 at 6:31 AM Amit Langote <[email protected]> wrote: >> On Mon, Jun 8, 2026 at 5:18 PM Amit Langote <[email protected]> wrote: >> > On Sat, Jun 6, 2026 at 6:13 PM Amit Langote <[email protected]> >> > wrote: >> > > Thanks for the detailed report and reproducers. I’ve started looking >> > > into this. >> > >> > Continuing to look. Appended this to the open items list: >> > >> > https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items#Open_Issues >> >> Thanks again, Nik, for the thorough analysis and the reproducers -- >> they made all three easy to confirm and pin down. Patches attached: >> 0001 for defect 1, 0002 for defects 2 and 3. >> >> 0001 (defect 1): check and flush before writing the row rather than >> after, and add a per-entry "flushing" flag so a re-entrant add on the >> same entry during a flush takes the per-row path instead of touching >> the mid-flush batch. The flag is cleared in a PG_FINALLY, which also >> resets batch_count, so the entry stays reusable if a flush error is >> caught by a savepoint. >> >> 0002 (defects 2 and 3): rather than track subxact membership per row, >> confine batching to the top transaction level -- in RI_FKey_check, >> when GetCurrentTransactionNestLevel() > 1, use the per-row path. I >> went this way because per-entry subxact tracking isn't enough (one >> entry's batch can mix rows from several levels, since the cache is >> keyed by constraint), and flushing at subxact boundaries doesn't work >> for deferred constraints. Once the cache only ever holds top-level >> rows, a subxact abort has nothing of its own to discard, so >> ri_FastPathSubXactCallback goes away -- that's what fixes your defect >> 2 reproducer. For defect 3, which is still reachable at the top level, >> the same patch adds a cache-wide flag set while ri_FastPathEndBatch >> iterates, so a re-entrant check during the scan takes the per-row path >> instead of inserting into the cache being scanned. >> >> The per-row path still bypasses SPI, so these stay well ahead of the >> pre-19 check in terms of performance. I'd like to recover batching >> across subtransactions properly in v20 but didn't want to rush it now. >> >> On defect 3, can you check whether your reproducer still commits the >> orphan with 0002 applied, or whether (like on my build) it now raises >> the violation? I'd like to be sure the bucket-placement variation you >> hit is actually covered. And of course any review of the patches is >> welcome. > > Hi Amit, > > Thanks for the quick fixes. > > I checked v1-0001 + v1-0002 against current master (e18b0cb7) with an > assertion/debug build. > > - Both apply cleanly to master (in sequence) > - Defect 1 same-FK re-entry no longer crashes; the original shape completes > and leaves the expected rows > - Defect 2 subtransaction-abort case now raises the FK violation instead of > committing orphans > - For your defect 3 question: with 0002 applied, my reproducer no longer > commits the child2 orphan. It raises: > ERROR: insert or update on table "child2" violates foreign key constraint > "child2_fkey" > DETAIL: Key (a)=(999999) is not present in table "parent". > > After the error, child2_orphans = 0 and child2 is empty in my run. > > I also ran the regression suite in that tree; foreign_key passed, and the > full run reported all 245 tests passed. > > So v1 looks good to me for the three reported cases.
Thanks for checking. I will review them a bit more closely before committing by Friday. Other reviews are welcome. -- Thanks, Amit Langote
