On Mon, Jan 29, 2024 at 4:13 PM Nathan Bossart <nathandboss...@gmail.com> wrote:
> On Mon, Jan 29, 2024 at 03:18:50PM -0500, Robert Haas wrote:
> > I'm wondering if what we need to do is run pg_walsummary on both
> > summary files in that case. If we just pick one or the other, how do
> > we know which one to pick?
>
> Even if we do that, isn't it possible that none of the summaries will
> include the change? Presently, we get the latest summarized LSN, make a
> change, and then wait for the next summary file with a greater LSN than
> what we saw before the change. But AFAICT there's no guarantee that means
> the change has been summarized yet, although the chances of that happening
> in a test are probably pretty small.
>
> Could we get the LSN before and after making the change and then inspect
> all summaries that include that LSN range?

The trick here is that each WAL summary file covers one checkpoint
cycle. The intent of the test is to load data into the table,
checkpoint, see what summaries exist, then update a row, checkpoint
again, and see what summaries now exist. We expect one new summary
because there's been one new checkpoint. When I was thinking about
this yesterday, I was imagining that we were somehow getting an extra
checkpoint in some cases. But it looks like it's actually an
off-by-one situation. In
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-29%2018%3A09%3A10
the new files that show up between "after insert" and "after new
summary" are:

00000001000000000152FAE000000000015AAAC8.summary (LSN distance ~500k)
00000001000000000152F7A8000000000152FAE0.summary (LSN distance 824 bytes)

The checkpoint after the inserts says:

LOG: checkpoint complete: wrote 14 buffers (10.9%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.956 s, sync=0.929 s, total=3.059 s; sync files=39, longest=0.373 s, average=0.024 s; distance=491 kB, estimate=491 kB; lsn=0/15AAB20, redo lsn=0/15AAAC8

And the checkpoint after the single-row update says:

LOG: checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.648 s, sync=0.355 s, total=2.798 s; sync files=3, longest=0.348 s, average=0.119 s; distance=11 kB, estimate=443 kB; lsn=0/15AD770, redo lsn=0/15AD718

So both of the new WAL summary files that are appearing here are from
checkpoints that happened before the single-row update. The larger
file is the one covering the 400 inserts, and the smaller one is the
checkpoint before that. Which means that the "Wait for a new summary
to show up." code isn't actually waiting long enough, and then the
whole thing goes haywire.

The problem is, I think, that this code naively thinks it can just
wait for summarized_lsn and everything will be fine ... but that
assumes we were caught up when we first measured the summarized_lsn,
and that need not be so, because it takes some short but non-zero
amount of time for the summarizer to catch up with the WAL generated
during initdb.

I think the solution here is to find a better way to wait for the
inserts to be summarized, one that actually does wait for that to
happen.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
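
P.S. For anyone who wants to cross-check the arithmetic above, here is a
minimal Python sketch. It is not part of the test and not how
pg_walsummary works internally. It assumes a summary file name is an
8-hex-digit timeline ID followed by a 16-hex-digit start LSN and a
16-hex-digit end LSN, which is what the two names above look like. The
helper names (parse_summary_name, parse_lsn) are made up for
illustration.

```python
import re

# Assumed layout: TLI (8 hex) + start LSN (16 hex) + end LSN (16 hex) + ".summary"
SUMMARY_RE = re.compile(r"^([0-9A-F]{8})([0-9A-F]{16})([0-9A-F]{16})\.summary$")


def parse_summary_name(name):
    """Return (timeline, start_lsn, end_lsn) as integers."""
    m = SUMMARY_RE.match(name)
    if m is None:
        raise ValueError(f"not a WAL summary file name: {name!r}")
    tli, start, end = (int(g, 16) for g in m.groups())
    return tli, start, end


def parse_lsn(text):
    """Convert an LSN printed as 'hi/lo' (both hex) to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# The two files that appeared between "after insert" and "after new summary".
new_files = [
    "00000001000000000152FAE000000000015AAAC8.summary",
    "00000001000000000152F7A8000000000152FAE0.summary",
]

# Redo LSNs copied from the two "checkpoint complete" log lines.
redo_after_inserts = parse_lsn("0/15AAAC8")
redo_after_update = parse_lsn("0/15AD718")

for name in new_files:
    _, start, end = parse_summary_name(name)
    print(f"{name}: distance={end - start} bytes, "
          f"ends at insert checkpoint redo: {end == redo_after_inserts}, "
          f"reaches update checkpoint redo: {end >= redo_after_update}")
```

The larger file ends exactly at the redo LSN of the checkpoint after the
inserts, and neither file reaches the redo LSN of the checkpoint after
the update, which is consistent with the off-by-one reading.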