On Tue, Nov 14, 2023 at 8:12 AM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > 0001 looks OK to push, and since it stands on its own I would get it out > of the way soon rather than waiting for the rest of the series to be > further reviewed.
All right, done. > 0003: > AmWalSummarizerProcess() is unused. Remove? The intent seems to be to have one of these per enum value, whether it gets used or not. Some of the others aren't used, either. > MaybeWalSummarizer() is called on each ServerLoop() in postmaster.c? > This causes a function call to be emitted every time through. That > looks odd. All other process starts have some triggering condition. I'm not sure how much this matters, really. I would expect that the function call overhead here wouldn't be very noticeable. Generally I think that when ServerLoop returns from WaitEventSetWait it's going to be because we need to fork a process. That's pretty expensive compared to a function call. If we can iterate through this loop lots of times without doing any real work then it might matter, but I feel like that's probably not the case, and probably something we would want to fix if it were the case. Now, I could nevertheless move some of the triggering conditions in MaybeStartWalSummarizer(), but moving, say, just the summarize_wal condition wouldn't be enough to avoid having MaybeStartWalSummarizer() called repeatedly when there was no work to do, because summarize_wal could be true and the summarizer could all be running. Similarly, if I move just the WalSummarizerPID == 0 condition, the function gets called repeatedly without doing anything when summarize_wal = off. So at a minimum you have to move both of those if you care about avoiding the function call overhead, and then you have to wonder if you care about the corner cases where the function would be called repeatedly for no gain even then. Another approach would be to make the function static inline rather than just static. Or we could delete the whole function and just duplicate the logic it contains at both call sites. Personally I'm inclined to just leave it how it is in the absence of some evidence that there's a real problem here. It's nice to have all the triggering conditions in one place with nothing duplicated. > GetOldestUnsummarizedLSN uses while(true), but WaitForWalSummarization > and SummarizeWAL use while(1). Maybe settle on one style? OK. > 0004: > in PrepareForIncrementalBackup(), the logic that determines > earliest_wal_range_tli and latest_wal_range_tli looks pretty weird. I > think it works fine if there's a single timeline, but not otherwise. > Or maybe the trick is that it relies on timelines returned by > readTimeLineHistory being sorted backwards? If so, maybe add a comment > about that somewhere; I don't think other callers of readTimeLineHistory > make that assumption. It does indeed rely on that assumption, and the comment at the top of the for (i = 0; i < num_wal_ranges; ++i) loop explains that. Note also the comment just below that begins "If we found this TLI in the server's history". I agree with you that this logic looks strange, and it's possible that there's some better way to do encode the idea than what I've done here, but I think it might be just that the particular calculation we're trying to do here is strange. It's almost easier to understand the logic if you start by reading the sanity checks ("manifest requires WAL from initial timeline %u starting at %X/%X, but that timeline begins at %X/%X" et. al.), look at the triggering conditions for those, and then work upward to see how earliest/latest_wal_range_tli get set, and then look up from there to see how saw_earliest/latest_wal_range_tli are used in computing those values. We do rely on the ordering assumption elsewhere. For example, in XLogFileReadAnyTLI, see if (tli < curFileTLI) break. We also use it to set expectedTLEs, which is documented to have this property. And AddWALInfoToBackupManifest relies on it too; see the comment "Because the timeline history file lists newer timelines before older ones" in AddWALInfoToBackupManifest. We're not entirely consistent about this, e.g., unlike XLogFileReadAnyTLI, tliInHistory() and tliOfPointInHistory() don't have an early exit provision, but we do use it some places. -- Robert Haas EDB: http://www.enterprisedb.com