At Fri, 6 Aug 2021 02:34:24 +0000, "Bossart, Nathan" <bossa...@amazon.com> wrote in
> On 8/5/21, 6:26 PM, "Kyotaro Horiguchi" <horikyota....@gmail.com> wrote:
> > It works the current way always at the first iteration of
> > pgarch_ArchiveCopyLoop() because in the last iteration of
> > pgarch_ArchiveCopyLoop(), pgarch_readyXlog() erases the last
> > anticipated segment. The shortcut works only when
> > pgarch_ArchiveCopyLoop archives more than one successive segment at
> > once. If the anticipated next segment is found to be missing a .ready
> > file while archiving multiple files, pgarch_readyXlog falls back to
> > the regular way.
> >
> > So I don't see the danger to happen perhaps you are considering.
>
> I think my concern is that there's no guarantee that we will ever do
> another directory scan. A server that's generating a lot of WAL could
> theoretically keep us in the next-anticipated-log code path
> indefinitely.

Theoretically possible. Suppose .ready files can be created out of
order (the reason below being one possibility). Once the fast path has
bailed out and the fallback path finds that only the second-oldest
file has a .ready file, the subsequent fast path keeps running and
leaves the oldest file behind.

> > In the first place, .ready are added while holding WALWriteLock in
> > XLogWrite, and while removing old segments after a checkpoint (which
> > happens while recovery). Assuming that no one manually remove .ready
> > files on an active server, the former is the sole place doing that. So
> > I don't see a chance that .ready files are created out-of-order way.
>
> Perhaps a more convincing example is when XLogArchiveNotify() fails.
> AFAICT this can fail without ERROR-ing, in which case the server can
> continue writing WAL and creating .ready files for later segments. At
> some point, the checkpointer process will call RemoveOldXlogFiles()
> and try to create the missing .ready file.

Mmm. Assuming that can happen, a history file loses its chance to be
archived forever once that failure hits it. Apart from this patch,
maybe we need some way to re-notify history files that have missed
their chance.

Assuming that all such forgotten files are eventually re-marked as
.ready somewhere, the archiver can find them again if the fallback
path is triggered explicitly. Currently the trigger fires implicitly,
by checking for shared timeline movement. If it were instead fired by,
for example, a signal as mentioned in a nearby message, that behavior
would be easy to implement.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
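
P.S. To make the scenario above concrete, here is a toy model of it.
This is only a sketch and not the patch's code: segments are plain
integers, run_archiver() and its arrivals/scan_every parameters are
hypothetical names, and the periodic forced scan merely stands in for
an explicit trigger such as a signal. Segment 1 gets its .ready file
late, as if XLogArchiveNotify() had failed and the checkpointer
created the file afterwards.

```python
def run_archiver(arrivals, steps, scan_every=None):
    """Toy model of the archiver loop (not PostgreSQL code).

    arrivals:   dict mapping a step number to the segments whose .ready
                file appears just before that step (hypothetical input).
    scan_every: if set, force the fallback scan once this many
                consecutive fast-path picks have happened (an assumed
                stand-in for an explicit trigger such as a signal).
    Returns the order in which segments were archived.
    """
    ready = set()          # segments that currently have a .ready file
    anticipated = None     # next segment the fast path expects
    since_scan = 0         # fast-path picks since the last full scan
    order = []

    for step in range(steps):
        ready.update(arrivals.get(step, ()))
        force = scan_every is not None and since_scan >= scan_every

        if anticipated in ready and not force:
            # Fast path: the anticipated next segment is ready.
            seg = anticipated
            since_scan += 1
        elif ready:
            # Fallback path: "directory scan" picks the oldest segment.
            seg = min(ready)
            since_scan = 0
        else:
            # Nothing to archive; forget the anticipation.
            anticipated = None
            continue

        ready.discard(seg)
        order.append(seg)
        anticipated = seg + 1

    return order


# Segment 1's .ready file shows up only at step 1, after 2 and 3.
arrivals = {0: [2, 3], 1: [1, 4], 2: [5], 3: [6], 4: [7]}

print(run_archiver(arrivals, 8))                # [2, 3, 4, 5, 6, 7, 1]
print(run_archiver(arrivals, 8, scan_every=2))  # [2, 3, 4, 1, 5, 6, 7]
```

In the first run, segment 1 waits until WAL generation pauses and the
fast path finally misses. In the second, the forced scan picks it up
after two fast-path picks.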