On Wed, Feb 3, 2021 at 05:00:19PM -0800, Andres Freund wrote:
> Hi,
>
> On 2021-02-03 19:21:25 -0500, Bruce Momjian wrote:
> > On Wed, Feb 3, 2021 at 03:29:13PM -0800, Andres Freund wrote:
> > > Changing this is *completely* infeasible. In a lot of workloads it'd
> > > cause a *massive* explosion of WAL volume. Like quadratically. You'll
> > > need to find another way to generate a nonce.
> >
> > Do we often do multiple writes to the file system of the same page
> > during a single checkpoint, particularly only-hint-bit-modified pages?
> > I didn't think so.
>
> It can easily happen. Consider ringbuffer using scans (like vacuum,
> seqscan) - they'll force the buffer out to disk soon after it's been
> dirtied. And often will read the same page again a short bit later. Or
> just any workload that's a bit bigger than shared buffers (but data is
> in the OS cache). Subsequent scans will often have new hint bits to
> set.

Oh, good point.

> > Is the logical approach here to modify XLogSaveBufferForHint() so if a
> > page write is not needed, to create a dummy WAL record that just
> > increments the WAL location and updates the page LSN?
> > (Is there a small WAL record I should reuse?)
>
> I think an explicit record type would be better. Or a hint record
> without an associated FPW.

OK.

> > I can try to add a hint-bit-page-write page counter, but that might
> > overflow, and then we will need a way to change the LSN anyway.
>
> That's just a question of width...

Yeah, the hint bit counter is just delaying the inevitable, plus it
changes the page format, which I am trying to avoid. Also, I need this
dummy record only if the page is marked clean, meaning a write to the
file system already happened in the current checkpoint --- should not be
too bad.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee