On Tue, May 14, 2019 at 01:59:10PM +0900, Kyotaro HORIGUCHI wrote:
> At Sun, 12 May 2019 17:37:05 -0700, Noah Misch <n...@leadboat.com> wrote in
> <20190513003705.ga1202...@rfd.leadboat.com>
> > On Sun, Mar 31, 2019 at 03:31:58PM -0700, Noah Misch wrote:
> > > On Sun, Mar 10, 2019 at 07:27:08PM -0700, Noah Misch wrote:
> > > > I also liked the design in the
> > > > https://postgr.es/m/559fa0ba.3080...@iki.fi
> > > > last paragraph, and I suspect it would have been no harder to
> > > > back-patch.  I wonder if it would have been simpler and better,
> > > > but I'm not asking anyone to investigate that.
> > >
> > > Now I am asking for that.  Would anyone like to try implementing
> > > that other design, to see how much simpler it would be?
>
> Yeah, I think it is a bit too-complex for the value. But I think
> it is the best way as far as we keep reusing a file on
> truncation of the whole file.

The design of v11-0006-Fix-WAL-skipping-feature.patch doesn't, in general,
work for WAL records touching more than one buffer.  For heapam, that patch
works around this problem by emitting XLOG_HEAP_INSERT or XLOG_HEAP_DELETE
when we'd normally emit XLOG_HEAP_UPDATE.  As a result, post-crash-recovery
heap page bits differ from the bits present when we don't crash.  Though I'm
85% confident this does not introduce a bug today, this is fragile.  That is
the main complexity I wish to avoid.

I suspect the design in the https://postgr.es/m/559fa0ba.3080...@iki.fi last
paragraph will be simpler, not more complex.  In the implementation I'm
envisioning, smgrDoPendingDeletes() would change name, perhaps to
AtEOXact_Storage().  For every relfilenode it does not delete, it would
ensure durability by syncing (for large nodes) or by WAL-logging each page
(for small nodes).  RelationNeedsWAL() would return false whenever the
applicable relfilenode appears in pendingDeletes.  Access methods would
remove their smgrimmedsync() calls, but they would otherwise not change.

Would anyone like to try implementing that?
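
To make the control flow concrete, here is a toy model of the commit path in
standalone C.  It is only a sketch, not PostgreSQL code: PendingEntry,
toy_relation_needs_wal(), toy_at_eoxact_storage() and SYNC_THRESHOLD_PAGES
are hypothetical names invented for illustration, the large/small cutoff is
an arbitrary placeholder, and the real work (unlinking, fsync, emitting WAL
records) is reduced to recording which action would be taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff: the mail only distinguishes "large" from "small". */
#define SYNC_THRESHOLD_PAGES 64

/* What the end-of-transaction pass decided for one relfilenode. */
typedef enum
{
    OUTCOME_NONE,
    OUTCOME_DELETED,
    OUTCOME_SYNCED,
    OUTCOME_WAL_LOGGED
} Outcome;

/* Toy stand-in for one pendingDeletes entry (not the real struct). */
typedef struct
{
    unsigned relfilenode;
    unsigned npages;
    bool     delete_at_commit;  /* true: file goes away if we commit */
    Outcome  outcome;
} PendingEntry;

/*
 * Toy RelationNeedsWAL(): false whenever the relfilenode appears in the
 * pending list, so access methods skip WAL for it during the transaction.
 */
bool
toy_relation_needs_wal(const PendingEntry *pending, size_t n,
                       unsigned relfilenode)
{
    assert(pending != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pending[i].relfilenode == relfilenode)
            return false;
    return true;
}

/*
 * Toy AtEOXact_Storage(), commit path only: delete what must be deleted;
 * make every surviving relfilenode durable, by sync if it is large or by
 * WAL-logging each page if it is small.
 */
void
toy_at_eoxact_storage(PendingEntry *pending, size_t n)
{
    assert(pending != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (pending[i].delete_at_commit)
            pending[i].outcome = OUTCOME_DELETED;
        else if (pending[i].npages >= SYNC_THRESHOLD_PAGES)
            pending[i].outcome = OUTCOME_SYNCED;
        else
            pending[i].outcome = OUTCOME_WAL_LOGGED;
    }
}
```

The point of the sketch is that the durability decision lives in one place,
at end of transaction, which is why access methods would need no change
beyond dropping their smgrimmedsync() calls.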