On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 4 April 2018 at 13:29, Thomas Munro <thomas.mu...@enterprisedb.com>
> wrote:
>>         /* Ensure that we skip any errors that predate opening of the file */
>>         f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>>
>> [...]
>
> Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel
> will deliberately hide writeback errors that predate our fsync() call from
> us?

Predates the opening of the file by the process that calls fsync().
Yeah, it sure looks that way based on the above code fragment.  Does
anyone know better?

> Does that mean that the ONLY ways to do reliable I/O are:
>
> - single-process, single-file-descriptor write() then fsync(); on failure,
> retry all work since last successful fsync()

I suppose you could come up with some crazy complicated IPC scheme to
make sure that the checkpointer always has an fd older than any writes
to be flushed, with some fallback strategy for when it can't take any
more fds.  I haven't got any good ideas right now.

> - direct I/O

As a bit of an aside, I gather that when you resize files (think
truncating/extending relation files) you still need to call fsync()
even if you read/write all data with O_DIRECT, to make it flush the
filesystem meta-data.  I have no idea if that could also be affected
by eaten writeback errors.

-- 
Thomas Munro
http://www.enterprisedb.com
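
For concreteness, the first option in the quoted list (one process, one
file descriptor, write() then fsync(), redo everything on failure) might
look roughly like the sketch below.  flush_batch and write_all_at are
made-up names, not PostgreSQL or kernel functions, and the sketch assumes
the caller still holds the whole batch in memory so that it can be
rewritten.  Whether this is actually sufficient on a given kernel is
exactly the open question in this thread.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes at offset off, looping over short writes and EINTR. */
static int write_all_at(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress: treat as an error */
        buf += n;
        len -= (size_t) n;
        off += n;
    }
    return 0;
}

/*
 * Hypothetical helper: push one batch through a single long-lived fd.
 * The fd must have been opened before any of the batch was written, so
 * that (per the kernel fragment quoted above) no relevant writeback error
 * predates the open.  If fsync() fails we do NOT simply call fsync()
 * again; we rewrite everything since the last successful fsync() first.
 * Returns 0 once a write+fsync pass succeeds, -1 after max_attempts.
 */
int flush_batch(int fd, const char *buf, size_t len, off_t off,
                int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (write_all_at(fd, buf, len, off) != 0)
            continue;           /* redo the whole batch */
        if (fsync(fd) == 0)
            return 0;           /* batch is durable as far as we can tell */
        /* fsync() failed: loop around and rewrite the batch from memory */
    }
    return -1;
}
```

A caller would open the file once, keep that fd for every write in the
batch, and treat a -1 return as a hard failure instead of retrying
fsync() on its own.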