Hi,

On 2019-01-22 14:29:23 -0600, Kevin Grittner wrote:
> On Tue, Jan 22, 2019 at 12:17 PM Andres Freund <and...@anarazel.de> wrote:
>
> > Unfortunately, unless something has changed recently, that patch is
> > *not* sufficient to really solve the issue - we don't guarantee that
> > there's always an fd preventing the necessary information from being
> > evicted from memory:
>
> But we can't lose an FD without either closing it or suffering an
> abrupt termination that would trigger a PANIC, can we? And close()
> always calls fsync(). And I thought our "PANIC on fsync" patch paid
> attention to close(). How do you see this happening???

close() doesn't trigger an fsync() in general (although it does on many
NFS implementations), and doing so would be *terrible* for
performance. Given that, it's pretty clear how you can get all FDs
closed, right? You just need enough open files that some get closed
due to max_files_per_process, and you can run into the issue. A
thousand open files is pretty easy to reach with forks, indexes,
partitions etc, so this isn't particularly artificial.

> > Note that we might still lose the error if the inode gets evicted from
> > the cache before anything can reopen it, but that was the case before
> > errseq_t was merged. At LSF/MM we had some discussion about keeping
> > inodes with unreported writeback errors around in the cache for longer
> > (possibly indefinitely), but that's really a separate problem"
> >
> > And that's entirely possible in postgres.
>
> Is it possible for an inode to be evicted while there is an open FD
> referencing it?

No, but we don't guarantee that there's always an FD open, due to the

> > The commit was discussed on list too, btw...
>
> Can you point to a post explaining how the inode can be evicted?

https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
is, I think, a good overview, with a bunch of links. As is the
referenced lwn article [1] and the commit message you linked.

Greetings,

Andres Freund

[1] https://lwn.net/Articles/752063/
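
To illustrate how a file can end up with no open FD once a per-process
limit is reached, here is a minimal Python sketch. It assumes a toy LRU
cache of descriptors: ToyFdCache, demo() and the rel_N file names are
hypothetical and are not PostgreSQL's actual fd.c code, and the cap of 3
merely stands in for max_files_per_process. The eviction path uses a
plain close() with no fsync(), which is the window discussed above.

```python
import os
import tempfile
from collections import OrderedDict


class ToyFdCache:
    """Toy LRU cache of open fds (hypothetical; not PostgreSQL's fd.c)."""

    def __init__(self, max_open):
        self.max_open = max_open        # stand-in for max_files_per_process
        self.open_fds = OrderedDict()   # path -> fd, least recently used first
        self.closed_without_fsync = []  # paths closed only to stay under the cap

    def write(self, path, data):
        fd = self.open_fds.pop(path, None)
        if fd is None:
            if len(self.open_fds) >= self.max_open:
                # Evict the least recently used fd: plain close(), no fsync().
                victim, victim_fd = self.open_fds.popitem(last=False)
                os.close(victim_fd)
                self.closed_without_fsync.append(victim)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.open_fds[path] = fd        # (re)insert as most recently used
        os.write(fd, data)

    def close_all(self):
        for fd in self.open_fds.values():
            os.close(fd)
        self.open_fds.clear()


def demo(n_files=5, max_open=3):
    """Write to more files than the cap allows and report what got closed."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = ToyFdCache(max_open)
        for i in range(n_files):
            cache.write(os.path.join(tmp, f"rel_{i}"), b"dirty page\n")
        evicted = [os.path.basename(p) for p in cache.closed_without_fsync]
        still_open = [os.path.basename(p) for p in cache.open_fds]
        cache.close_all()
        return evicted, still_open


if __name__ == "__main__":
    evicted, still_open = demo()
    print("closed with dirty data, no fd left:", evicted)
    print("still open:", still_open)
```

In this sketch the files listed as evicted have written data but no
descriptor referencing them, so nothing in the process pins their
inodes until something reopens them.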