Greg Stark <st...@mit.edu> writes:
> On Thu, Feb 6, 2014 at 11:48 PM, Andres Freund <and...@2ndquadrant.com> wrote:
>> That's not necessarily true. If e.g. the buffer mapping would change
>> racily, the result write from the bgwriter could very well end up
>> increasing the file size, leaving a hole inbetween its write and the
>> original size.
> a) the segment isn't sparse and b) there were whole segments full of
> nuls between the end of the tables and the final blocks.

> So the file was definitely extended by Postgres, not the OS and the
> bgwriter passes EXTENSION_FAIL which means it wouldn't create those
> intervening segments.

But ... when InRecovery, md.c will create such segments too.  We had
dismissed that on the grounds that the files would be sparse because of
the way md.c creates them.  However, it is real damn hard to see how the
loop in XLogReadBufferExtended could've accessed a bogus block, other than
hardware misfeasance which I don't believe any more than you do.  The blkno
that's passed to that function came directly out of a WAL record that's in
the private memory of the startup process and recently passed a CRC check.
You'd have to believe some sort of asynchronous memory clobber inside the
startup process.

On the other hand, if _mdfd_getseg did the deed, there's a whole lot more
space for something funny to have happened, because now we're talking about
a buffer being written in preparation for eviction from shared buffers,
long after WAL replay filled it.

So I'm wondering if there's something wrong with our deduction from file
non-sparseness.  In this connection, google quickly found me a report of
XFS "losing" the sparse state of a file across multiple writes:
http://oss.sgi.com/archives/xfs/2011-06/msg00225.html

I wonder whether that bug or a similar one exists in your production
kernel.

			regards, tom lane
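
For anyone wanting to re-check the non-sparseness deduction on the affected segment files, here is a minimal sketch of one way to compare a file's apparent length with the space the filesystem reports as allocated. It is not taken from the thread: the names is_sparse, report and BLOCK_UNIT are hypothetical, and it assumes a Unix-like platform where st_blocks is available and counted in 512-byte units. Note that it only shows what the filesystem reports now; if the kernel has "lost" the sparse state as in the XFS report above, a file that was created sparse would show up here as fully allocated.

```python
import os

# Assumption: st_blocks is counted in 512-byte units, as on Linux.
BLOCK_UNIT = 512


def is_sparse(apparent_size, allocated_blocks):
    """Return True when fewer bytes are allocated than the apparent length."""
    return allocated_blocks * BLOCK_UNIT < apparent_size


def report(path):
    """Return (path, apparent size, allocated bytes, sparse?) for one file."""
    st = os.stat(path)
    allocated = st.st_blocks * BLOCK_UNIT
    return path, st.st_size, allocated, is_sparse(st.st_size, st.st_blocks)
```

Calling report() on each segment file of the relation and comparing the second and third fields would show whether the all-zero segments occupy real disk blocks or are holes.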