On Thu, Apr 21, 2011 at 1:26 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Daniel Farina points out to me that the Linux man page for fsync() says
> "Calling fsync() does not necessarily ensure that the entry in the directory
> containing the file has also reached disk. For that an explicit fsync() on a
> file descriptor for the directory is also needed."
> http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html

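For concreteness, the sequence that man page text implies might look roughly like the sketch below. It is in Python only for brevity; the helper name durable_rename is made up for illustration, and it assumes Linux and that source and destination live in the same directory (otherwise both directories would need an fsync()). It is not a description of what PostgreSQL does.

```python
import os

def durable_rename(src, dst):
    """Hypothetical helper: rename src to dst and try to make the rename durable."""
    # Flush the file's own contents first; fsync() on the file says
    # nothing about the directory entry that points at it.
    fd = os.open(src, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    # The rename is atomic, but not necessarily on disk when it returns.
    os.rename(src, dst)

    # Explicit fsync() on a descriptor for the containing directory,
    # as the man page describes.
    dir_fd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A crash between os.rename() and the directory fsync() can still leave either the old or the new name on disk, which is the window discussed in the reply that follows.
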
I'd also like to point out that even on ext(2|3) there is a special mount
option, 'dirsync', and a corresponding directory attribute (see 'chattr').
They exist mostly for the benefit of the authors of MTAs that do a lot of
metadata manipulation, and they make all directory metadata changes
synchronous, to get around non-durable metadata operations.

Even if you use fsync(), a crash between the rename() and the fsync() will
leave you in either the pre-move or the post-move state: the rename is
atomic, but not durable. Synchronous directory modification ensures that the
return of rename() coincides with the durability of the rename itself, or so
I would think.

I only found this while doing some research into how to perform a two-phase
commit between postgres and the file system, and while reading the kernel
source. I admit it's a dusty and obscure corner, but it still seems to be in
use by said MTAs.

Would a reading and exploration of the kernel code at hand perhaps help
resolve this discussion, one way or another?

--
fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers