On 02/14/2010 03:24 PM, Florian Weimer wrote:
* Tom Lane:
Which options would that be? I am not aware that there are any for any of the
recent Linux filesystems.
Shouldn't journaling of metadata be sufficient?
You also need to enforce ordering between the directory update and the
file update.  The file metadata is flushed with fsync(), but the
directory isn't.  On some systems, all directory operations are
synchronous, but not on Linux.

       dirsync
              All directory updates within the filesystem should be done
              synchronously. This affects the following system calls: creat,
              link, unlink, symlink, mkdir, rmdir, mknod and rename.

The widely reported problems, though, were generally not caused by directory changes being written too late, but by directory changes being written too early. That is, the directory change reaches disk, but the file content does not. This is likely a consequence of the "ordered journal" mode widely used in ext3/ext4, where metadata changes are journalled but file pages are not. For some operations it is therefore important that the file pages are pushed to disk using fsync(file) before the metadata changes are journalled.
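
As a rough illustration of that ordering (a minimal sketch, not anything taken from PostgreSQL; the function name replace_file_durably and the temp-file-plus-rename pattern are my own assumptions), the file contents are fsync()ed before the rename that changes the directory:

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Hypothetical helper: write data to a temp file, fsync it, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # drain Python's userspace buffer
            os.fsync(f.fileno())  # push the file pages to disk first
        # Only now make the metadata change that exposes the new contents.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the rename were issued without the fsync(), an ordered-mode journal could commit the new directory entry while the data pages were still only in memory.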

In theory there is a hole here: directory updates need to be synchronized with file updates, POSIX doesn't enforce this ordering, and we can't trust every file system to order things correctly on its own. In practice, though, I don't see this sort of problem happening.

If you are concerned, enable dirsync.
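
Short of mounting with dirsync, an application can also fsync() the containing directory itself after a directory operation. A minimal sketch, assuming Linux (where a directory can be opened read-only and passed to fsync()) and assuming source and destination live in the same directory; fsync_directory and rename_durably are hypothetical names:

```python
import os


def fsync_directory(dirpath):
    """Hypothetical helper: flush a directory's own entries to disk."""
    # O_DIRECTORY is not available everywhere; fall back to a plain read-only open.
    flags = os.O_RDONLY | getattr(os, "O_DIRECTORY", 0)
    fd = os.open(dirpath, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def rename_durably(src, dst):
    """Rename src to dst, then flush the directory that holds dst.

    Assumes src and dst are in the same directory.
    """
    os.rename(src, dst)
    fsync_directory(os.path.dirname(os.path.abspath(dst)))
```

This costs one extra fsync() per operation, but only for the operations that need it, rather than making every directory update on the filesystem synchronous.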

Cheers,
mark

--
Mark Mielke <m...@mielke.cc>


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)