On Thu, 1 Apr 2010, Edward Ned Harvey wrote:

Dude, don't be so arrogant.  Acting like you know what I'm talking about
better than I do.  Face it that you have something to learn here.

Geez!

Yes, all the transactions in a transaction group are either committed
entirely to disk, or not at all.  But they're not necessarily committed to
disk in the same order that the user level applications requested.  Meaning:
If I have an application that writes to disk in "sync" mode intentionally
... perhaps because my internal file format consistency would be corrupt if
I wrote out-of-order ... If the sysadmin has disabled ZIL, my "sync" write
will not block, and I will happily issue more write operations.  As long as
the OS remains operational, no problem.  The OS keeps the filesystem
consistent in RAM, and correctly manages all the open file handles.  But if
the OS dies for some reason, some of my later writes may have been committed
to disk while some of my earlier writes could be lost, which were still
being buffered in system RAM for a later transaction group.
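
For illustration, the kind of application described above might look like the following minimal Python sketch. The file layout, the COMMIT marker, and the function names are assumptions made up for this example, not taken from any real program. The point is only that the second write is issued after fsync() has returned for the first, so the application assumes the first is already on stable storage.

```python
import os
import tempfile


def write_all(fd: int, payload: bytes) -> None:
    """Write the whole payload, looping in case os.write() is partial."""
    view = memoryview(payload)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def append_durably(fd: int, payload: bytes) -> None:
    """Append payload and block until the OS reports it as durable."""
    write_all(fd, payload)
    # The application trusts this call: when it returns, the data is
    # assumed to be on stable storage.
    os.fsync(fd)


def write_record_then_commit(path: str, record: bytes) -> None:
    """Hypothetical file format: a record is valid only if COMMIT follows it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        append_durably(fd, record)        # step 1: data first
        append_durably(fd, b"COMMIT\n")   # step 2: marker only after step 1
    finally:
        os.close(fd)


def demo() -> bytes:
    """Run the sequence in a temporary directory and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.dat")
        write_record_then_commit(path, b"record-1\n")
        with open(path, "rb") as f:
            return f.read()
```

If the fsync() in step 1 returns before the data is durable, the ordering guarantee this file format depends on is gone.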

The purpose of the ZIL is to act as a fast "log" for synchronous writes. It allows the system to confirm a synchronous write request quickly, with the minimum amount of work. As you say, the "OS keeps the filesystem consistent in RAM". There is no 1:1 correspondence between application write requests and zfs writes. In fact, if the same portion of a file is updated many times, or the file is created and deleted many times, zfs writes only the data which is current when the next TXG is written. For a synchronous write, zfs advances its index in the slog once the corresponding data has been committed in a TXG. In other words, the "sync" and "async" write paths are the same when it comes to writing final data to disk.
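
As a rough illustration of the coalescing described above, here is a toy Python model. It is only a sketch: the class and attribute names are invented for this example and say nothing about how zfs is actually implemented. Repeated updates to the same offset within one open transaction group replace each other in RAM, and only the latest version reaches "disk" when the group is written.

```python
class ToyTxg:
    """Toy model of one open transaction group (not real zfs structures)."""

    def __init__(self):
        self.pending = {}     # offset -> latest data held in RAM
        self.disk = {}        # offset -> data that has been written out
        self.disk_writes = 0  # number of blocks actually written

    def write(self, offset, data):
        # A later update to the same offset replaces the earlier one,
        # so the earlier version is never written to disk.
        self.pending[offset] = data

    def commit(self):
        # Write only what is current at the moment the group is committed.
        for offset, data in self.pending.items():
            self.disk[offset] = data
            self.disk_writes += 1
        self.pending.clear()


txg = ToyTxg()
txg.write(0, b"v1")
txg.write(0, b"v2")
txg.write(0, b"v3")
txg.write(4096, b"other")
txg.commit()
```

Four application writes result in two block writes in this model, and offset 0 holds only the final version.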

There is, however, the recovery case in which synchronous writes were acknowledged but had not yet been written in a TXG when the system spontaneously rebooted. In this case the synchronous writes will be replayed from the slog, and uncommitted async writes will have been lost. Perhaps this is the case you are worried about.
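
The recovery case can be sketched with another toy Python model. Again this is an assumption-laden illustration, not the real zfs code path: ToyPool and its methods are hypothetical names. Sync writes are recorded in a log before being acknowledged, everything pending in RAM is lost on a crash, and recovery replays the log.

```python
class ToyPool:
    """Toy model of sync/async writes with an intent log (not real zfs)."""

    def __init__(self):
        self.disk = {}     # data committed in a transaction group
        self.pending = {}  # data buffered in RAM for the next group
        self.slog = []     # log of acknowledged sync writes

    def write(self, key, data, sync=False):
        self.pending[key] = data
        if sync:
            # A sync write is acknowledged only after it is in the log.
            self.slog.append((key, data))

    def commit_txg(self):
        # Sync and async data take the same path to final storage.
        self.disk.update(self.pending)
        self.pending.clear()
        self.slog.clear()  # logged entries are now covered by the commit

    def crash_and_recover(self):
        self.pending.clear()  # RAM contents are gone
        for key, data in self.slog:
            self.disk[key] = data  # replay acknowledged sync writes
        self.slog.clear()


pool = ToyPool()
pool.write("async-first", b"1")             # earlier, async
pool.write("sync-second", b"2", sync=True)  # later, sync
pool.crash_and_recover()
```

After recovery in this model, the later sync write is present and the earlier async write is not.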

It does seem that rolling back to a snapshot helps here (by ensuring that sync and async data are consistent), but it certainly does not help any NFS clients. Only a broken application uses sync writes sometimes, and async writes at other times.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
