On 6/24/06, Neil Perrin <[EMAIL PROTECTED]> wrote:

> The data will be written twice when ZFS is used under NFS. This is
> because NFS, on closing the file, internally uses fsync to force the
> writes to be committed. That causes the ZIL to write the data to the
> intent log immediately. Later the data is written again as part of the
> pool's transaction group commit, at which point the intent log blocks
> are freed.
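
As a concrete illustration of the close-triggered commit, here is a minimal
user-level sketch; the mount path is hypothetical, and a real NFS client does
this flushing internally rather than the application calling fsync itself:

/*
 * Minimal sketch of the pattern described above: an application writes
 * and closes a file on an NFS mount (the path here is made up).  The
 * close() forces the NFS client to flush and COMMIT its dirty pages;
 * the server satisfies the COMMIT with fsync semantics, which on ZFS
 * means an immediate intent log (ZIL) write.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char buf[8192];
	int fd;

	(void) memset(buf, 'x', sizeof (buf));

	fd = open("/mnt/nfs/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		perror("open");
		exit(1);
	}

	/* Over NFS, these writes can go out as UNSTABLE (buffered). */
	if (write(fd, buf, sizeof (buf)) == -1) {
		perror("write");
		exit(1);
	}

	/*
	 * The commit happens here: on close, the client flushes and
	 * commits, and a ZFS-backed server logs the data in the ZIL.
	 */
	if (close(fd) == -1) {
		perror("close");
		exit(1);
	}
	return (0);
}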

In this case though, the file is left open, so there should be no synchronous
I/O.  (tcpdump -vv confirms that all writes are marked as unstable, and there
are no commits.)   Perhaps the NFS server is issuing the I/O synchronously
when it should not be, thus causing the double writes?
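
For reference, the stability levels in question come from the NFSv3 WRITE
call (RFC 1813). Below is a much-simplified sketch of the server-side
decision, with the real server logic replaced by plain write/fsync calls; it
only shows where the synchronous path should, and should not, be taken:

#include <stddef.h>
#include <unistd.h>

/* stable_how values from the NFSv3 WRITE call (RFC 1813). */
enum stable_how {
	UNSTABLE = 0,	/* server may buffer; client must COMMIT later */
	DATA_SYNC = 1,	/* data must reach stable storage before replying */
	FILE_SYNC = 2	/* data and metadata must both be stable */
};

/*
 * Much-simplified sketch of an NFSv3 server handling a WRITE; a real
 * nfsd is far more involved.  Only DATA_SYNC and FILE_SYNC writes (and
 * later COMMIT calls) should force an immediate commit, which on ZFS
 * means a ZIL write.  If the server syncs UNSTABLE writes as well,
 * every NFS write becomes synchronous: the suspected misbehavior above.
 */
static void
nfs_write_sketch(int fd, const void *buf, size_t len, enum stable_how stable)
{
	(void) write(fd, buf, len);	/* stage the data */

	if (stable != UNSTABLE)
		(void) fsync(fd);	/* commit to stable storage */
}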

> It does seem inefficient to write the data twice. In fact, for blocks
> larger than zfs_immediate_write_sz (formerly 64K, now 32K after the fix
> for 6440499) we write the data block once and an intent log record that
> contains just the block pointer. During txg commit we link this block
> into the pool tree. By experimentation we found 32K to be the (current)
> cutoff point. As the nfsds write at most 32K, they do not benefit from
> this.
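
A much-simplified sketch of that cutoff may help; the constant and the two
write states are loosely modelled on the ZFS intent log code, but the details
here are illustrative rather than the actual implementation:

#include <stddef.h>

#define	ZFS_IMMEDIATE_WRITE_SZ	(32 * 1024)	/* was 64K before 6440499 */

enum wr_state {
	WR_COPIED,	/* data copied into the log record: written twice */
	WR_INDIRECT	/* data block written once; record holds a pointer */
};

/*
 * Writes at or below the cutoff (everything an nfsd sends, since it
 * writes at most 32K) are copied into the intent log and written again
 * at txg commit.  Larger writes go to their final block once, and the
 * log record is later linked into the pool tree.
 */
static enum wr_state
zil_write_strategy(size_t len)
{
	return (len > ZFS_IMMEDIATE_WRITE_SZ ? WR_INDIRECT : WR_COPIED);
}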

That is an interesting coincidence, though perhaps it is not the same issue
after all.  While the disparity does not afflict UFS, it may simply have gone
unnoticed there, given the nature of UFS logging.  It seems possible that this
is entirely an NFS server issue.

> Anyway, this is an area we are actively working on.

This is good to know; the synchronous I/O performance is rather painful at the
moment.  Not that it would normally be a problem, but disk-image-based
filesystems in Mac OS X appear to do *all* I/O to their backing store
synchronously; as you might imagine, the performance over NFS is
spectacularly bad!

Anyway, thanks Neil.  I apologize if this is in fact unrelated to ZFS.

Chris