I noticed recently that the SSDs hosting the ZIL for my pool had a large
number in the SMART attribute for total LBAs written (with some
calculation, it comes out close to the total amount of data written to the
pool so far).  I did some testing and found that the ZIL is being used
quite heavily (at roughly the incoming write speed) on writes that should
be asynchronous.  I captured a simple copy of a large (8GB) file with
wireshark, and both the WRITE calls and the WRITE replies carried the
UNSTABLE stability flag I would expect for asynchronous writes.  The
server started writing heavily to the ZIL well before the copy finished,
so I wouldn't expect the client to have issued a COMMIT by that point.
Here are some relevant pieces of info (with hostnames, etc. removed):

client: Ubuntu 11.10
/etc/fstab entry:
<server>:/mainpool/storage  /mnt/myelin  nfs  bg,retry=5,soft,proto=tcp,intr,nfsvers=3,noatime,nodiratime,async  0  0

server: OpenIndiana oi_151a4
$ zfs get sync mainpool
NAME      PROPERTY  VALUE     SOURCE
mainpool  sync      standard  default
$ zfs get sync mainpool/storage
NAME              PROPERTY  VALUE     SOURCE
mainpool/storage  sync      standard  default
$ zfs get sharenfs mainpool/storage
NAME              PROPERTY  VALUE           SOURCE
mainpool/storage  sharenfs  rw=@xxx.xxx.37  local
$ zpool get version mainpool
NAME      PROPERTY  VALUE    SOURCE
mainpool  version   28       default
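
In case anyone wants to reproduce this, the check I'd run is something
like the following; the sync=disabled step is purely diagnostic (it
discards sync semantics entirely, so not something to leave on with real
data):

$ zpool iostat -v mainpool 1               # watch the log vdev rows while the copy runs
$ zfs set sync=disabled mainpool/storage   # diagnostic only: if the slog goes quiet, the writes were sync
  (rerun the copy and watch the log devices again)
$ zfs set sync=standard mainpool/storage   # restore the default afterwards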

The pool consists of 24 SATA disks arranged as two raidz2 groups of 12,
and originally had a mirrored log across 10GB slices on two Intel 320
80GB SSDs.  I have since rearranged the logs as non-mirrored, since a
single SSD doesn't quite keep up with gigabit network throughput, and
some testing convinced me that a pool survives a failing log device
gracefully as long as there isn't a simultaneous crash.  I would like to
switch them back to a mirrored configuration without impacting
asynchronous throughput, and obviously it would be nice to reduce the
write load on them so they live longer.  As far as I can tell, all NFS
writes are being written to the ZIL when many should simply be cached in
memory and bypass it.
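
(For the record, I assume going back to a mirrored log is just an attach
against the existing log device; the device names here are placeholders
for my actual slices:)

$ zpool attach mainpool c4t0d0s0 c4t1d0s0  # mirror the remaining log slice with the second SSD's slice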

Any help appreciated,
Tim