I have a test configuration up and running internally. I'm
not sure I'm seeing the exact same issues. For this testing
I'm using GRITS. I don't know of any really great FS performance
tools; I tend to do my performance testing with vdbench against raw
SCSI I/O, specifically to avoid FS caches. (I'm more focused on
iSCSI performance in the Solaris initiator.)
Configuration:
Ultra 20 (amd64) - GigE connection via nge
Solaris Nevada build 40 - non-debug
Cisco 48-port GigE switch
+-> Ultra 20 (amd64) host
+-> w2100z (2x amd64) host
+-> EqualLogic iSCSI array (3 x 20G LUNs mapped to the Ultra 20)
|
+-> lots of other unrelated hosts, arrays, etc...
3 x 20G iSCSI LUNs -> RAW IO <==> vdbench
  perf: ~100 MB/s avg.
3 x 20G iSCSI LUNs -> UFS <==> localhost GRITS
  perf: ~76.9 MB/s avg.
3 x 20G iSCSI LUNs -> NFS <==> w2100z (2x amd64) GRITS
  perf: ~34 MB/s avg.
3 x 20G iSCSI LUNs -> ZFS raidz <==> localhost GRITS
  perf: ~90 MB/s avg. (more bursty than UFS, 29 MB/s to 36 MB/s)
3 x 20G iSCSI LUNs -> ZFS raidz -> NFS <==> w2100z (2x amd64) GRITS
  perf: ~15 MB/s avg.
GRITS creates 10 directories, writes 10 files in each directory,
reads the files back, verifies their contents, deletes the files
and directories, and then repeats the process. The tool is
Java-based and designed more for file system data verification
than for performance testing.
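Roughly, each GRITS pass has this shape (a minimal Java sketch for
illustration only, not the actual tool; the 1 MB file size, fill
pattern, and directory/file naming are assumptions):

import java.io.*;
import java.util.Arrays;

// Sketch of one GRITS-style pass: make 10 directories, write 10 files
// into each, read the files back and verify their contents, then delete
// the files and directories, and repeat. The 1 MB file size and fill
// pattern are illustrative assumptions, not GRITS defaults.
public class GritsLikePass {
    static final int DIRS = 10, FILES = 10, SIZE = 1 << 20;

    public static void main(String[] args) throws IOException {
        File root = new File(args[0]);       // e.g. a UFS, ZFS, or NFS mount
        byte[] pattern = new byte[SIZE];
        Arrays.fill(pattern, (byte) 0x5a);   // known pattern to verify against

        while (true) {
            // create directories and write files
            for (int d = 0; d < DIRS; d++) {
                File dir = new File(root, "dir" + d);
                if (!dir.mkdir())
                    throw new IOException("mkdir failed: " + dir);
                for (int f = 0; f < FILES; f++) {
                    try (FileOutputStream out =
                             new FileOutputStream(new File(dir, "file" + f))) {
                        out.write(pattern);
                    }
                }
            }
            // read back and verify
            for (int d = 0; d < DIRS; d++) {
                for (int f = 0; f < FILES; f++) {
                    File file = new File(new File(root, "dir" + d), "file" + f);
                    byte[] back = new byte[SIZE];
                    try (DataInputStream in =
                             new DataInputStream(new FileInputStream(file))) {
                        in.readFully(back);
                    }
                    if (!Arrays.equals(pattern, back))
                        throw new IOException("verify failed: " + file);
                }
            }
            // delete files, then directories
            for (int d = 0; d < DIRS; d++) {
                File dir = new File(root, "dir" + d);
                for (int f = 0; f < FILES; f++)
                    new File(dir, "file" + f).delete();
                dir.delete();
            }
            System.out.println("pass complete");
        }
    }
}

Point it at the mount under test, e.g. the raidz pool locally or the
NFS mount of it from the w2100z.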
VDBENCH is used for performance testing and benchmarking.
I will hand off my configuration to the Sun NFS teams internally
to check out.
-David
Joe Little wrote:
Well, I tried some suggested iSCSI tunings to no avail. I did try
something else though: I brought up Samba. Copying my Linux 2.2 source
tree into the ZFS volume (in other words, SMB->ZFS->iSCSI) did far
better, taking a minute to copy 102MB. And that's from a
100MB/sec client.
My original tests, of around 2GB of mixed (mostly small) files, took 51
minutes in the NFS case, but exactly 3.5 minutes in the SMB case
(again, from a 100MB/sec client). I averaged almost 9MB/sec on that
transfer. These are not stellar numbers, but they are far better than
the low KB/sec that I see with NFS. I definitely think the bug is on
the NFS server end, even considering that the SMB protocol is different.
On 5/8/06, Joe Little <[EMAIL PROTECTED]> wrote:
I was asked to also snoop the iSCSI end of things, trying to find
something different between the two. iSCSI being relatively opaque, it
was easiest to look for differences in the packet patterns. In the
local copy to RAIDZ example, the iSCSI link would show packets of 1514
bytes in series of 5-10, with interspersed packets of 60 or 102 bytes,
generally 2-4 in number. With the NFS client hitting the RAIDZ/iSCSI
combo, the link would show on average 3-5 packets of 1514 bytes with
5-7 packets of 60 or 102 bytes in between. Basically, the averages
swapped, likely because of a lot more metadata traffic and/or write
confirmations going on in the NFS case.
At this point in time, I have two very important questions:
1) Are there any options available or planned to make NFS and ZFS work
more in concert, to avoid this overhead, which kills performance when
it turns into many small iSCSI packets?
2) Is iSCSI-backed storage, especially the StorageTek acquired
products, in the planning matrix for supported ZFS (NAS) solutions?
Also, why hasn't this combination been tested to date, since it
appears to be an Achilles' heel? Again, UFS does not have this
problem, nor do other file systems on other OSes (namely XFS, JFS,
etc., which I've tested before).
On 5/8/06, Nicolas Williams <[EMAIL PROTECTED]> wrote:
> On Fri, May 05, 2006 at 11:55:17PM -0500, Spencer Shepler wrote:
> > On Fri, Joe Little wrote:
> > > Thanks. I'm playing with it now, trying to get the most
> > > succinct test. This is one thing that bothers me: regardless of
> > > the backend, it appears that a delete of a large tree (say the
> > > Linux kernel) over NFS takes forever, but it's immediate when
> > > done locally. Does delete over NFS really take such a different
> > > code path?
> >
> > Yes. As mentioned in my other email, the NFS protocol requires
> > that operations like REMOVE, RMDIR, and CREATE have the filesystem
> > metadata written to stable storage/disk before sending a response
> > to the client. That is not required of local access, hence the
> > disparity between the two.
>
> So then multi-threading rm/rmdir on the client-side would help, no?
>
> Are there/should there be async versions of creat(2)/mkdir(2)/
> rmdir(2)/link(2)/unlink(2)/...?
>
> Nico
> --
>
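To make the client-side multi-threading idea above concrete, here is a
rough Java sketch (not from the thread): it overlaps the synchronous
REMOVE round trips by issuing file deletions from a pool of threads.
The thread count of 8 and the separate bottom-up rmdir pass are
arbitrary choices for illustration.

import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of a client-side parallel delete: each file removal over NFS is
// a synchronous REMOVE round trip, so issuing removals from a pool of
// threads lets many round trips be in flight at once instead of one at a
// time.
public class ParallelRm {
    public static void main(String[] args) throws InterruptedException {
        File root = new File(args[0]);
        ExecutorService pool = Executors.newFixedThreadPool(8);
        unlinkFiles(root, pool);                  // remove files in parallel
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        rmdirTree(root);                          // then directories, deepest first
    }

    // Submit every regular file under dir for deletion.
    static void unlinkFiles(File dir, ExecutorService pool) {
        File[] entries = dir.listFiles();
        if (entries == null) return;
        for (File e : entries) {
            if (e.isDirectory()) unlinkFiles(e, pool);
            else pool.submit(() -> { e.delete(); });  // unlink in a worker thread
        }
    }

    // Remove the now-empty directories bottom-up (RMDIR is synchronous too).
    static void rmdirTree(File dir) {
        File[] entries = dir.listFiles();
        if (entries != null)
            for (File e : entries)
                if (e.isDirectory()) rmdirTree(e);
        dir.delete();
    }
}

Whether this actually helps depends on the server being able to service
several synchronous metadata operations concurrently.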