Hello all,

After setting up a Solaris 10 machine with ZFS as the new NFS server,
I'm stumped by some serious performance problems.  Here are the
(admittedly long) details (also noted at
http://www.netmeister.org/blog/):

The machine in question is a dual-processor amd64 box with 2GB RAM and two
Broadcom gigabit NICs.  The OS is Solaris 10 6/06, and the storage consists
of a single zpool striped across the two halves of an Apple XRaid
(each half configured as RAID-5), providing a pool of 5.4 TB.
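In other words, roughly this (the pool name is a placeholder; the two
devices are the RAID-5 halves shown in the format output further down):

    zpool create tank c3t0d0 c3t1d0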

On the pool, I've created a total of 60 filesystems, each of them shared
via NFS, each of them with compression turned on.
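For each filesystem, that was essentially (names are placeholders):

    zfs create tank/home01
    zfs set compression=on tank/home01
    zfs set sharenfs=on tank/home01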

The clients (NetBSD) mount the filesystems with '-U -r32768 -w32768', and
initially everything looks just fine.  (The clients also do NIS against a
different server.)
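In full, the mount is something like this (server name and paths are
placeholders):

    mount_nfs -U -r32768 -w32768 nfsserver:/tank/home01 /home/jan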

Running 'ls' and 'ls -l' on a large directory looks just fine upon first
inspection.  Reading from the filesystem works fine, too:

(1) Running a 'find . -type f -print' on a directory with a total of 46450
files/subdirectories in it takes about 90 seconds, yielding an average I/O
size of 64 at around 2000 kr/s according to iostat(1M) (invocation sketched
below, after (3)).

(2) Running a 'dd if=/dev/zero of=blah bs=1024k count=128' takes about 18
seconds at almost 7MB/s (this is a 10/100 network).  To compare how this
measures up when not doing any file I/O, I ran 'dd if=/dev/zero bs=1024k
count=128 | ssh remotehost "cat - >/dev/null"', which took about 13
seconds.

(3) Reading from the NFS share ('dd if=blah of=/dev/null bs=1024k') takes
about 12 seconds.
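(For reference, the I/O-size and kr/s figures in (1) come from watching the
extended device statistics on the server, i.e. something along the lines of:

    iostat -xn 1

The exact flags are from memory.)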

All of this is perfectly acceptable.  Compared with the old NFS server
(which runs on IRIX), we get:

(1) takes significantly longer on IRIX: about 300 seconds
(2) is somewhat faster on IRIX: it takes about 14 seconds
(3) takes about the same (around 12 seconds)

(The comparison is not entirely fair, however:  the NFS share mounted from
the IRIX machine is also exported to about 90 other clients, and does see
its fair share of I/O, while the Solaris NFS share is only mounted on
this one client.)

Alright, so what's my beef?  Well, here's the fun part:  when I try to
actually use this NFS share as my home directory (as I do with the IRIX
NFS mount), performance somehow plummets.  Reading my inbox (~/.mail)
takes around 20 seconds (even though it holds only 60 messages).
When I run 'ktrace -i mutt' with the ktrace output going to the
NFS share, everything slows to a crawl.  While that command is
running, even a simple 'ls -la' of a small directory takes almost 5
seconds.

Neither the ktrace nor the mutt command can be killed right away --
they're blocking on I/O.

Alright, so after that finally finishes, I try something a bit simpler:  'vi
plaintext'.  Ok, that's as snappy as it should be.  Now 'vim plaintext'.  Ugh --
it takes almost 4 seconds for the editor to come up.  I tried all kinds
of other examples, but the one that stands out the most is creating a
number of nested directories:

# 100 top-level directories with 100 subdirectories each,
# i.e. 10,100 mkdir calls in total
for i in `jot 100`; do
        mkdir $i
        for j in `jot 100`; do
                mkdir $i/$j
        done
done

On the IRIX NFS share, this takes about 60 seconds.

On the Solaris NFS share, this takes... forever.  (I interrupted it after
10 minutes, when it had managed to create 2500 directories.)


tcpdump and snoop show me that traffic zips by as it should for the
operations described above ((1), (2) and (3)), but becomes very "bursty"
when doing reads and writes simultaneously or when creating the
directories.  That is, instead of a constant stream of packets zipping by,
tcpdump gives me about 15 lines per second, yet I can't find any
packet loss.
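For reference, the captures were along these lines (the interface names
are assumptions -- bge0 for the Broadcom NIC on the Solaris box, fxp0 on
the client -- and 'nfsserver' is a placeholder):

    # on the Solaris server
    snoop -d bge0 port 2049

    # on the NetBSD client
    tcpdump -n -i fxp0 host nfsserver and port 2049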

I've tried to rule out ZFS itself:  running the same tests locally on the
file server, directly against the ZFS filesystems, everything works just
fine.

I've tried mounting the filesystem over TCP, with different read/write
sizes, and with both NFSv2 and NFSv3 (the clients don't support NFSv4).
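That is, client-side variations along these lines (server name and paths
are placeholders):

    mount_nfs -T -3 -r32768 -w32768 nfsserver:/tank/home01 /home/jan
    mount_nfs -U -2 -r8192 -w8192 nfsserver:/tank/home01 /home/jan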

I've tried to rule out the NIC and the network by testing raw network
throughput and by connecting the machine to a different switch, etc.,
all to no avail.

I've played with every setting in /etc/default/nfs to no avail, and I just
can't put my finger on it.
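(Examples of the sort of knobs I mean; the values here are illustrative
only, not a recommendation:

    # /etc/default/nfs
    NFSD_SERVERS=256
    NFS_SERVER_VERSMAX=3

each followed by an 'svcadm restart svc:/network/nfs/server'.)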

Alright, so in my next attempt to see if I'm crazy or not, I installed
Solaris 10 6/06 on another workstation.  Mounting a ZFS filesystem served
from there (ZFS on a local SATA disk) works just dandy; all of the above
tests are fast.

So I reinstall the other machine.  After importing the old zpool, nothing
has changed.  I destroy the zpool and recreate it.  Still the same
problem.
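The recreation amounted to no more than this (pool name is a placeholder):

    zpool destroy tank
    zpool create tank c3t0d0 c3t1d0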

To ensure that it's not the SAN switch, I connect the Solaris machine
directly to the XRaid, and again, no change.

I destroy the RAID-5 config on the XRaid and build a RAID-0 across 7 of
the disks.  Creating a zpool on only this one (striped) disk also does not
change performance at all.  Creating a regular UFS on this disk, however,
immediately fixes the problems!  So it's not the fibre channel switch,
it's not the fibre channel cables, it's not the fibre channel card, it's
not the gigabit card, it's not the machine, and it's not the mount options;
it simply appears to be ZFS.  ZFS on an Apple XRaid, to be precise.  (Maybe
it's ZFS on fibre channel, I don't know; it's not ZFS per se, as the
other freshly installed machine with ZFS on a local SATA disk worked
fine.)
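The UFS test was essentially this (slice and paths are approximate):

    newfs /dev/rdsk/c3t0d0s0
    mount /dev/dsk/c3t0d0s0 /export/test
    share -F nfs -o rw /export/test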

Still trying to figure out what exactly is going on here, I then took
somebody else's advice and tried to see if maybe there is a relation
between the size of the zpool and the NFS performance.

Connecting the machine in question to a different XRaid with a 745 GB
RAID-5 disk, I created a single zpool on that disk.  Again, the
same performance problems as noted earlier.  Then I partitioned the disk
into a 100 GB partition and created a zpool on that.  Again, no
luck; performance still stinks.
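That is (pool name is a placeholder):

    zpool create testpool c3t2d0
    # (after destroying the pool and repartitioning with format(1M)
    # to get a 100 GB slice)
    zpool create testpool c3t2d0s0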

FWIW, format reports the XRaid disks as:

       2. c3t0d0 <APPLE-Xserve RAID-1.50-2.73TB>
          /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED],1/[EMAIL PROTECTED],0
       3. c3t1d0 <APPLE-Xserve RAID-1.50-2.73TB>
          /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED],1/[EMAIL PROTECTED],0
       4. c3t2d0 <APPLE-Xserve RAID-1.26-745.21GB>
          /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED],1/[EMAIL PROTECTED],0

Is there anybody here who's using ZFS on Apple XRaids and serving them
via NFS?  Does anybody have any other ideas what I could do to solve
this?  (I have, in the meantime, converted the XRaid to plain old UFS,
and performance there is perfectly fine, but I'd still be interested in
what exactly is going on.)

-Jan

-- 
http://www.eff.org
