On Fri, Ben Rockwood wrote:
> eric kustarz wrote:
> >So i'm guessing there's lots of files being created over NFS in one 
> >particular dataset?
> >
> >We should figure out how many creates/second you are doing over NFS (i 
> >should have put a timeout on the script).  Here's a real simple one 
> >(from your snoop it looked like you're only doing NFSv3, so i'm not 
> >tracking NFSv4):
> >"
> >#!/usr/sbin/dtrace -s
> >
> >rfs3_create:entry,
> >zfs_create:entry
> >{
> >        @creates[probefunc] = count();
> >}
> >
> >tick-60s
> >{
> >        exit(0);
> >}
> >"
> 
> 
> Eric, I love you. 
> 
> Running this bit of DTrace revealed more than 4,000 files being created 
> in almost any given 60 second window.  And I've only got one system that 
> would fit that sort of mass file creation: our Joyent Connector product's 
> Courier IMAP server which uses Maildir.  As a test I simply shut down 
> Courier and unmounted the mail NFS share for good measure and sure 
> enough the problem vanished and could not be reproduced.  10 minutes 
> later I re-enabled Courier and our problem came back. 
> 
> Clearly ZFS file creation is just amazingly heavy even with ZIL 
> disabled.  If creating 4,000 files in a minute squashes four 2.6GHz 
> Opteron cores we're in big trouble in the longer term.  In the meantime 
> I'm going to find a new home for our IMAP Mail so that the other things 
> served from that NFS server at least aren't affected.
> 
> You asked for the zpool and zfs info, which I don't want to share 
> because it's confidential (if you want it privately I'll do so, but not 
> on a public list), but I will say that it's a single massive zpool in 
> which we're using less than 2% of the capacity.   But in thinking about 
> this problem, even if we used 2 or more pools, the CPU consumption still 
> would have choked the system, right?  This leaves me really nervous 
> about what we'll do when it's not an internal mail server that's creating 
> all those files but a customer. 
> 
> Oddly enough, this might be a very good reason to use iSCSI instead of 
> NFS on the Thumper.
> 
> Eric, I owe you a couple cases of beer for sure.  I can't tell you how 
> much I appreciate your help.  Thanks to everyone else who chimed in with 
> ideas and suggestions, all of you guys are the best!

Good to hear that you have figured out what is happening, Ben.

For future reference, there are two commands that you may want to
make use of in observing the behavior of the NFS server and individual
filesystems.

There is the trusty nfsstat command.  In this case, you would have been
able to do something like:
        nfsstat -s -v3 60

This will provide all of the server-side NFSv3 statistics at 60-second
intervals.  

Then there is a new command, fsstat, that will provide vnode-level
activity on a per-filesystem basis.  Therefore, if the NFS server
has multiple filesystems active and you want to look at just one,
something like this can be helpful:

        fsstat /export/foo 60

fsstat has a 'full' option that will list all of the vnode operations,
and other options that list just certain types.  It can also watch a
filesystem type (e.g. zfs, nfs) instead of a single mount point.
Very useful.
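
As a rough sketch of those variations (not something run against your
server): the wrapper name watch_fs and its arguments are made up for
illustration, and /export/foo is just the placeholder path from above.
The -n (naming operations such as create and remove) and -f (full)
flags are as I recall them from the Solaris fsstat(1M) man page, so
check them on your release before relying on this.

```shell
#!/bin/sh
# Hypothetical helper: watch_fs is not a real command, only a thin
# wrapper around fsstat for illustration.
#   $1 = mode      (naming | full)
#   $2 = target    (a mount point such as /export/foo, or a type such as zfs)
#   $3 = interval  (seconds between reports, default 60)
watch_fs() {
    mode=$1
    target=$2
    interval=${3:-60}

    case $mode in
        naming)
            # Naming operations only (create, remove, rename, ...).
            fsstat -n "$target" "$interval"
            ;;
        full)
            # Every vnode operation.
            fsstat -f "$target" "$interval"
            ;;
        *)
            echo "usage: watch_fs naming|full path-or-fstype [interval]" >&2
            return 2
            ;;
    esac
}
```

For a create-heavy workload like the one in this thread, something like
"watch_fs naming /export/foo 60" would be the first thing to try, and
"watch_fs full zfs 60" gives the broader picture across all ZFS
filesystems.  Each runs until interrupted.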

Spencer
