[EMAIL PROTECTED] said:
> difference my tweaks are making.  Basically, the problem users experience
> when the load shoots up is huge latencies.  An ls on a non-cached
> directory, which usually is instantaneous, will take 20, 30, 40 seconds or
> more.  Then when the storage array catches up, things get better.  My
> clients are not happy campers.
> 
> I know, I know, I should have gone with a JBOD setup, but it's too late for
> that in this iteration of this server.  When we set this up, I had the gear
> already, and it's not in my budget to get new stuff right now.

What kind of array are you seeing this problem with?  It sounds very much
like our experience here with a 3-yr-old HDS ATA array.  When the crunch
came here, I didn't know enough DTrace to help, but I threw the following
into crontab to run every five minutes (24x7), and it at least collected
the info I needed to see which LUN/filesystem was tying things up.

Way crude, but effective enough:

  # snapshot CPU, per-device I/O, per-filesystem, and VM stats
  # (20 samples at 2-second intervals from each tool, run back to back)
  /bin/ksh -c "date && mpstat 2 20 && iostat -xn 2 20 \
    && fsstat $(zfs list -H -o mountpoint -t filesystem | egrep '^/') 2 20 \
    && vmstat 2 20" >> /var/tmp/iostats.log 2>&1 </dev/null
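
For what it's worth, the crontab entry to fire that every five minutes
looks something like the sketch below.  Solaris cron has no */5
shorthand, so the minutes get spelled out; the script path here is just
a placeholder, since stashing the ksh command above in a one-line
wrapper saves fighting cron's quoting:

  # every five minutes, 24x7 (Solaris cron lacks the */5 shorthand)
  # /var/tmp/gather-iostats.ksh is hypothetical: a one-line wrapper
  # holding the ksh command above
  0,5,10,15,20,25,30,35,40,45,50,55 * * * * /var/tmp/gather-iostats.ksh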

A quick scan with "egrep" pulls out the trouble spots.  For example, the
following identifies "iostat" lines that showed 90-100% busy:

  egrep '^Sun |^Mon |^Tue |^Wed |^Thu |^Fri |^Sat | 1[0-9][0-9] c6|  9[0-9] c6' \
    /var/tmp/iostats.log
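
The magic numbers there key off iostat -xn's layout: %b is the
second-to-last column, right-aligned, so " 1[0-9][0-9] c6" catches 100%
busy and "  9[0-9] c6" catches 90-99% for devices on our c6 controller
(adjust to taste).  If counting spaces feels fragile, a nawk sketch like
this (untested, same c6 assumption) keys on the field instead:

  # keep the date stamps, plus any device line at >= 90% busy
  # ($10 is %b and $11 the device name in iostat -xn output)
  nawk '/^Sun |^Mon |^Tue |^Wed |^Thu |^Fri |^Sat / ||
      ($10 >= 90 && $11 ~ /^c6/)' /var/tmp/iostats.log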

I know it's not the DTrace you were asking for, but maybe it'll inspire
something more useful than the above shotgun approach.
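
If someone does want to take a DTrace run at it, a rough sketch along
these lines (untested on my side, patterned after the stock io-provider
examples; the 100ms threshold is just a guess to tune) would log every
physical I/O that takes longer than that:

  #!/usr/sbin/dtrace -qs
  /* rough sketch, untested: log physical I/Os slower than 100 ms */

  io:::start
  {
      ts[arg0] = timestamp;
  }

  io:::done
  /ts[arg0] && (timestamp - ts[arg0]) > 100000000/
  {
      printf("%Y %-12s %6d ms\n", walltimestamp,
          args[1]->dev_statname, (timestamp - ts[arg0]) / 1000000);
  }

  io:::done
  {
      ts[arg0] = 0;
  }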

Regards,

Marion

