[EMAIL PROTECTED] said:
> difference my tweaks are making. Basically, the problem users experience
> when the load shoots up is huge latencies. An ls on a non-cached
> directory, which is usually instantaneous, will take 20, 30, 40 seconds or
> more. Then, when the storage array catches up, things get better. My
> clients are not happy campers.
>
> I know, I know, I should have gone with a JBOD setup, but it's too late for
> that in this iteration of this server. When we set this up, I had the gear
> already, and it's not in my budget to get new stuff right now.
What kind of array are you seeing this problem with? It sounds very much
like our experience here with a 3-yr-old HDS ATA array. When the crunch
came here, I didn't know enough dtrace to help, but I threw the following
into crontab to run every five minutes (24x7), and it at least collected
the info I needed to see which LUN/filesystem was busying things out.
Way crude, but effective enough:

  /bin/ksh -c "date && mpstat 2 20 && iostat -xn 2 20 \
      && fsstat $(zfs list -H -o mountpoint -t filesystem | egrep '^/') 2 20 \
      && vmstat 2 20" >> /var/tmp/iostats.log 2>&1 </dev/null

A quick scan using "egrep" could pull out trouble spots; e.g., the
following would identify "iostat" lines that showed 90-100% busy:

  egrep '^Sun |^Mon |^Tue |^Wed |^Thu |^Fri |^Sat | 1[0-9][0-9] c6| 9[0-9] c6' \
      /var/tmp/iostats.log

I know it's not the DTrace you were asking for, but maybe it'll inspire
something more useful than the above shotgun approach.

Regards,
Marion

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
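For anyone wanting to reproduce the five-minute collection loop, one way to wire it up might look like the sketch below. This is an assumption, not from the original post: the script path is made up, and the minutes are spelled out explicitly because classic Solaris cron does not understand the `*/5` step syntax. Wrapping the one-liner in a script also sidesteps cron's quoting of the `$(...)` substitution.

```shell
# Hypothetical wrapper script, e.g. saved as /var/tmp/iostats.sh
# (path and name are assumptions, not from the original post):
#!/bin/ksh
{ date && mpstat 2 20 && iostat -xn 2 20 \
    && fsstat $(zfs list -H -o mountpoint -t filesystem | egrep '^/') 2 20 \
    && vmstat 2 20; } >> /var/tmp/iostats.log 2>&1 </dev/null

# Corresponding crontab entry -- Solaris cron has no */5 step syntax,
# so every fifth minute is listed explicitly:
# 0,5,10,15,20,25,30,35,40,45,50,55 * * * * /var/tmp/iostats.sh
```

Running the collectors for 2-second intervals, 20 samples each, means one pass takes a bit under three minutes, which fits comfortably inside the five-minute cron cadence.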
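Going one small step beyond the egrep scan, a sketch like the following could tally how often each device crossed the 90%-busy line, which makes the worst offender jump out. This is my own assumption about the log layout, not Marion's script: it relies on the "iostat -xn" format, where the device name (c6t0d0 and friends) is the last field and the %b column sits just before it.

```shell
# Hypothetical summarizer (not from the original post): count how often
# each cN* device showed >= 90% busy in the collected log.  Assumes the
# "iostat -xn" layout: device name in the last field, %b second-to-last.
LOG=${1:-/var/tmp/iostats.log}
if [ -r "$LOG" ]; then
    awk '$NF ~ /^c[0-9]/ && $(NF-1) >= 90 { busy[$NF]++ }
         END { for (d in busy) printf "%s %d\n", d, busy[d] }' "$LOG" \
        | sort -k2 -rn
fi
```

The output is one line per device with its count of >=90%-busy samples, busiest first, so a LUN that is chronically saturated shows up at the top instead of being buried in thousands of raw iostat lines.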