On Mon, May 31, 2010 at 4:32 PM, Sandon Van Ness <san...@van-ness.com> wrote:
> On 05/31/2010 01:51 PM, Bob Friesenhahn wrote:
>> There are multiple factors at work.  Your OpenSolaris should be new
>> enough to have the fix in which the zfs I/O tasks are run in a
>> scheduling class at lower priority than normal user processes.
>> However, there is also a throttling mechanism for processes which
>> produce data faster than can be consumed by the disks.  This
>> throttling mechanism depends on the amount of RAM available to zfs
>> and the write speed of the I/O channel.  More available RAM results
>> in more write buffering, which results in a larger chunk of data
>> written at the next transaction group write interval.  The maximum
>> size of a transaction group may be configured in /etc/system similar
>> to:
>>
>> * Set ZFS maximum TXG group size to 2684354560
>> set zfs:zfs_write_limit_override = 0xa0000000
>>
>> If the transaction group is smaller, then zfs will need to write
>> more often.  Processes will still be throttled, but the duration of
>> the delay should be smaller due to less data to write in each burst.
>> I think that (with multiple writers) the zfs pool will be
>> "healthier" and less fragmented if you can offer zfs more RAM and
>> accept some stalls during writing.  There are always tradeoffs.
>>
>> Bob
>
> Well, it seems like messing with the txg sync times and related
> tunables did make the transfer smoother, but it didn't actually help
> with speeds: the hangs just happened for a shorter time at a smaller
> interval, and lowering the time between writes actually seemed to
> make things slightly worse.
>
> I think I have come to the conclusion that the problem here is CPU,
> since it only does this with parity raid.  If it were I/O bound I
> would expect the same behavior either way; if anything the non-parity
> raid is heavier on I/O, since it is no longer CPU bottlenecked (a dd
> write test gives me nearly 700 megabytes/sec vs 450 with raidz2).
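(For context, the exact dd invocation isn't quoted above; a sequential
write test along these lines, against a hypothetical pool mounted at
/tank and with compression off so the zeros are really written, is the
usual way to get numbers like those:

  # write ~16 GB of zeros sequentially; throughput = bytes / real time
  ptime dd if=/dev/zero of=/tank/ddtest bs=1024k count=16384
  rm /tank/ddtest

The mount point, file name, and sizes are only placeholders.)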
To see if the CPU is pegged, take a look at the output of:

  mpstat 1
  prstat -mLc 1

If mpstat shows that the idle time reaches 0, or the process's latency
column is more than a few tenths of a percent, you are probably short
on CPU.  It could also be that interrupts are stealing cycles from
rsync.  Placing it in a processor set, with interrupts disabled in
that processor set, may help.

--
Mike Gerdts
http://mgerdts.blogspot.com/
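A minimal sketch of the processor-set idea above, assuming a machine
where CPUs 2 and 3 can be spared and psrset reports the new set as
set 1 (the CPU ids, set id, and rsync paths are only placeholders; all
of these commands need root privileges):

  # carve out a processor set and stop its CPUs from taking interrupts
  psrset -c 2 3        # prints the id of the new set, e.g. 1
  psrset -f 1          # disable interrupt handling for set 1
  # run the transfer inside the set (or bind an existing pid: psrset -b 1 <pid>)
  psrset -e 1 rsync -a /src/ /dst/
  # afterwards, re-enable interrupts and tear the set down
  psrset -n 1
  psrset -d 1

In the mpstat output the column to watch is idl (per-CPU idle time);
in prstat -mLc it is LAT, the percentage of time each thread spent
waiting for a CPU.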