On 05/31/2010 02:32 PM, Sandon Van Ness wrote:
> Well, it seems like messing with the txg sync times and related tunables
> did make the transfer smoother, but it didn't actually help with speeds;
> it just meant the hangs happened for a shorter time at a smaller
> interval, and actually lowering the time between writes seemed to make
> things slightly worse.
>
> I think I have come to the conclusion that the problem here is CPU, due
> to the fact that it only does this with parity raid. If it were I/O
> bound, I would expect the same behavior either way; if anything, the I/O
> load is heavier on non-parity raid, since it is no longer CPU
> bottlenecked (a dd write test gives me near 700 megabytes/sec vs. 450
> with parity raidz2).
>
> So if I am understanding things correctly, the issue I am seeing should
> already be fixed, but apparently it's not (in my case): CPU usage from
> the ZFS parity calculations is taking precedence over the process doing
> the writing (rsync)?
>
> I am now nearly 100% convinced the issue is CPU based, since I saw the
> same dips even when using mbuffer.
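For reference, the txg sync tuning mentioned above would go through the
usual tunables of that era; a rough sketch (zfs_txg_timeout is the
period-appropriate knob, and the 10-second value is purely illustrative,
not a recommendation):

    # live, via the kernel debugger (0t10 = decimal 10 seconds):
    echo zfs_txg_timeout/W 0t10 | mdb -kw

    # or persistently, in /etc/system (takes effect on reboot):
    set zfs:zfs_txg_timeout = 10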
And here is some top output. The slowdowns occur when the zpool-data
kernel process starts using CPU and rsync gets CPU-starved.

Normal activity shows:

last pid: 22635;  load avg: 2.17, 2.18, 2.16;  up 0+18:04:42   14:53:29
59 processes: 57 sleeping, 1 running, 1 on cpu
CPU states: 54.7% idle, 23.4% user, 21.9% kernel, 0.0% iowait, 0.0% swap
Kernel: 37646 ctxsw, 193 trap, 20914 intr, 45295 syscall
Memory: 4027M phys mem, 190M free mem, 2013M total swap, 2013M free swap

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  1326 root        1  59  -20  383M   44M run    496:47 28.87% rsync
  1322 root        1  59  -20  383M  357M sleep   11:21  0.70% rsync
     3 root        1  60  -20    0K    0K sleep    1:24  0.06% fsflush

When starved:

last pid: 22636;  load avg: 2.16, 2.18, 2.16;  up 0+18:05:16   14:54:03
59 processes: 57 sleeping, 2 on cpu
CPU states: 24.9% idle, 10.5% user, 64.6% kernel, 0.0% iowait, 0.0% swap
Kernel: 17855 ctxsw, 18 trap, 12831 intr, 21090 syscall
Memory: 4027M phys mem, 198M free mem, 2013M total swap, 2013M free swap

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   604 root       39  99  -20    0K    0K cpu/0  316:55 53.36% zpool-data
  1326 root        1  59  -20  383M   44M sleep  497:03 13.49% rsync
  1322 root        1  59  -20  383M  357M sleep   11:21  0.33% rsync
 22635 root        1  59    0 3852K 1912K cpu/1    0:00  0.06% top
     3 root        1  60  -20    0K    0K sleep    1:24  0.06% fsflush

The stall actually lasts less than a second, but the Solaris version of
top doesn't seem to accept sub-second intervals (other than 0) with -s
the way Linux top does (-d .5); otherwise I think zpool-data would be
near 100% CPU during the stall.
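Since top can't sample below one second here, one way to catch those
sub-second stalls would be a quick DTrace one-liner; a rough sketch (the
997 Hz profile rate and 500 ms window are illustrative, not something
verified on this box):

    # sample on-CPU process names at ~997 Hz and dump/reset the counts
    # every half second, finer than top's 1-second minimum interval
    dtrace -n '
      profile-997  { @oncpu[execname] = count(); }
      tick-500ms   { printa(@oncpu); trunc(@oncpu); }'

If zpool-data really is monopolizing the CPUs during the dips, it should
dominate those half-second counts even when top's one-second average
smooths it out.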