If you're chasing CPU utilization, specifically %sys (time in the kernel),
I would start with a time-based kernel profile.

# dtrace -n 'profile-997hz /arg0/ { @[stack()] = count(); }
tick-60sec { trunc(@, 20); printa(@); exit(0); }'

I would be curious to see where the CPU cycles are being consumed first,
before going down the lock path…

This assumes that most or all of the CPU utilization is %sys. If it's %usr, we
take a different approach.
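
If it's not obvious which bucket dominates, the usr/sys split is visible in
the stock stat tools (standard Solaris utilities; column names per their man
pages):

# mpstat 1        <- per-CPU usr/sys/idl columns
# vmstat 1        <- system-wide us/sy/id under the "cpu" heading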

Thanks
/jim

On Mar 25, 2012, at 1:29 AM, Aubrey Li wrote:

> Hi,
> 
> I'm migrating a webserver (apache+php) from RHEL to Solaris. During a
> stress-testing comparison, I found that with the same number of client
> sessions, CPU utilization is ~70% on RHEL while the CPU is fully saturated
> on Solaris.
> 
> After some investigation, the zfs root lock emerges as the prime suspect.
> 
> Firstly, there are a bunch of mutex events.
> ============================================
> # lockstat -D 10 sleep 2
> 
> Adaptive mutex spin: 1862701 events in 3.678 seconds (506499 events/sec)
> 
> Count indv cuml rcnt     nsec Lock                   Caller
> -------------------------------------------------------------------------------
> 829064  45%  45% 0.00    33280 0xffffff117624a5d0     rrw_enter_read+0x1b
> 705001  38%  82% 0.00    30983 0xffffff117624a5d0     rrw_exit+0x1d
> 140678   8%  90% 0.00    10546 0xffffff117624a6e0     zfs_zget+0x46
> 37208   2%  92% 0.00     5403 0xffffff114b136840     vn_rele+0x1e
> 33926   2%  94% 0.00     5417 0xffffff114b136840     lookuppnatcred+0xc5
> 27188   1%  95% 0.00     1155 vn_vfslocks_buckets+0xd980 vn_vfslocks_getlock+0x3b
> 11073   1%  96% 0.00     1639 vn_vfslocks_buckets+0x4600 vn_vfslocks_getlock+0x3b
> 9321   1%  96% 0.00     1961 0xffffff114b82a680     dnlc_lookup+0x83
> 6929   0%  97% 0.00     1590 0xffffff11573b8f28     zfs_fastaccesschk_execute+0x6a
> 5746   0%  97% 0.00     5935 0xffffff114b136840     lookuppnvp+0x566
> -------------------------------------------------------------------------------
> 
> If we look at the lock callers' stacks, the following two are the hotspots.
> ==============================================================
> # lockstat -D 10 -l 0xffffff117624a5d0 -s 15 sleep 2
> 
> Adaptive mutex spin: 865609 events in 3.237 seconds (267444 events/sec)
> -------------------------------------------------------------------------------
> Count indv cuml rcnt     nsec Lock                   Caller
> 51215   6%   6% 0.00   100023 0xffffff117624a5d0     rrw_exit+0x1d
> 
> nsec ------ Time Distribution ------ count     Stack
>  256 |                               61        zfs_root+0x78
>  512 |@                              1746      fsop_root+0x2d
> 1024 |@@@                            5210      traverse+0x65
> 2048 |@@@@@@@                        13464     lookuppnvp+0x446
> 4096 |@@@@@@@                        13553     lookuppnatcred+0x119
> 8192 |@@@                            5355      lookupnameatcred+0x97
> 16384 |@                              2403      lookupnameat+0x6b
> 32768 |                               1331      cstatat_getvp+0x11e
> 65536 |                               673       cstatat64_32+0x5d
> 131072 |                               961       fstatat64_32+0x4c
> 262144 |                               1259      _sys_sysenter_post_swapgs+0x149
> -------------------------------------------------------------------------------
> Count indv cuml rcnt     nsec Lock                   Caller
> 48494   6%  17% 0.00   145263 0xffffff117624a5d0     rrw_enter_read+0x1b
> 
> nsec ------ Time Distribution ------ count     Stack
>   256 |                               17        rrw_enter+0x2c
>   512 |                               1120      zfs_root+0x3b
>  1024 |@                              1718      fsop_root+0x2d
>  2048 |@@                             4834      traverse+0x65
>  4096 |@@@@@@@@@@@                    18569     lookuppnvp+0x446
>  8192 |@@@@                           6620      lookuppnatcred+0x119
> 16384 |@                              2929      lookupnameatcred+0x97
> 32768 |@                              1635      lookupnameat+0x6b
> 65536 |                               894       cstatat_getvp+0x11e
> 131072 |                               1249      cstatat64_32+0x5d
> 262144 |@                              1620      fstatat64_32+0x4c
> 524288 |@                              2474      _sys_sysenter_post_swapgs+0x149
> ============================================================
> 
> The apache document root is apparently in a zfs dataset. It looks like every
> file system operation runs into zfs root mutex contention.
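> 
> An fbt-based one-liner along these lines could confirm which code paths are
> reaching zfs_root (a sketch only — it assumes the fbt provider exposes
> zfs_root on this build):
> 
> # dtrace -n 'fbt::zfs_root:entry { @[stack(10)] = count(); }
> tick-10sec { trunc(@, 10); printa(@); exit(0); }'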
> 
> Is this expected behavior? If so, is there a zfs tunable to address this
> issue?
> 
> Thanks,
> -Aubrey
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

