On Sun, Mar 25, 2012 at 3:55 PM, Richard Elling
<richard.ell...@richardelling.com> wrote:
> On Mar 24, 2012, at 10:29 PM, Aubrey Li wrote:
>
> Hi,
>
> I'm migrating a webserver(apache+php) from RHEL to solaris. During the
> stress testing comparison, I found under the same session number of client
> request, CPU% is ~70% on RHEL while CPU% is full on solaris.
>
> After some investigation, zfs root lock emerges as a major doubtful point.
>
>
> This data does not seem major, less than 1% of total walltime.
> What is the ratio of user/sys?
>  -- richard

Here is the vmstat output.
===========================================================
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s1 s1 s2 s2   in   sy   cs us sy id
 37 0 0 25257648 34744756 26671 40446 0 0 0 0 0 0 0 258 0 54473 833569
100171 67 31 2
 16 0 0 25270872 34745544 26416 40427 0 0 0 0 0 0 0 161 0 53804 850816
100016 68 29 3
 80 0 0 25266716 34731880 27082 41500 0 0 0 0 0 0 0 145 0 56908 901578
98194 70 28 1
 81 0 0 25318348 34763484 26436 40384 0 0 0 0 0 0 0 173 0 59977
1014699 99273 66 33 0
 104 0 0 25359888 34787868 25539 39726 0 0 0 0 0 0 0 122 0 58175
989927 93106 68 32 0
 113 0 0 25359208 34779872 25746 39992 0 0 0 0 0 0 0 161 0 57212
916427 94401 67 33 0
 123 0 0 25409656 34809704 26228 41330 0 0 0 0 0 0 0 176 0 59332
890511 97127 69 31 0
 97 0 0 25419628 34807796 26508 42013 0 0 0 0 0 0 0 148 0 59736 864022
98918 71 29 0
 111 0 0 25468728 34834968 26477 40232 0 0 0 0 0 0 0 159 0 59270
867006 98308 67 33 0
 105 0 0 25501972 34860952 26776 42204 0 0 0 0 0 0 0 163 0 70868
867899 97981 71 29 0
 78 0 0 25526492 34865416 26489 43504 0 0 0 0 0 0 0 157 0 61110 952207
99806 67 33 0
==================================================================

and "mpstat -a" matches the above measure. ~200,000 spins on mutex.
==================================================================
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 35140   0 2380 59742 19476 93056 30906 32919 256336 1104 967806
65  35   0   0  32
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 36044   0 3268 61910 19482 96287 32121 32844 245293 1122 954150
64  36   0   0  32
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 36516   0 3900 61459 19478 95502 31256 35345 221473 1106 932210
66  34   0   0  32
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 35791   0 3544 60751 19371 97982 31158 33920 209179 3115 859644
64  36   0   0  32
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 38317   0 4892 63046 19566 97316 31688 38100 220262 2058 915555
65  35   0   0  32
SET minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl sze
  0 36408   0 3154 60766 19393 94343 31244 35582 248314  811 980967
65  35   0   0  32
===================================================================

Thanks,
-Aubrey
>
>
> Firstly, there are a bunch of mutex events.
> ============================================
> # lockstat -D 10 sleep 2
>
> Adaptive mutex spin: 1862701 events in 3.678 seconds (506499 events/sec)
>
> Count indv cuml rcnt     nsec Lock                   Caller
> -------------------------------------------------------------------------------
> 829064  45%  45% 0.00    33280 0xffffff117624a5d0     rrw_enter_read+0x1b
> 705001  38%  82% 0.00    30983 0xffffff117624a5d0     rrw_exit+0x1d
> 140678   8%  90% 0.00    10546 0xffffff117624a6e0     zfs_zget+0x46
> 37208   2%  92% 0.00     5403 0xffffff114b136840     vn_rele+0x1e
> 33926   2%  94% 0.00     5417 0xffffff114b136840     lookuppnatcred+0xc5
> 27188   1%  95% 0.00     1155 vn_vfslocks_buckets+0xd980
> vn_vfslocks_getlock+0x3b
> 11073   1%  96% 0.00     1639 vn_vfslocks_buckets+0x4600
> vn_vfslocks_getlock+0x3b
> 9321   1%  96% 0.00     1961 0xffffff114b82a680     dnlc_lookup+0x83
> 6929   0%  97% 0.00     1590 0xffffff11573b8f28
> zfs_fastaccesschk_execute+0x6a
> 5746   0%  97% 0.00     5935 0xffffff114b136840     lookuppnvp+0x566
> -------------------------------------------------------------------------------
>
> If we look at the lock caller's stack, the following two are the hotspots.
> ==============================================================
> # lockstat -D 10 -l 0xffffff117624a5d0 -s 15 sleep 2
>
> Adaptive mutex spin: 865609 events in 3.237 seconds (267444 events/sec)
> -------------------------------------------------------------------------------
> Count indv cuml rcnt     nsec Lock                   Caller
> 51215   6%   6% 0.00   100023 0xffffff117624a5d0     rrw_exit+0x1d
>
> nsec ------ Time Distribution ------ count     Stack
>  256 |                               61        zfs_root+0x78
>  512 |@                              1746      fsop_root+0x2d
> 1024 |@@@                            5210      traverse+0x65
> 2048 |@@@@@@@                        13464     kkkk+0x446
> 4096 |@@@@@@@                        13553     lookuppnatcred+0x119
> 8192 |@@@                            5355      lookupnameatcred+0x97
> 16384 |@                              2403      lookupnameat+0x6b
> 32768 |                               1331      cstatat_getvp+0x11e
> 65536 |                               673       cstatat64_32+0x5d
> 131072 |                               961       fstatat64_32+0x4c
> 262144 |                               1259
>      _sys_sysenter_post_swapgs+0x149
> -------------------------------------------------------------------------------
> Count indv cuml rcnt     nsec Lock                   Caller
> 48494   6%  17% 0.00   145263 0xffffff117624a5d0     rrw_enter_read+0x1b
>
> nsec ------ Time Distribution ------ count     Stack
>   256 |                               17        rrw_enter+0x2c
>   512 |                               1120      zfs_root+0x3b
>  1024 |@                              1718      fsop_root+0x2d
>  2048 |@@                             4834      traverse+0x65
>  4096 |@@@@@@@@@@@                    18569     lookuppnvp+0x446
>  8192 |@@@@                           6620      lookuppnatcred+0x119
> 16384 |@                              2929      lookupnameatcred+0x97
> 32768 |@                              1635      lookupnameat+0x6b
> 65536 |                               894       cstatat_getvp+0x11e
> 131072 |                               1249      cstatat64_32+0x5d
> 262144 |@                              1620      fstatat64_32+0x4c
> 524288 |@                              2474
>      _sys_sysenter_post_swapgs+0x149
> ============================================================
>
> The apache root documentation is apparently in zfs dataset. It looks like
> each
> file system operation will run into zfs root mutex contention.
>
> Is this an expected behavior? If so, is there any zfs tunable to
> address this issue?
>
> Thanks,
> -Aubrey
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
>
> --
> DTrace Conference, April 3,
> 2012, http://wiki.smartos.org/display/DOC/dtrace.conf
> ZFS Performance and Training
> richard.ell...@richardelling.com
> +1-760-896-4422
>
>
>
>
>
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to