[dtrace-discuss] performance troubleshooting

Anil Wed, 08 Jul 2009 20:20:33 -0700

We have a serious performance problem on our server. Here is some data, starting with ::memstat from mdb:
<pre>
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1133252              4426   31%
Anon                      1956988              7644   53%
Exec and libs               31104               121    1%
Page cache                 332818              1300    9%
Free (cachelist)            77813               303    2%
Free (freelist)            135815               530    4%


Total                     3667790             14327
Physical                  3593201             14035
> 
</pre>

<pre>
sar -u 5 10:
18:06:58    %usr    %sys    %wio   %idle
18:07:03       8      57       0      35
18:07:08       3      22       0      75
18:07:14       3      66       0      31
18:07:19       3      16       0      81
18:07:24       4      52       0      44
18:07:29       3      20       0      77
18:07:34       2      60       0      38
18:07:39       2      39       0      59
18:07:44       2      50       0      48
18:07:49       2      21       0      77

Average        3      40       0      57
</pre>


A lot of CPU time is being spent in the kernel: %sys averages 40% in the sar sample above and peaks at 66%. vmstat shows:


<pre>
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s1 s2 s3 s4   in   sy   cs us sy id
 0 0 0 2593264 373392 182 1013 4 0 0  0  0  0  0  0 23 1483 9112 1862  3  9 88
 0 0 0 2647980 425032 246 1168 2 0 0  0  0  0  0  0 23  896 23589 2229 4 10 86
 0 0 0 2645524 424328 221 1091 3 0 0  0  0  0  0  0 20  872 8795 1870  3  9 88
 0 0 0 2621896 403968 206 969 2 0  0  0  0  0  0  0 23  839 9171 2091  3  9 88
 0 0 0 2601288 382732 217 946 2 0  0  0  0  0  0  0 21  679 98075 1783 3 11 87
 0 0 0 2580244 362876 239 1161 3 0 0  0  0  0  0  0 55 1649 106163 2221 5 13 82
 0 0 0 2651656 420528 225 1181 2 0 0  0  0  0  0  0 16  645 9846 1887  3  9 88
 0 0 0 2697620 428048 268 1449 481 0 0 0 0  0  0  0 35 1339 50362 2453 3 11 85
 3 0 0 2956632 488440 180 712 37 0 0  0  0  0  0  0 78  907 58331 2310 3 13 84
 0 0 0 2643064 382292 339 1884 294 0 0 0 0  0  0  0 45 1893 9649 2282  5 11 84
 0 0 0 2784544 422340 224 1192 8 0 0  0  0  0  0  0 88 1041 112430 4572 7 15 79

(this is when the system bogs down)

 12 0 0 2815300 406156 292 1451 66 0 0 0 0  0  0  0 282 4993 110489 4649 6 24 70
 11 0 0 2596252 370944 304 1910 27 0 0 0 0  0  0  0 223 2404 57232 3445 7 48 45
 12 0 0 2654676 423784 199 1016 10 0 0 0 0  0  0  0 203 1470 9183 3672 3 48 49
 6 0 0 2601900 380100 218 1039 7 0 0  0  0  0  0  0 221 2310 10486 4025 4 41 56
 10 0 0 2649432 407956 332 1484 16 0 0 0 0  0  0  0 198 3757 10921 4291 6 40 53
 8 0 0 2626320 397504 198 1101 14 0 0 0  0  0  0  0 203 1840 10345 3940 5 40 55
 19 0 0 2598156 375780 209 1188 2 0 0 0  0  0  0  0 229 2229 8940 3465 10 48 42
 18 0 0 2643936 423656 176 794 9 0 0  0  0  0  0  0 168 1306 8182 3165 8 39 53
 7 0 0 2711160 474176 248 675 22 0 0  0  0  0  0  0 84 1147 8616 2461  2 23 74
</pre>

<br>
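Notice the run queue (the r column): it is almost always 0 before the slowdown and
sits between 6 and 19 during it. Is there a DTrace script (from the DTraceToolkit
package) that I can use to figure out what is going on?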
<br><br>

mpstat shows:
<pre>
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0 3511  48   97   784  197 1067   32  239  365    0  5814    5  43   0  52
  1 1287  28   43   429    0  901   37  215  314    0  2821    3  40   0  57
  2 2954  54  155  1442 1079 1176   26  241  339    0  4927    4  42   0  54
  3 1364  20  886   167   16  655   32  184  299    0  3939    4  41   0  55
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0 3523  14   46   486  197 1129   50  251  411    0  6895    7  52   0  41
  1 1536   8   31   119    0  922   53  220  375    0  4149    4  51   0  45
  2 3160  11   76  1251 1177 1058   56  239  403    0  5987    5  57   0  38
  3 1592   5   38   102    2  725   50  189  363    0  3929    4  51   0  45
</pre>

and when things *appear* to be good:

<pre>
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  355   0   14   680  202  631    5  146   67    0  2225    2  13   0  85
  1   59   0  804    29    0  593   13  173   48    0  1948    2   3   0  95
  2  455   0   13   648  363  675    7  179   43    0  4473    3   8   0  89
  3   96   0    7   293    2  419    6  165   40    0  2434    2   9   0  89
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  379   0   12   610  202  821    7  174   62    0  1594    4   7   0  89
  1  189   0   23   223    0  646   15  182   49    0  1695    3   7   0  90
  2  322   0  582   565  535  695   10  169   45    0  2477   12  14   0  75
  3  216   0    9   221    2  439   11  168   39    0  1845   12   5   0  83
</pre>

(the idle time is much higher)

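The only thing that stands out to me is the high smtx (spins on mutex locks): roughly 300 to 410 per CPU when the system is slow, versus about 40 to 70 when it appears healthy. Could that be the cause?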
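
To put numbers on the smtx difference between the two mpstat captures above, here is a minimal Python sketch that averages a named column over all CPU rows. The helper name `column_mean` and the variables `BUSY` and `QUIET` are illustrative only; the data is copied from the mpstat output in this message.

```python
# mpstat samples copied from the output above (slow period and good period).
BUSY = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0 3511  48   97   784  197 1067   32  239  365    0  5814    5  43   0  52
  1 1287  28   43   429    0  901   37  215  314    0  2821    3  40   0  57
  2 2954  54  155  1442 1079 1176   26  241  339    0  4927    4  42   0  54
  3 1364  20  886   167   16  655   32  184  299    0  3939    4  41   0  55
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0 3523  14   46   486  197 1129   50  251  411    0  6895    7  52   0  41
  1 1536   8   31   119    0  922   53  220  375    0  4149    4  51   0  45
  2 3160  11   76  1251 1177 1058   56  239  403    0  5987    5  57   0  38
  3 1592   5   38   102    2  725   50  189  363    0  3929    4  51   0  45
"""

QUIET = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  355   0   14   680  202  631    5  146   67    0  2225    2  13   0  85
  1   59   0  804    29    0  593   13  173   48    0  1948    2   3   0  95
  2  455   0   13   648  363  675    7  179   43    0  4473    3   8   0  89
  3   96   0    7   293    2  419    6  165   40    0  2434    2   9   0  89
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  379   0   12   610  202  821    7  174   62    0  1594    4   7   0  89
  1  189   0   23   223    0  646   15  182   49    0  1695    3   7   0  90
  2  322   0  582   565  535  695   10  169   45    0  2477   12  14   0  75
  3  216   0    9   221    2  439   11  168   39    0  1845   12   5   0  83
"""


def column_mean(mpstat_text, column):
    """Average one named mpstat column over every per-CPU row in the text."""
    header = None
    values = []
    for line in mpstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            # Header row: remember the column order, do not count it as data.
            header = fields
            continue
        values.append(int(fields[header.index(column)]))
    return sum(values) / len(values)


if __name__ == "__main__":
    for name in ("smtx", "sys"):
        busy = column_mean(BUSY, name)
        quiet = column_mean(QUIET, name)
        print(f"{name}: slow={busy:.1f} good={quiet:.1f} ratio={busy / quiet:.1f}")
```

This only quantifies the gap between the two captures; it does not say which locks are being contended.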
-- 
This message posted from opensolaris.org
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org