We have a serious performance problem on our server. Here is some data:
<pre>
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1133252              4426   31%
Anon                      1956988              7644   53%
Exec and libs               31104               121    1%
Page cache                 332818              1300    9%
Free (cachelist)            77813               303    2%
Free (freelist)            135815               530    4%

Total                     3667790             14327
Physical                  3593201             14035
>
</pre>
<pre>
sar -u 5 10:

18:06:58    %usr    %sys    %wio   %idle
18:07:03       8      57       0      35
18:07:08       3      22       0      75
18:07:14       3      66       0      31
18:07:19       3      16       0      81
18:07:24       4      52       0      44
18:07:29       3      20       0      77
18:07:34       2      60       0      38
18:07:39       2      39       0      59
18:07:44       2      50       0      48
18:07:49       2      21       0      77

Average        3      40       0      57
</pre>
A lot of system time is eating up the CPU. Using vmstat shows:
<pre>
 kthr      memory              page                 disk          faults       cpu
 r b w    swap   free  re   mf  pi po fr de sr s1 s2 s3  s4   in     sy   cs us sy id
 0 0 0 2593264 373392 182 1013   4  0  0  0  0  0  0  0  23 1483   9112 1862  3  9 88
 0 0 0 2647980 425032 246 1168   2  0  0  0  0  0  0  0  23  896  23589 2229  4 10 86
 0 0 0 2645524 424328 221 1091   3  0  0  0  0  0  0  0  20  872   8795 1870  3  9 88
 0 0 0 2621896 403968 206  969   2  0  0  0  0  0  0  0  23  839   9171 2091  3  9 88
 0 0 0 2601288 382732 217  946   2  0  0  0  0  0  0  0  21  679  98075 1783  3 11 87
 0 0 0 2580244 362876 239 1161   3  0  0  0  0  0  0  0  55 1649 106163 2221  5 13 82
 0 0 0 2651656 420528 225 1181   2  0  0  0  0  0  0  0  16  645   9846 1887  3  9 88
 0 0 0 2697620 428048 268 1449 481  0  0  0  0  0  0  0  35 1339  50362 2453  3 11 85
 3 0 0 2956632 488440 180  712  37  0  0  0  0  0  0  0  78  907  58331 2310  3 13 84
 0 0 0 2643064 382292 339 1884 294  0  0  0  0  0  0  0  45 1893   9649 2282  5 11 84
 0 0 0 2784544 422340 224 1192   8  0  0  0  0  0  0  0  88 1041 112430 4572  7 15 79
(this is when system bogs down)
12 0 0 2815300 406156 292 1451  66  0  0  0  0  0  0  0 282 4993 110489 4649  6 24 70
11 0 0 2596252 370944 304 1910  27  0  0  0  0  0  0  0 223 2404  57232 3445  7 48 45
12 0 0 2654676 423784 199 1016  10  0  0  0  0  0  0  0 203 1470   9183 3672  3 48 49
 6 0 0 2601900 380100 218 1039   7  0  0  0  0  0  0  0 221 2310  10486 4025  4 41 56
10 0 0 2649432 407956 332 1484  16  0  0  0  0  0  0  0 198 3757  10921 4291  6 40 53
 8 0 0 2626320 397504 198 1101  14  0  0  0  0  0  0  0 203 1840  10345 3940  5 40 55
19 0 0 2598156 375780 209 1188   2  0  0  0  0  0  0  0 229 2229   8940 3465 10 48 42
18 0 0 2643936 423656 176  794   9  0  0  0  0  0  0  0 168 1306   8182 3165  8 39 53
 7 0 0 2711160 474176 248  675  22  0  0  0  0  0  0  0  84 1147   8616 2461  2 23 74
</pre>
<br>
Notice the run queue. Is there a DTrace script (from the DTT package) that I can use to figure out what is going on?
<br><br>
mpstat shows:
<pre>
CPU  minf  mjf  xcal  intr  ithr   csw  icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
  0  3511   48    97   784   197  1067    32   239   365    0   5814    5   43   0   52
  1  1287   28    43   429     0   901    37   215   314    0   2821    3   40   0   57
  2  2954   54   155  1442  1079  1176    26   241   339    0   4927    4   42   0   54
  3  1364   20   886   167    16   655    32   184   299    0   3939    4   41   0   55
CPU  minf  mjf  xcal  intr  ithr   csw  icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
  0  3523   14    46   486   197  1129    50   251   411    0   6895    7   52   0   41
  1  1536    8    31   119     0   922    53   220   375    0   4149    4   51   0   45
  2  3160   11    76  1251  1177  1058    56   239   403    0   5987    5   57   0   38
  3  1592    5    38   102     2   725    50   189   363    0   3929    4   51   0   45
</pre>
and when things *appear* to be good:
<pre>
CPU  minf  mjf  xcal  intr  ithr   csw  icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
  0   355    0    14   680   202   631     5   146    67    0   2225    2   13   0   85
  1    59    0   804    29     0   593    13   173    48    0   1948    2    3   0   95
  2   455    0    13   648   363   675     7   179    43    0   4473    3    8   0   89
  3    96    0     7   293     2   419     6   165    40    0   2434    2    9   0   89
CPU  minf  mjf  xcal  intr  ithr   csw  icsw  migr  smtx  srw  syscl  usr  sys  wt  idl
  0   379    0    12   610   202   821     7   174    62    0   1594    4    7   0   89
  1   189    0    23   223     0   646    15   182    49    0   1695    3    7   0   90
  2   322    0   582   565   535   695    10   169    45    0   2477   12   14   0   75
  3   216    0     9   221     2   439    11   168    39    0   1845   12    5   0   83
</pre>
(the idle time is much higher)

The only thing I see is a high smtx?

--
This message posted from opensolaris.org
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org