We have a condition where 4 caches, every 33-34 minutes goes high CPU
utliization (1000%) and prevents incoming connections. This last about 30
seconds and goes away.

It seems to occur with some specific origin/content which I don't exactly
know which one so far.

I'm including some details (in case it's obvious), but how do we profile a
running server to see what's causing this high cpu usage?

The load is normal, maybe about 100Mbps (basically idling).

top - 17:20:33 up 61 days,  3:32,  4 users,  load average: 4.74, 1.20, 0.53
Tasks: 616 total,  16 running, 600 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.6%us, 22.3%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.2%si,
 0.0%st
Mem:  396790288k total, 342530608k used, 54259680k free,   148764k buffers
Swap:  4095996k total,        0k used,  4095996k free,  1618892k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  8703 ats       20   0  282g 275g  39m D 853.9 72.7  12496:54 [ET_NET 0]
    61 root      20   0     0    0    0 R  5.6  0.0   5:10.11 ksoftirqd/14
    77 root      20   0     0    0    0 R  5.0  0.0   5:15.08 ksoftirqd/18
   125 root      20   0     0    0    0 R  5.0  0.0   5:02.48 ksoftirqd/30
    13 root      20   0     0    0    0 R  4.6  0.0   5:23.63 ksoftirqd/2
    37 root      20   0     0    0    0 R  4.6  0.0   5:10.31 ksoftirqd/8
    45 root      20   0     0    0    0 R  4.6  0.0   5:10.36 ksoftirqd/10
    93 root      20   0     0    0    0 R  4.6  0.0   5:29.05 ksoftirqd/22
   109 root      20   0     0    0    0 R  4.6  0.0   5:06.27 ksoftirqd/26
    21 root      20   0     0    0    0 R  4.3  0.0   5:12.82 ksoftirqd/4
    29 root      20   0     0    0    0 R  4.3  0.0   5:10.79 ksoftirqd/6
    85 root      20   0     0    0    0 R  4.3  0.0   5:18.36 ksoftirqd/20
   101 root      20   0     0    0    0 R  4.3  0.0   5:13.54 ksoftirqd/24
     4 root      20   0     0    0    0 R  4.0  0.0   5:16.25 ksoftirqd/0
    69 root      20   0     0    0    0 R  4.0  0.0   5:09.72 ksoftirqd/16
   117 root      20   0     0    0    0 R  3.0  0.0   5:03.74 ksoftirqd/28
    53 root      20   0     0    0    0 S  1.7  0.0   5:07.40 ksoftirqd/12

Here's a "dstat" output of that server when this is occurring... updated
every second.

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   0  98   1   0   0|7940k   10M|6270k   18M|   0     0 |  13k   35k
  0   0  99   0   0   0|9738k 4160k|1600k   10M|   0     0 |  12k   34k
  0   0  99   0   0   0|3232k 4136k|1390k 5724k|   0     0 |  11k   34k
  0   0 100   0   0   0|   0  4096k|  34k  555k|   0     0 |  10k   33k
  0   0  99   0   0   0|1316k 9095k|7028k 7240k|   0     0 |  11k   34k
  0   1  99   0   0   0| 288k 8658k|6189k 5878k|   0     0 |  29k   36k
  0   0  99   1   0   0|  13M 6204k|2870k   17M|   0     0 |  14k   35k
  0   0  99   0   0   0|   0  4096k|3009k 3387k|   0     0 |  14k   35k
  0   0  99   0   0   0|1028k 6384k|5138k 6047k|   0     0 |  14k   35k
  0   0  99   1   0   0|5127k 8192k|4262k 7417k|   0     0 |  13k   34k
  0   0  99   0   0   0|2372k 2612k|2011k 4944k|   0     0 |  14k   35k
  0   0  99   0   0   0|   0  4172k|1633k 2201k|   0     0 |  13k   35k
  0   0  99   0   0   0|  11M  608k|2528k   12M|   0     0 |  13k   35k
  0   3  95   1   0   0|  15M 8409k|  11M   28M|   0     0 |  16k   37k
  0   3  95   1   0   0|6324k 4244k|7959k   20M|   0     0 |  12k   32k
  0   4  96   0   0   0| 256k 2128k| 853k 2823k|   0     0 |  27k   32k
  0   3  96   0   0   0|1028k    0 | 472k  562k|   0     0 |  10k   31k
  0   3  96   0   0   0|1028k   68k|1457k 1454k|   0     0 |  11k   32k
  0   3  97   0   0   0|   0     0 | 316k  317k|   0     0 |  11k   31k
  0   3  97   0   0   0|3076k    0 | 415k 3251k|   0     0 |9498    30k
  0   3  97   0   0   0|2688k   84k|  16k 2985k|   0     0 |8623    30k
  0   3  97   0   0   0|   0     0 |  64k   65k|   0     0 |9121    30k
  0   3  97   0   0   0|   0    68k|1607k 1556k|   0     0 |  11k   31k
  0   3  97   0   0   0|   0     0 |  18k  120k|   0     0 |7669    28k
  0   3  97   0   0   0|   0     0 | 536k  537k|   0     0 |8660    30k
  0   4  96   0   0   0|   0    72k|  65k   75k|   0     0 |  26k   30k
  0   3  96   0   0   0|   0  2104k|3079k 3093k|   0     0 |9267    30k
  0   3  97   0   0   0|   0   108k| 593k  506k|   0     0 |7591    28k
  0   3  96   0   0   0|1152k 4096B| 124k  238k|   0     0 |7914    29k
  0   3  97   0   0   0|   0     0 |5719B 1574B|   0     0 |8884    29k
  0   3  96   0   0   0|   0   224k|  18k   51k|   0     0 |9092    30k
  0   3  97   0   0   0|   0     0 |  87k  108k|   0     0 |8998    29k
  0   3  96   0   0   0|5328k    0 |1594k 6790k|   0     0 |8499    29k
  0   3  97   0   0   0|   0   156k|  44k  203k|   0     0 |8039    29k
  0   3  97   0   0   0|   0     0 |  17k   82k|   0     0 |5377    25k
  0   4  96   0   0   0|   0   252k|3224B 6858B|   0     0 |  28k   31k
  0   3  96   0   0   0|   0     0 |2139B 8535B|   0     0 |  10k   29k
  0   3  97   0   0   0|5508k    0 |  20k 5516k|   0     0 |7762    29k
  0   3  97   0   0   0|   0    72k|  10k   28k|   0     0 |7026    29k
  0   3  97   0   0   0|   0     0 |3782B  712B|   0     0 |5829    29k
  0   4  96   0   0   0|   0   264k|  11k   10k|   0     0 |5170   183k
  0   4  95   0   0   0|   0     0 |  15k 1415B|   0     0 |4125   245k
  0   3  97   0   0   0|   0     0 |5416B  966B|   0     0 |3608    40k
  0   3  97   0   0   0|   0  4096B|6233B 3868B|   0     0 |3636    41k
  0   3  96   0   0   0|   0    64k|5761B 1594B|   0     0 |3579    41k
  2  20  78   0   0   0|   0    80k|  11k 1544B|   0     0 |  16k 3590k
  3  26  71   0   0   0|   0     0 |  13k 9997B|   0     0 |  12k 4969k
  3  26  71   0   0   0|   0     0 |6685B 1042B|   0     0 |  12k 5017k
  3  25  71   0   0   0|   0     0 |  11k 1660B|   0     0 |  12k 4957k
  3  25  72   0   0   0|   0     0 |7765B 4463B|   0     0 |  12k 4862k
  3  25  72   0   0   0|   0   128k|8377B 1478B|   0     0 |  12k 4875k
  3  25  72   0   0   0|   0     0 |4407B  992B|   0     0 |  12k 4814k
  3  25  71   0   0   0|   0     0 |  17k 1959B|   0     0 |  12k 4993k
  3  26  71   0   0   0|   0     0 |7774B 1478B|   0     0 |  12k 4986k
  3  25  72   0   0   0|   0     0 |5728B 1066B|   0     0 |  12k 4921k
  3  25  71   0   0   0|   0   108k|  14k 1816B|   0     0 |  19k 4922k
  3  25  72   0   0   0|   0     0 |6296B 9096B|   0     0 |  12k 4899k
  3  25  72   0   0   0|   0     0 |8974B 1166B|   0     0 |  12k 4914k
  3  26  71   0   0   0|   0     0 |  12k 1038B|   0     0 |  12k 5133k
  3  25  72   0   0   0|   0     0 |9956B 1668B|   0     0 |  12k 4873k
  3  25  72   0   0   0|   0   228k|8025B 2172B|   0     0 |  12k 4841k
  3  25  72   0   0   0|   0     0 |7743B 1728B|   0     0 |  12k 4888k
  3  26  71   0   0   0|   0     0 |8287B 6061B|   0     0 |  12k 4955k
  3  26  70   0   0   0|   0     0 |8702B 1602B|   0     0 |  12k 5064k
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  3  26  71   0   0   0|   0     0 |8898B 1198B|   0     0 |  12k 4983k
  3  25  72   0   0   0|   0   152k|  10k   48k|   0     0 |  17k 4840k
  3  25  72   0   0   0|   0     0 |4030B 8837B|   0     0 |  15k 4900k
  3  25  72   0   0   0|   0     0 |  12k 2260B|   0     0 |  12k 4909k
  3  26  71   0   0   0|   0     0 |  14k 2169B|   0     0 |  12k 5001k
  3  26  70   0   0   0|   0     0 |  12k 1470B|   0     0 |  12k 5047k
  3  26  71   0   0   0|   0   232k|  14k 1964B|   0     0 |  12k 4967k
  3  25  72   0   0   0|   0     0 |  14k 2115B|   0     0 |  12k 4846k
  3  25  72   0   0   0|   0     0 |  15k 2144B|   0     0 |  12k 4945k
  3  25  72   0   0   0|   0     0 |  15k 1544B|   0     0 |  12k 4862k
  3  26  71   0   0   0|   0     0 |6595B 1124B|   0     0 |  12k 4991k
  3  27  70   0   0   0|   0   128k|  39k 7960B|   0     0 |  17k 5065k
  3  26  71   0   0   0|   0     0 |  13k   11k|   0     0 |  12k 4942k
  3  25  72   0   0   0|   0     0 |9035B 2072B|   0     0 |  12k 4881k
  3  25  72   0   0   0|   0     0 |9057B 2788B|   0     0 |  12k 4825k
  1   5  90   4   0   0|  41M   28M|  49M  104M|   0     0 |  30k  848k
  0   1  92   7   0   0|  28M   51M|  46M  166M|   0     0 |  22k   39k
  0   0  97   2   0   0|  19M   20M|  18M   48M|   0     0 |  15k   36k
  0   0  97   2   0   0|  21M   14M|  13M   41M|   0     0 |  15k   36k
  0   0  97   3   0   0|  13M   16M|  21M   33M|   0     0 |  14k   36k
  0   0  96   3   0   0|  14M   20M|  21M   42M|   0     0 |  14k   36k
  0   1  96   3   0   0|  15M   15M|  16M   32M|   0     0 |  24k   36k
  0   0  99   1   0   0|  11M 4351k|3789k   18M|   0     0 |  12k   34k
  0   0  99   0   0   0|7182k    0 |1881k 8056k|   0     0 |  11k   33k
  0   0  99   0   0   0|3164k  100k|2091k 6605k|   0     0 |  11k   35k
  0   0  99   0   0   0|3079k 2647k|6850k 6604k|   0     0 |  11k   33k
  0   0  98   1   0   0|8640k 7195k|6742k   17M|   0     0 |  12k   34k
  0   0  99   1   0   0|   0  4451k|6093k 7067k|   0     0 |  10k   33k
  0   0  99   1   0   0|3908k 4787k|1294k 4556k|   0     0 |  11k   33k
  0   0  99   0   0   0|7460k   44k|1505k 9884k|   0     0 |9809    32k

Thanks,

Steve

Reply via email to