Hello.

I'm running into a recurring problem.  I tried to search the list for
some info, but couldn't quite find anything related (there are some
discussions on interrupt storms lately, but none seem to apply).

I'm running FreeBSD-5.x on some old low end boxes, mostly for small
tasks like small websites, email servers, and so on.

Some time ago, on some of the boxes (with similar hardware - AMD Athlon
1.0GHz and 1.4GHz, MSI mainboards with VIA chipsets), I noticed an
unusually high interrupt rate - top says around 10% CPU time at all
times, even when the box is completely idle.  The guilty process,
according to top -S, is :

   27 root     -28 -147     0K    12K RUN     17.9H  8.06%  8.06% swi5: clock sio
 
Since those are production boxes, with custom kernels and all, I left
them alone.

Now I have to set up another machine from old, used hardware, and I
ran into the same problem.  I tried two motherboards with completely
different hardware (Celeron 600 with an Intel chipset versus VIA C3
Samuel 2 with, well, a VIA chipset), and I get the same symptoms,
just much worse :

   27 root     -28 -147     0K    12K WAIT     5:12 23.93% 23.93% swi5: clock sio

uname -a shows :

FreeBSD cragganmore 5.3-STABLE FreeBSD 5.3-STABLE #0: Mon Nov 15 20:33:56 CET 2004     [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386

(The box was upgraded from 5.3-BETAx.  I built a GENERIC kernel to check
whether my custom config was at fault, but no such luck.  Everything was
recompiled with no special tunables - the only line of interest in
make.conf is 'CPUTYPE?=i586'.)

After a few quick tests, it seems that the machine boots cleanly (no
such load), but the problem appears under any kind of load : to stress
it, I ran a make -j8 buildworld, and it took just a few minutes to
show up.

Of course, once it begins, even if I leave the machine alone, the load
stays the same.  Some samples :

1) During the build :

last pid: 12394;  load averages:  7.65,  5.21,  2.54      up 0+00:07:42  10:28:45
105 processes: 10 running, 71 sleeping, 24 waiting
CPU states: 49.2% user,  0.0% nice, 25.4% system, 25.4% interrupt,  0.0% idle
Mem: 16M Active, 36M Inact, 35M Wired, 12K Cache, 59M Buf, 398M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
   27 root     -28 -147     0K    12K WAIT     0:32 24.02% 24.02% swi5: clock sio
    9 root     171   52     0K    12K RUN      0:09  0.68%  0.68% pagezero

2) Just after I hit ctrl-C :

last pid: 12668;  load averages:  3.64,  4.56,  2.46      up 0+00:08:37  10:29:40
73 processes:  2 running, 47 sleeping, 24 waiting
CPU states:  0.0% user,  0.0% nice,  0.4% system, 24.5% interrupt, 75.1% idle
Mem: 9684K Active, 36M Inact, 35M Wired, 12K Cache, 59M Buf, 405M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
   11 root     132    0     0K    12K RUN      1:37 65.28% 65.28% idle
   27 root     -28 -147     0K    12K WAIT     0:45 22.71% 22.71% swi5: clock sio

3) Half an hour later :

last pid: 12737;  load averages:  0.00,  0.02,  0.40      up 0+00:33:38  10:54:41
73 processes:  2 running, 47 sleeping, 24 waiting
CPU states:  0.8% user,  0.0% nice,  0.0% system, 25.6% interrupt, 73.6% idle
Mem: 9768K Active, 37M Inact, 35M Wired, 12K Cache, 59M Buf, 403M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
   11 root     107    0     0K    12K RUN     20:29 75.54% 75.54% idle
   27 root     -28 -147     0K    12K WAIT     6:47 23.05% 23.05% swi5: clock sio

Strangely, the load average also seems to fall much more slowly than
expected (the first number took more than one minute to go from 3.5 to
0.0).
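
For comparison, here is a rough sketch of how a classic exponentially
damped 1-minute load average would fall once the run queue is empty.
The 5-second sampling interval, the 60-second time constant and the
function names are assumptions made for illustration; they are not
taken from this kernel's source.

```python
import math

SAMPLE_INTERVAL = 5.0   # seconds between samples (assumed)
TIME_CONSTANT = 60.0    # time constant of the 1-minute average (assumed)
FACTOR = math.exp(-SAMPLE_INTERVAL / TIME_CONSTANT)


def decay_load(start, seconds, runnable=0.0):
    """Damped load average after `seconds` with a constant run queue length."""
    load = start
    for _ in range(int(seconds // SAMPLE_INTERVAL)):
        # Each sample blends the old average with the current run queue.
        load = load * FACTOR + runnable * (1.0 - FACTOR)
    return load


def seconds_until_below(start, threshold, runnable=0.0):
    """Seconds until the damped average first drops below `threshold`."""
    if runnable >= threshold:
        raise ValueError("the average can never drop below the threshold")
    load, elapsed = start, 0.0
    while load >= threshold:
        load = load * FACTOR + runnable * (1.0 - FACTOR)
        elapsed += SAMPLE_INTERVAL
    return elapsed
```

Under those assumptions, decay_load(3.64, 60) gives about 1.34, and
seconds_until_below(3.64, 0.005) gives 400 seconds before top would
display 0.00, so a fall that takes several minutes may be normal.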

On the other hand, vmstat -i doesn't show anything abnormal :

interrupt                          total       rate
irq0: clk                         349094         99
irq1: atkbd0                           2          0
irq8: rtc                         446819        127
irq11: rl0 uhci0+                  10318          2
irq13: npx0                            2          0
irq14: ata0                         9015          2
irq15: ata1                           48          0
Total                             815298        233
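
The rate column of vmstat -i is an average over the whole uptime, so a
burst that started after boot could be diluted in it.  A minimal sketch
to compare two snapshots instead, assuming the three-column layout shown
above; the function names are made up for illustration, and
sample_rates() assumes vmstat is in the PATH.

```python
import subprocess
import time


def parse_vmstat_i(text):
    """Map each interrupt source name to its total count."""
    totals = {}
    for line in text.splitlines():
        fields = line.rsplit(None, 2)   # the name itself may contain spaces
        if len(fields) != 3 or not fields[1].isdigit():
            continue                    # skips the header and blank lines
        totals[fields[0].strip()] = int(fields[1])
    return totals


def interval_rates(before, after, seconds):
    """Interrupts per second for each source between two snapshots."""
    return {
        name: (after[name] - before.get(name, 0)) / seconds
        for name in after
    }


def sample_rates(seconds=10):
    """Run vmstat -i twice, `seconds` apart, and return the interval rates."""
    def snapshot():
        result = subprocess.run(["vmstat", "-i"], capture_output=True,
                                text=True, check=True)
        return parse_vmstat_i(result.stdout)

    before = snapshot()
    time.sleep(seconds)
    return interval_rates(before, snapshot(), seconds)
```

Calling sample_rates(10) while the load is present should show whether
any source fires much faster than its since-boot average suggests.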

Strangely enough, none of these boxes has any kind of device attached
to the COM ports (which are driven by sio, right ?).

(BTW, on any kernel, I never had any "interrupt storm" messages - maybe
10-25% CPU is too low for that ? :) )

Well, that's it.  I don't know what else I can do to provide more
information, but it's a test box, so I can break it at will.  You can
find a dmesg output from a verbose boot at :

http://www.lacave.net/~fred/dmesg.boot

Fred
-- 
Sysadmins can't be sued for malpractice, but surgeons don't have to
deal with patients who install new versions of their own innards.
