Re: 2.6.21-git8+ BUG: NMI Watchdog detected LOCKUP on CPU1

2007-05-08 Thread Andrew Morton
On Tue, 08 May 2007 10:35:14 +0200 Michal Piotrowski [EMAIL PROTECTED] wrote:

 Hi,
 
 / filesystem was full
 
 [39525.46] BUG: NMI Watchdog detected LOCKUP on CPU1, eip 08056990, 
 registers:
 [39525.468000] Modules linked in: loop ipt_MASQUERADE iptable_nat nf_nat 
 autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 
 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT 
 xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal 
 processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus 
 snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
 snd_pcm_oss snd_mixer_oss evdev snd_pcm intel_agp snd_timer snd agpgart 
 soundcore i2c_i801 snd_page_alloc ide_cd cdrom rtc unix
 [39525.518000] CPU:1
 [39525.518000] EIP:0073:[08056990]Not tainted VLI
 [39525.518000] EFLAGS: 0202   (2.6.21-ga989705c #187)
 [39525.529000] EIP is at 0x8056990
 [39525.529000] eax: 6e560d60   ebx: 000b   ecx:    edx: 000dd15e
 [39525.541000] esi:    edi: 6e560220   ebp: bfeb0a58   esp: bfeb0990
 [39525.547000] ds: 007b   es: 007b   fs:   gs: 0033  ss: 007b
 [39525.553000] Process line (pid: 4277, ti=cf20 task=f6f560b0 
 task.ti=cf20)
 [39525.56] Kernel panic - not syncing: Aiee, killing interrupt handler!
 
 http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-console.log
 http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-config
 

I don't know what caused the CPU to jump into hyperspace like that, but Patrick
tells me that this:

 [38773.921000] printk: 15909 messages suppressed.
 [38773.926000] ipt_hook: happy cracking.
 [38778.921000] printk: 16332 messages suppressed.
 [38778.925000] ipt_hook: happy cracking.
 [38783.921000] printk: 16175 messages suppressed.
 [38783.926000] ipt_hook: happy cracking.
 [38788.921000] printk: 16390 messages suppressed.
 [38788.925000] ipt_hook: happy cracking.
 [38793.921000] printk: 16289 messages suppressed.
 [38793.925000] ipt_hook: happy cracking.
 [38798.921000] printk: 16172 messages suppressed.
 [38798.926000] ipt_hook: happy cracking.
 [38803.921000] printk: 15738 messages suppressed.
 [38803.925000] ipt_hook: happy cracking.
 [38808.921000] printk: 14731 messages suppressed.

  happens when a local process sends packets with invalid IP headers
  through raw sockets.

  [ 5225.195000] UDP: short packet: From 37.126.206.54:46544 39671/1182
  to 127.0.0.1:40761

  This seems to indicate something on the local machine (packets are not
  routed to 127.0.0.1) is sending invalid packets, probably with
  incorrectly set up skb pointers.

  I'd suggest to add a WARN_ON(1) in ipt_local_hook().

So can you please add the appropriate WARN_ON?

Whatever happens, that printk should be toned down, shouldn't it?  We
prefer to not let unprivileged apps spam the logs.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.21-git8+ BUG: NMI Watchdog detected LOCKUP on CPU1

2007-05-08 Thread Patrick McHardy
Andrew Morton wrote:
 Whatever happens, that printk should be toned down, shouldn't it?  We
 prefer to not let unprivileged apps spam the logs.


Only priviledged apps can send these packets. I've never seen it in
practice except for one case that was a bug in the network stack, so
I'd prefer to keep it. The original intention was to give the user
a hint why these packets are not affected by iptables rules, so a
more descriptive message would make sense. I'll queue a patch for
this.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.21-git8+ BUG: NMI Watchdog detected LOCKUP on CPU1

2007-05-08 Thread Michal Piotrowski

On 08/05/07, Andrew Morton [EMAIL PROTECTED] wrote:

On Tue, 08 May 2007 10:35:14 +0200 Michal Piotrowski [EMAIL PROTECTED] wrote:

 Hi,

 / filesystem was full

 [39525.46] BUG: NMI Watchdog detected LOCKUP on CPU1, eip 08056990, 
registers:
 [39525.468000] Modules linked in: loop ipt_MASQUERADE iptable_nat nf_nat 
autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state 
nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp 
ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal processor fan 
container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss 
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss evdev snd_pcm 
intel_agp snd_timer snd agpgart soundcore i2c_i801 snd_page_alloc ide_cd cdrom rtc 
unix
 [39525.518000] CPU:1
 [39525.518000] EIP:0073:[08056990]Not tainted VLI
 [39525.518000] EFLAGS: 0202   (2.6.21-ga989705c #187)
 [39525.529000] EIP is at 0x8056990
 [39525.529000] eax: 6e560d60   ebx: 000b   ecx:    edx: 000dd15e
 [39525.541000] esi:    edi: 6e560220   ebp: bfeb0a58   esp: bfeb0990
 [39525.547000] ds: 007b   es: 007b   fs:   gs: 0033  ss: 007b
 [39525.553000] Process line (pid: 4277, ti=cf20 task=f6f560b0 
task.ti=cf20)
 [39525.56] Kernel panic - not syncing: Aiee, killing interrupt handler!

 
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-console.log
 
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-git8/git-config


I don't know what caused the CPU to jump into hyperspace like that, but Patrick
tells me that this:

 [38773.921000] printk: 15909 messages suppressed.
 [38773.926000] ipt_hook: happy cracking.
 [38778.921000] printk: 16332 messages suppressed.
 [38778.925000] ipt_hook: happy cracking.
 [38783.921000] printk: 16175 messages suppressed.
 [38783.926000] ipt_hook: happy cracking.
 [38788.921000] printk: 16390 messages suppressed.
 [38788.925000] ipt_hook: happy cracking.
 [38793.921000] printk: 16289 messages suppressed.
 [38793.925000] ipt_hook: happy cracking.
 [38798.921000] printk: 16172 messages suppressed.
 [38798.926000] ipt_hook: happy cracking.
 [38803.921000] printk: 15738 messages suppressed.
 [38803.925000] ipt_hook: happy cracking.
 [38808.921000] printk: 14731 messages suppressed.

  happens when a local process sends packets with invalid IP headers
  through raw sockets.


Yes, it was an isic session.



  [ 5225.195000] UDP: short packet: From 37.126.206.54:46544 39671/1182
  to 127.0.0.1:40761

  This seems to indicate something on the local machine (packets are not
  routed to 127.0.0.1) is sending invalid packets, probably with
  incorrectly set up skb pointers.

  I'd suggest to add a WARN_ON(1) in ipt_local_hook().

So can you please add the appropriate WARN_ON?

Whatever happens, that printk should be toned down, shouldn't it?  We
prefer to not let unprivileged apps spam the logs.




[39293.925000] ipt_hook: happy cracking.
[39429.024000] printk: 15828 messages suppressed.
[39429.028000] nf_conntrack: table full, dropping packet.
[39430.034000] nf_conntrack: table full, dropping packet.
[39431.039000] nf_conntrack: table full, dropping packet.
[39432.044000] nf_conntrack: table full, dropping packet.
[39444.056000] nf_conntrack: table full, dropping packet.
[39445.061000] nf_conntrack: table full, dropping packet.
[39525.46] BUG: NMI Watchdog detected LOCKUP on CPU1, eip
08056990, registers:

This lockup occurred after an isic test. Hmmm... linus_stress?

FAIL aio_dio_bugs Command
LD_LIBRARY_PATH=/usr/local/autotest/client/deps/libaio/lib/
/usr/local/autotest/client/tests/aio_dio_bugs/src/aio-dio-extend-stat
file failed, rc=32512
GOOD aiostress completed successfully
GOOD bonnie completed successfully
GOOD cpu_hotplug completed successfully
GOOD cyclictest completed successfully
GOOD dbench completed successfully
FAIL disktest running test disktest
--[random error]--
FAIL fs_mark Command ./fs_mark -d /mnt -s 10240 -n 1000 failed, rc=256
GOOD fsfuzzer completed successfully
GOOD fsx completed successfully
FAIL interbench Command
/usr/local/autotest/client/tests/interbench/src/interbench -m 'run
#0'  -c failed, rc=256
GOOD iozone completed successfully
FAIL isic running test job
 Traceback (most recent call last):
   File /usr/local/autotest/client/bin/job.py, line 179, in __runtest
 test.runtest(self, url, tag, args, dargs)
   File /usr/local/autotest/client/bin/test.py, line 195, in runtest
 fork_waitfor(job.resultdir, pid)
   File /usr/local/autotest/client/bin/parallel.py, line 40, in fork_waitfor
 (pid, status) = os.waitpid(pid, 0)
 KeyboardInterrupt
GOOD linus_stress completed successfully

I don't remember what was the next test.

I'll try to find out how to reproduce this lockup. Anyway, IMO it's
not a network related problem.

Regards,
Michal

--
Michal K. K. Piotrowski
Kernel Monkeys
(http://kernel.wikidot.com/start)
-
To unsubscribe from this list: