[1.] One line summary of the problem:

Oops killing soft interrupt in networking subsystem on plain 2.4.5 kernel.

[2.] Full description of the problem/report:

I recently updated my router (486, 100 MHz, 12 MB RAM) to kernel 
2.4.5. The router has one Ethernet interface and an ISDN dialup link with a 
dynamic IP address (hence masquerading). I'm using the new netfilter code for 
firewalling, masquerading, etc.
This all works fine most of the time, but occasionally the kernel locks up 
after an oops in an interrupt handler. Under heavy load this happens roughly 
every 5 hours, sometimes earlier and sometimes later.

[3.] Keywords (i.e., modules, networking, kernel):

networking, netfilter, iptables, masquerading, free_pages

[4.] Kernel version (from /proc/version):

Linux version 2.4.5 (root@zen) (gcc version 2.95.2.1 19991024 (release)) #3 Sun Jun 24 17:23:28 CEST 2001

[5.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/oops-tracing.txt)

ksymoops 2.4.1 on i486 2.4.5.  Options used
     -v /usr/src/linux/vmlinux (specified)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.5/ (default)
     -m /usr/src/linux/System.map (specified)

Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) not found in vmlinux.  Ignoring ksyms_base entry
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0126ae2>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 702e6e6f ebx: 00000000 ecx: 702e6e6f edx: 00000000
esi: c06b1700 edi: c0fdb270 ebp: 00000050 esp: c023dde4
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c023d000)
Stack: c01a530d fffff400 c06b1700 c01a58f3 c06b1700 c06b1700 c0098a00 c09150b0
       c0098a00 fffffffa c01a834d c06b1700 00000002 c0045f10 c00b1700 c01ab4f3
       c06b1700 c06b1700 00000000 00000004 c01b40e0 c01b415d c06b1700 00000001
Call Trace: [<c01a530d>] [<c01a58f3>] [<c01a834d>] [<c01ab4f3>] [<c01b40e0>] [<c01b415d>] [<c01ac467>]
            [<c01b17b0>] [<c01b4062>] [<c01b40e0>] [<c01b17fa>] [<c01ac467>] [<c01b05dc>] [<c01b1754>] [<c01b17b0>]
            [<c01b09e0>] [<c01b0b69>] [<c01b09e0>] [<c01ac467>] [<c01b0826>] [<c01b09e0>] [<c01a88cd>] [<c01141af>]
            [<c01080b1>] [<c0105140>] [<c0106c60>] [<c0105140>] [<c0105163>] [<c01051c8>] [<c0105000>] [<c0100197>]
Code: 8b 41 18 85 c0 7c 11 ff 49 14 0f 94 c0 84 c0 74 07 89 c8 e8

>>EIP; c0126ae2 <__free_pages+2/20>   <=====
Trace; c01a530d <skb_release_data+3d/70>
Trace; c01a58f3 <skb_linearize+d3/140>
Trace; c01a834d <dev_queue_xmit+6d/1f0>
Trace; c01ab4f3 <neigh_resolve_output+113/190>
Trace; c01b40e0 <ip_finish_output2+0/c0>
Trace; c01b415d <ip_finish_output2+7d/c0>
Trace; c01ac467 <nf_hook_slow+e7/130>
Trace; c01b17b0 <ip_forward_finish+0/50>
Trace; c01b4062 <ip_finish_output+92/f0>
Trace; c01b40e0 <ip_finish_output2+0/c0>
Trace; c01b17fa <ip_forward_finish+4a/50>
Trace; c01ac467 <nf_hook_slow+e7/130>
Trace; c01b05dc <ip_rcv+ec/370>
Trace; c01b1754 <ip_forward+1a4/200>
Trace; c01b17b0 <ip_forward_finish+0/50>
Trace; c01b09e0 <ip_rcv_finish+0/1c0>
Trace; c01b0b69 <ip_rcv_finish+189/1c0>
Trace; c01b09e0 <ip_rcv_finish+0/1c0>
Trace; c01ac467 <nf_hook_slow+e7/130>
Trace; c01b0826 <ip_rcv+336/370>
Trace; c01b09e0 <ip_rcv_finish+0/1c0>
Trace; c01a88cd <net_rx_action+13d/220>
Trace; c01141af <do_softirq+3f/70>
Trace; c01080b1 <do_IRQ+a1/b0>
Trace; c0105140 <default_idle+0/30>
Trace; c0106c60 <ret_from_intr+0/20>
Trace; c0105140 <default_idle+0/30>
Trace; c0105163 <default_idle+23/30>
Trace; c01051c8 <cpu_idle+38/50>
Trace; c0105000 <prepare_namespace+0/10>
Trace; c0100197 <L6+0/2>
Code;  c0126ae2 <__free_pages+2/20>
00000000 <_EIP>:
Code;  c0126ae2 <__free_pages+2/20>   <=====
   0:   8b 41 18                  mov    0x18(%ecx),%eax   <=====
Code;  c0126ae5 <__free_pages+5/20>
   3:   85 c0                     test   %eax,%eax
Code;  c0126ae7 <__free_pages+7/20>
   5:   7c 11                     jl     18 <_EIP+0x18> c0126afa <__free_pages+1a/20>
Code;  c0126ae9 <__free_pages+9/20>
   7:   ff 49 14                  decl   0x14(%ecx)
Code;  c0126aec <__free_pages+c/20>
   a:   0f 94 c0                  sete   %al
Code;  c0126aef <__free_pages+f/20>
   d:   84 c0                     test   %al,%al
Code;  c0126af1 <__free_pages+11/20>
   f:   74 07                     je     18 <_EIP+0x18> c0126afa <__free_pages+1a/20>
Code;  c0126af3 <__free_pages+13/20>
  11:   89 c8                     mov    %ecx,%eax
Code;  c0126af5 <__free_pages+15/20>
  13:   e8 00 00 00 00            call   18 <_EIP+0x18> c0126afa <__free_pages+1a/20>
 
Kernel panic: Aiee, killing interrupt handler!
 
1 warning issued.  Results may not be reliable.

[6.] A small shell script or example program which triggers the
     problem (if possible)

I couldn't figure out what exactly (malformed packet, ... ?) causes the crash.

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)

Linux zen 2.4.5 #3 Sun Jun 24 17:23:28 CEST 2001 i486 unknown
 
Gnu C                  2.95.2.1
Gnu make               3.79.1
binutils               2.10.1
util-linux             2.10r
mount                  2.10r
modutils               2.4.0
e2fsprogs              1.19
isdn4k-utils           3.1pre1
Linux C Library        2.2.1
Dynamic linker (ldd)   2.2.1
Procps                 2.0.7
Net-tools              1.57
Console-tools          0.2.3
Sh-utils               2.0
Modules Loaded         hisax isdn slhc ne2k-pci 8390

[7.2.] Processor information (from /proc/cpuinfo):

processor       : 0
vendor_id       : unknown
cpu family      : 4
model           : 0
model name      : 486
stepping        : unknown
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : no
cpuid level     : -1
wp              : yes
flags           :
bogomips        : 49.66

[7.3.] Module information (from /proc/modules):

hisax                 134224   3
isdn                   91568   4 [hisax]
slhc                    4832   1 [isdn]
ne2k-pci                4576   1
8390                    6288   0 [ne2k-pci]

[7.4.] SCSI information (from /proc/scsi/scsi)

no scsi devices

[7.5.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):

The output from ifconfig:
eth0      Link encap:Ethernet  HWaddr 00:50:BA:E0:CC:C8
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1499 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1282 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x6000
 
ippp0     Link encap:Point-to-Point Protocol
          inet addr:217.87.184.242  P-t-P:217.5.114.45  Mask:255.255.255.0
          UP POINTOPOINT RUNNING NOARP  MTU:1500  Metric:1
          RX packets:1116 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1249 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:30
 
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

[X.] Other notes, patches, fixes, workarounds:

My iptables configuration:
The nat table has a POSTROUTING MASQUERADE rule.

The filter table only filters incoming packets: everything except ESTABLISHED 
connections and some TCP SYN packets is DROPped after being logged (with a 
limit of 3 hits per minute).
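
For illustration, a ruleset matching the description above might look roughly 
like the following sketch. It is not a copy of the rules on the router: the 
choice of the INPUT chain, the match on ippp0, the SYN port, the log prefix 
and the helper names (IPT, EXT_IF, SYN_PORT, apply_rules) are all assumptions. 
By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: NOT the exact rules from the router.
# Dry run by default (prints the commands). Set IPT=iptables to apply them (needs root).
IPT="${IPT:-echo iptables}"
EXT_IF="ippp0"     # ISDN interface, as shown in the ifconfig output
SYN_PORT="22"      # hypothetical: stands in for "some TCP SYN packets"

apply_rules() {
    # nat table: masquerade everything leaving through the dialup link
    $IPT -t nat -A POSTROUTING -o "$EXT_IF" -j MASQUERADE

    # filter table: let established connections and selected SYNs through
    $IPT -A INPUT -i "$EXT_IF" -m state --state ESTABLISHED -j ACCEPT
    $IPT -A INPUT -i "$EXT_IF" -p tcp --syn --dport "$SYN_PORT" -j ACCEPT

    # log at most 3 packets per minute, then drop everything else
    $IPT -A INPUT -i "$EXT_IF" -m limit --limit 3/minute -j LOG --log-prefix "DROP: "
    $IPT -A INPUT -i "$EXT_IF" -j DROP
}

apply_rules
```

Running it with IPT=iptables as root would install the rules instead of 
printing them.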

I hope this helps squash one more nasty bug...

cu,
Nicolai Haehnle
