Hi,
I've been having kernel oopses with the 2.4.0-test series and am including ksymoops processed output from both test4 and test5 kernels. The same oops happens in later kernels too (Tested with test6, test7 and test8).
The scenario is this:
I have an incoming UDP stream at 1mbit. The router marks packets in this stream, according to port ranges, with 3 (or any # of) marks (via iptables v1.1.1). iproute2 builds new routing tables based on these marks, and mplsadm, with the tc patch, is called to build LSPs using these routing tables. Finally, the 3 egress LSPs are rate limited using tc (employing cbq classes) to a value less than the ingress rate (ie: I limited each LSP to 200kbit, for an aggregate egress output rate of 600kbit). When I start the traffic flowing from our generator, the box panics and freezes quite solidly. Policing via filters also crashes the box. If I move the egress rate limiting function to another box, it works okay.
I've also noted that the crash only occurs if I throttle the traffic flow to an egress rate which is less than the ingress rate (ie: ingress flow at 1mbit and egress flow at 1mbit works fine. If the egress rate is reduced, boom!)
I copied down the oopses and ran 'ksymoops < oops.txt > oops_proc.txt' and pasted them here. The first is from kernel 2.4.0-test4 and the second from 2.4.0-test5.
NEW: Here's the funny part. In mm/slab.c, the function kmem_cache_grow() contains a check as follows:
/*
* The test for missing atomic flag is performed here, rather than
* the more obvious place, simply to reduce the critical path length
* in kmem_cache_alloc(). If a caller is seriously mis-behaving they
* will eventually be caught here (where it matters).
*/
/* Commented out Sep 15 since it was crashing my router. */
/* if (in_interrupt() && (flags & SLAB_LEVEL_MASK) != SLAB_ATOMIC)
BUG(); */
This is the check that fails and causes the oops. Not understanding what is actually being checked, and not knowing the repercussions of tampering with it, I commented out the check, recompiled and reran the test. I understand that this is not really a fix (it's more akin to just turning my head and pretending that the problem doesn't exist, but... it seems to work.) The result: Great joy and much celebration! I'm throwing 7.2mbps at the box, limiting the rate to 900kbit aggregate throughput and it's working! The numbers I'm getting also seem to jive with anticipated results.
Cheers!
Jon
ksymoops 0.7c on i686 2.4.0-test4. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.0-test4/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01277fd>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 0000001b ebx: c7ffd0c0 ecx: 00000000 edx: 00000082
esi: 00000246 edi: c7ffd0c0 ebp: 00000007 esp: c024fe70
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:0, stackpage=c024f000)
Stack: c01fb794 c01fb834 00000412 c7ffd0c0 00000247 00000007 c024fed4 c7d1602e
c0127aaf c7ffd0c0 00000007 00000000 c7d170e0 c7d1602e c01eb196 00000008
00000007 00000000 c7d170e0 c7d1602e c7f8be00 00000000 c01b6aaf c7d170e0
Call trace: [<c01fb794>][<c01fb834>][<c0127aaf>][<c01eb196>][<c01b6aaf>][<c01b6c6f>][<c01b6a84>]
[<c019b1c4>][<c01b6936>][<c01b6a84>][<c019efe3>][<c011b17f>][<c010b8ee>][<c01087e0>][<c01087e0>]
[<c010a518>][<c01087e0>][<c01087e0>][<c0100018>][<c0108803>][<c0108864>][<c0105000>][<c0100192>]
Code: 0f 0b 83 c4 0c c7 44 24 10 01 00 00 00 89 ee 83 e6 07 b8 03
>>EIP; c01277fd <kmem_cache_grow+69/254> <=====
Trace; c01fb794 <tvecs+1500/14d4c>
Trace; c01fb834 <tvecs+15a0/14d4c>
Trace; c0127aaf <kmalloc+73/ac>
Trace; c01eb196 <mpls_output+12/26c>
Trace; c01b6aaf <ip_rcv_finish+2b/21c>
Trace; c01b6c6f <ip_rcv_finish+1eb/21c>
Trace; c01b6a84 <ip_rcv_finish+0/21c>
Trace; c019b1c4 <nf_hook_slow+7c/b4>
Trace; c01b6936 <ip_rcv+356/38c>
Trace; c01b6a84 <ip_rcv_finish+0/21c>
Trace; c019efe3 <net_rx_action+123/1e8>
Trace; c011b17f <do_softirq+4f/70>
Trace; c010b8ee <do_IRQ+a6/b8>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c010a518 <ret_from_intr+0/20>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c0100018 <startup_32+18/13a>
Trace; c0108803 <default_idle+23/28>
Trace; c0108864 <cpu_idle+3c/50>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100192 <L6+0/2>
Code; c01277fd <kmem_cache_grow+69/254>
00000000 <_EIP>:
Code; c01277fd <kmem_cache_grow+69/254> <=====
0: 0f 0b ud2a <=====
Code; c01277ff <kmem_cache_grow+6b/254>
2: 83 c4 0c add $0xc,%esp
Code; c0127802 <kmem_cache_grow+6e/254>
5: c7 44 24 10 01 00 00 movl $0x1,0x10(%esp,1)
Code; c0127809 <kmem_cache_grow+75/254>
c: 00
Code; c012780a <kmem_cache_grow+76/254>
d: 89 ee mov %ebp,%esi
Code; c012780c <kmem_cache_grow+78/254>
f: 83 e6 07 and $0x7,%esi
Code; c012780f <kmem_cache_grow+7b/254>
12: b8 03 00 00 00 mov $0x3,%eax
Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
1 warning issued. Results may not be reliable.
*********************
** Kernel 2.4.0-test5
ksymoops 0.7c on i686 2.4.0-test5. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.0-test5/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01270ed>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010282
eax: 0000001b ebx: c7ffd0c0 ecx: c02379cc edx: 00000000
esi: 00000007 edi: c7ffd0c0 ebp: c024dee8 esp: c024de80
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:0, stackpage=c024d000)
Stack: c01fa0fa c01fa19a 00000412 c7ffd0c0 00000246 00000007 c024dee8 c6702842
c670282e c012736f c7ffd0c0 00000007 00000000 c7e70140 c670282e c01e96a6
00000008 00000008 00000000 c7e70140 c670282e c7f8be00 00000000 c01b4f67
Call trace: [<c01fa0fa>][<c01fa198>][<c012736f>][<c01e96a6>][<c01b4f67>][<c01b511f>][<c0199a8c>]
[<c01b4de9>][<c01b4f3c>][<c019d833>][<c011ad5f>][<c010b84e>][<c01087e0>][<c01087e0>][<c010a478>]
[<c01087e0>][<c01087e0>][<c0100018>][<c0108803>][<c0108864>][<c0105000>][<c0100192>]
Code: 0f 0b 83 c4 0c c7 44 24 10 01 00 00 00 89 f5 83 e5 07 68 03
>>EIP; c01270ed <kmem_cache_grow+69/228> <=====
Trace; c01fa0fa <tvecs+15fe/156c4>
Trace; c01fa198 <tvecs+169c/156c4>
Trace; c012736f <kmalloc+6f/a8>
Trace; c01e96a6 <mpls_output+12/26c>
Trace; c01b4f67 <ip_rcv_finish+2b/214>
Trace; c01b511f <ip_rcv_finish+1e3/214>
Trace; c0199a8c <nf_hook_slow+c4/f8>
Trace; c01b4de9 <ip_rcv+ed/11c>
Trace; c01b4f3c <ip_rcv_finish+0/214>
Trace; c019d833 <net_rx_action+123/1e8>
Trace; c011ad5f <do_softirq+4f/70>
Trace; c010b84e <do_IRQ+a6/b8>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c010a478 <ret_from_intr+0/20>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c0100018 <startup_32+18/13a>
Trace; c0108803 <default_idle+23/28>
Trace; c0108864 <cpu_idle+3c/50>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100192 <L6+0/2>
Code; c01270ed <kmem_cache_grow+69/228>
00000000 <_EIP>:
Code; c01270ed <kmem_cache_grow+69/228> <=====
0: 0f 0b ud2a <=====
Code; c01270ef <kmem_cache_grow+6b/228>
2: 83 c4 0c add $0xc,%esp
Code; c01270f2 <kmem_cache_grow+6e/228>
5: c7 44 24 10 01 00 00 movl $0x1,0x10(%esp,1)
Code; c01270f9 <kmem_cache_grow+75/228>
c: 00
Code; c01270fa <kmem_cache_grow+76/228>
d: 89 f5 mov %esi,%ebp
Code; c01270fc <kmem_cache_grow+78/228>
f: 83 e5 07 and $0x7,%ebp
Code; c01270ff <kmem_cache_grow+7b/228>
12: 68 03 00 00 00 push $0x3
Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
1 warning issued. Results may not be reliable.