RE: Kernel oops in mm/slab.c [ kmem_cache_grow() ] with test4-8
> -----Original Message-----
> From: Andrew Morton [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 15, 2000 11:47 PM
> To: Earle, Jonathan [KAN:1A31:EXCH]
> Cc: Linux MPLS List (E-mail); Linux Kernel List (E-mail)
> Subject: Re: Kernel oops in mm/slab.c [ kmem_cache_grow() ] with test4-8
>
> Jonathan Earle wrote:
> >
> > Hi,
> >
> > I've been having kernel oopses with the 2.4.0-test series and am
> > including ksymoops processed output from both test4 and test5
> > kernels.  The same oops happens in later kernels too (Tested with
> > test6, test7 and test8).
> >
>
> Presumably mpls_output() is doing a kmalloc(..., GFP_KERNEL) from within
> a softirq.  Hunt that down and turn it into GFP_ATOMIC.

Okay... Did that (turned all the GFP_KERNEL references in net/mpls to GFP_ATOMIC), and the problem seems to have gone away.  I'll post a more confident summary when I'm more sure that things are working properly.

Now, what did I do (aside from fixing the problem) by changing that reference?

Many thanks for the hint!!

Jon
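As a rough illustration of what that change means: GFP_KERNEL tells the allocator it may sleep while memory is reclaimed, which is not allowed in softirq context, whereas GFP_ATOMIC tells it never to sleep, at the cost that the allocation can fail and the caller has to cope with NULL. The user-space sketch below only models that trade-off. Every name in it (toy_pool, toy_alloc, toy_output, MODE_MAY_SLEEP, MODE_ATOMIC) is hypothetical and is not kernel or MPLS code, and it ignores details such as the reserves an atomic allocation may draw on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_KERNEL (may sleep) and GFP_ATOMIC (never sleeps). */
enum alloc_mode { MODE_MAY_SLEEP, MODE_ATOMIC };

/* Toy allocator state; nothing here mirrors real kernel structures. */
struct toy_pool {
    bool memory_available; /* is a free buffer available right now? */
    bool can_reclaim;      /* would waiting for reclaim free some memory? */
    int  sleeps;           /* how many times a caller had to sleep */
};

/*
 * MODE_MAY_SLEEP: when memory is short, "sleep" (counted in pool->sleeps)
 * until reclaim frees some, then allocate.
 * MODE_ATOMIC: never sleeps; when memory is short, fail at once with NULL.
 */
void *toy_alloc(struct toy_pool *pool, size_t size, enum alloc_mode mode)
{
    assert(pool != NULL && size > 0);

    if (!pool->memory_available) {
        if (mode == MODE_ATOMIC)
            return NULL;            /* cannot wait, so report failure */
        pool->sleeps++;             /* only legal where sleeping is allowed */
        if (!pool->can_reclaim)
            return NULL;
        pool->memory_available = true;
    }
    return malloc(size);
}

/*
 * Hypothetical packet-output path running in softirq context: it must use
 * the atomic mode and must tolerate failure, here by dropping the packet.
 * Returns 0 on success, -1 when the packet is dropped.
 */
int toy_output(struct toy_pool *pool, size_t len)
{
    void *buf = toy_alloc(pool, len, MODE_ATOMIC);

    if (buf == NULL)
        return -1;                  /* drop rather than sleep */
    free(buf);
    return 0;
}
```

The practical consequence for converted call sites is the NULL check: a path that used to wait for memory can now see the allocation fail under pressure, so each caller should handle that case explicitly.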
Re: Kernel oops in mm/slab.c [ kmem_cache_grow() ] with test4-8
Jonathan Earle wrote:
>
> Hi,
>
> I've been having kernel oopses with the 2.4.0-test series and am
> including ksymoops processed output from both test4 and test5
> kernels.  The same oops happens in later kernels too (Tested with
> test6, test7 and test8).
>

Presumably mpls_output() is doing a kmalloc(..., GFP_KERNEL) from within
a softirq.  Hunt that down and turn it into GFP_ATOMIC.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Kernel oops in mm/slab.c [ kmem_cache_grow() ] with test4-8
Hi,

I've been having kernel oopses with the 2.4.0-test series and am including ksymoops processed output from both test4 and test5 kernels.  The same oops happens in later kernels too (Tested with test6, test7 and test8).

The scenario is this: I have an incoming UDP stream at 1mbit.  The router marks packets in this stream, according to port ranges, with 3 (or any # of) marks (via iptables v1.1.1).  iproute2 builds new routing tables based on these marks, and mplsadm, with the tc patch, is called to build LSPs using these routing tables.  Finally, the 3 egress LSPs are rate limited using tc (employing cbq classes) to a value less than the ingress rate (ie: I limited each LSP to 200kbit, for an aggregate egress output rate of 600kbit).

When I start the traffic flowing from our generator, the box panics and freezes quite solidly.  Policing via filters also crashes the box.  If I move the egress rate limiting function to another box, it works okay.  I've also noted that the crash only occurs if I throttle the traffic flow to an egress rate which is less than the ingress rate (ie: ingress flow at 1mbit and egress flow at 1mbit works fine.  If the egress rate is reduced, boom!)

I copied down the oopses and ran 'ksymoops < oops.txt > oops_proc.txt' and pasted them here.  The first is from kernel 2.4.0-test4 and the second from 2.4.0-test5.

NEW: Here's the funny part.  In mm/slab.c, the function kmem_cache_grow() contains a check as follows:

        /*
         * The test for missing atomic flag is performed here, rather than
         * the more obvious place, simply to reduce the critical path length
         * in kmem_cache_alloc().  If a caller is seriously mis-behaving they
         * will eventually be caught here (where it matters).
         */
        /* Commented out Sep 15 since it was crashing my router. */
        /*
        if (in_interrupt() && (flags & SLAB_LEVEL_MASK) != SLAB_ATOMIC)
                BUG();
        */

This is the check that fails and causes the oops.
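For anyone reading the condition for the first time, the following user-space sketch restates it as a plain predicate: the check fires when the caller is in interrupt context (which includes softirq processing) and the level bits of the allocation flags are anything other than the atomic level. The TOY_* constants and the function name are made up for illustration; the real values of SLAB_LEVEL_MASK and SLAB_ATOMIC come from the kernel headers and are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch only; not the kernel's. */
#define TOY_LEVEL_MASK 0x0fu  /* bits that encode the allocation level */
#define TOY_ATOMIC     0x01u  /* level that promises never to sleep */
#define TOY_KERNEL     0x07u  /* level that is allowed to sleep */

/*
 * Returns true when a check shaped like the one quoted above would
 * call BUG(): interrupt context combined with a non-atomic level.
 */
bool toy_check_would_bug(bool in_interrupt, unsigned int flags)
{
    return in_interrupt && (flags & TOY_LEVEL_MASK) != TOY_ATOMIC;
}
```

Read this way, commenting the check out does not make the sleeping allocation safe; it only removes the point at which the misuse is reported.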
Not understanding what is actually being checked, and not knowing the repercussions of tampering with it, I commented out the check, recompiled and reran the test.  I understand that this is not really a fix (it's more akin to just turning my head and pretending that the problem doesn't exist, but... it seems to work.)

The result: Great joy and much celebration!  I'm throwing 7.2mbps at the box, limiting the rate to 900kbit aggregate throughput and it's working!  The numbers I'm getting also seem to jibe with anticipated results.

Cheers!
Jon

ksymoops 0.7c on i686 2.4.0-test4.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.0-test4/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.
invalid operand:
CPU:    0
EIP:    0010:[<c01277fd>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 001b   ebx: c7ffd0c0   ecx:   edx: 0082
esi: 0246   edi: c7ffd0c0   ebp: 0007   esp: c024fe70
ds: 0018   es: 0018   ss: 0018
Process swapper (pid:0, stackpage=c024f000)
Stack: c01fb794 c01fb834 0412 c7ffd0c0 0247 0007 c024fed4 c7d1602e
       c0127aaf c7ffd0c0 0007 c7d170e0 c7d1602e c01eb196 0008 0007
       c7d170e0 c7d1602e c7f8be00 c01b6aaf c7d170e0
Call trace: [<c01fb794>][<c01fb834>][<c0127aaf>][<c01eb196>][<c01b6aaf>][<c01b6c6f>][<c01b6a84>]
       [<c019b1c4>][<c01b6936>][<c01b6a84>][<c019efe3>][<c011b17f>][<c010b8ee>][<c01087e0>][<c01087e0>]
       [<c010a518>][<c01087e0>][<c01087e0>][<c0100018>][<c0108803>][<c0108864>][<c0105000>][<c0100192>]
Code: 0f 0b 83 c4 0c c7 44 24 10 01 00 00 00 89 ee 83 e6 07 b8 03

>>EIP; c01277fd <kmem_cache_grow+69/254>   <=====
Trace; c01fb794 <tvecs+1500/14d4c>
Trace; c01fb834 <tvecs+15a0/14d4c>
Trace; c0127aaf <kmalloc+73/ac>
Trace; c01eb196 <mpls_output+12/26c>
Trace; c01b6aaf <ip_rcv_finish+2b/21c>
Trace; c01b6c6f <ip_rcv_finish+1eb/21c>
Trace; c01b6a84 <ip_rcv_finish+0/21c>
Trace; c019b1c4 <nf_hook_slow+7c/b4>
Trace; c01b6936 <ip_rcv+356/38c>
Trace; c01b6a84 <ip_rcv_finish+0/21c>
Trace; c019efe3 <net_rx_action+123/1e8>
Trace; c011b17f <do_softirq+4f/70>
Trace; c010b8ee <do_IRQ+a6/b8>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c010a518 <ret_from_intr+0/20>
Trace; c01087e0 <default_idle+0/28>
Trace; c01087e0 <default_idle+0/28>
Trace; c0100018 <startup_32+18/13a>
Trace; c0108803 <default_idle+23/28>
Trace; c0108864 <cpu_idle+3c/50>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100192 <L6+0/2>
Code;  c01277fd <kmem_cache_grow+69/254>
<_EIP>:
Code;  c01277fd <kmem_cache_grow+69/254>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01277ff
   2:   83 c4 0c                  add    $0xc,%esp
Code;  c0127802
   5:   c7 44 24 10 01 00 00      movl   $0x1,0x10(%esp,1)
Code;  c0127809
   c:   00
Code;  c012780a
   d:   89 ee                     mov    %ebp,%esi
Code;  c012780c
   f:   83 e6 07                  and    $0x7,%esi
Code;  c012780f
  12:   b8 03 00 00 00            mov    $0x3,%eax

Aiee, killing interrupt h