Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`

> On Nov 24, 2016, at 5:15 AM, Michal Hocko wrote:
>
>> * No rcu_* warnings on that machine with 4.7.2, but with 4.8.4 , 4.8.6 ,
>> 4.8.8 and now 4.9.0-rc5+Pauls patch
>
> I assume you haven't tried the Linus 4.8 kernel without any further
> stable patches? Just to be sure we are not talking about some later
> regression which found its way to the stable tree.

We are also seeing this frequently on our fleet since moving from 4.7.x to 4.8. This is from a machine running vanilla 4.8.6 just a few moments ago:

INFO: rcu_sched detected stalls on CPUs/tasks:
    13-...: (420 ticks this GP) idle=ce1/140/0 softirq=225550784/225550904 fqs=87105
    (detected by 26, t=600030 jiffies, g=68185325, c=68185324, q=344996)
Task dump for CPU 13:
kswapd1 R running task 12200 1840 2 0x0808
 0001 0034 012b 3139 8b643fffb000 8b028cee7cf8 8b028cee7cf8 8b028cee7d08 8b028cee7d08 8b028cee7d18 8b028cee7d18 8b02
Call Trace:
 [] ? shrink_node+0xcd/0x2f0
 [] ? kswapd+0x304/0x710
 [] ? mem_cgroup_shrink_node+0x160/0x160
 [] ? kthread+0xc4/0xe0
 [] ? ret_from_fork+0x1f/0x40
 [] ? kthread_worker_fn+0x140/0x140

The machines lag terribly during these occurrences. Some eventually recover; others spiral down and require a reboot.

-Chris
Re: [Feature Request?] Inline compression of process core dumps
Alan Cox wrote:
>> Looking at the code, it seems to me that format_corename() is appending
>> .pid, regardless if !core_uses_pid and corename[0]=='|', in which case
>> it creates an invalid path for call_usermodehelper_pipe().
>>
>> Bug in the code, or bug in my methods?
>
> This looks somewhat better and might do the trick. Also fixes a very
> very obscure security corner case. If you change core pattern to start
> with the program name then the user can run a program called
> "|myevilhack" as it stands.
>
> The patch checks for "|" in the pattern not the output and doesn't nail
> a pid on to a piped name.

Works great now. Queue this sucker up!

# cat /proc/sys/kernel/core_pattern
|/home/caker/bin/dumper.pl
# ./linux
Segmentation fault (core dumped)
# file /tmp/dumper.out
/tmp/dumper.out: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style

Thanks for everyone's help.

-Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [Feature Request?] Inline compression of process core dumps
Randy Dunlap wrote:
> On Thu, 12 Apr 2007 22:22:18 -0400 Christopher S. Aker wrote:
>> Alan Cox wrote:
>>> Indeed. So useful that in current kernels you can set the core dump
>>> path to be
>>>
>>> "|application"
>>
>> Cool stuff! However, it's not working (2.6.20.6):
>>
>> Core dump to |/home/caker/bin/dumper.pl.4442 pipe failed
>>
>> even though...
>>
>> # cat /proc/sys/kernel/core_uses_pid
>> 0
>> # cat /proc/sys/kernel/core_pattern
>> |/home/caker/bin/dumper.pl
>>
>> Looking at the code, it seems to me that format_corename() is appending
>> .pid, regardless if !core_uses_pid and corename[0]=='|', in which case
>> it creates an invalid path for call_usermodehelper_pipe().
>>
>> Bug in the code, or bug in my methods?
>
> What are you trying to dump? is it a multi-thread group app, not a
> "simple" app? I ask because of this (I'm looking at 2.6.21-rc6)
> mm_users reference (not that I know what that is):
>
>     if (!pid_in_pattern
>         && (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)) {
>             rc = snprintf(out_ptr, out_end - out_ptr,
>                           ".%d", current->tgid);
>             if (rc > out_end - out_ptr)
>                     goto out;
>             out_ptr += rc;
>     }

I saw that too, and unfortunately I don't know what that condition represents, either. It's the only other element in that if statement that could make it take that path, so I'm assuming that's part of the problem.

The process is a UML instance (skas mode, so at least a kernel, userspace, and io thread), which will generate a single, usable, core file just fine with a non-pipe core_pattern...

-Chris
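The condition quoted above can be modelled in a few lines of userspace Python to show why a piped name picks up a pid suffix even with core_uses_pid set to 0. This is only a sketch, not kernel code: the function name and the `skip_for_pipe` flag are hypothetical, the `%` expansion that format_corename() performs first is left out, and the value `mm_users=3` is an assumption (the thread suspects the count was not 1 for the UML process but never confirms it). The `skip_for_pipe` branch mirrors the fix Alan Cox describes in this thread: check for "|" in the pattern and skip the suffix.

```python
def corename_after_pid_step(pattern, pid_in_pattern, core_uses_pid,
                            mm_users, tgid, skip_for_pipe=False):
    """Model the backward-compat step that may append ".<tgid>" to the core name.

    `pattern` stands in for the already-expanded core_pattern; the real
    function expands % specifiers before reaching this step.
    """
    # Hypothetical flag: models a kernel that leaves piped names alone.
    if skip_for_pipe and pattern.startswith("|"):
        return pattern
    # Same shape as the quoted condition:
    # !pid_in_pattern && (core_uses_pid || mm_users != 1)
    if not pid_in_pattern and (core_uses_pid or mm_users != 1):
        return f"{pattern}.{tgid}"
    return pattern


PIPE = "|/home/caker/bin/dumper.pl"

# core_uses_pid is 0, but an mm shared by several threads still triggers the suffix.
broken = corename_after_pid_step(PIPE, False, False, mm_users=3, tgid=4442)
fixed = corename_after_pid_step(PIPE, False, False, mm_users=3, tgid=4442,
                                skip_for_pipe=True)
print(broken)
print(fixed)
```

With these assumed inputs the first call yields the same name as in the "pipe failed" message, and the second leaves the helper path untouched.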
Re: [Feature Request?] Inline compression of process core dumps
Alan Cox wrote:
> Indeed. So useful that in current kernels you can set the core dump
> path to be
>
>    "|application"

Cool stuff! However, it's not working (2.6.20.6):

Core dump to |/home/caker/bin/dumper.pl.4442 pipe failed

even though...

# cat /proc/sys/kernel/core_uses_pid
0
# cat /proc/sys/kernel/core_pattern
|/home/caker/bin/dumper.pl

Looking at the code, it seems to me that format_corename() is appending .pid, regardless if !core_uses_pid and corename[0]=='|', in which case it creates an invalid path for call_usermodehelper_pipe().

Bug in the code, or bug in my methods?

-Chris
[Feature Request?] Inline compression of process core dumps
I've been trying to find a method for compressing process core dumps before they hit disk.

I ask because we've got some fairly large UML processes (1GB for some), and we're trying to capture dumps to help Jeff debug an evasive bug. Our systems use a small root partition and most of the other disk resources on the host are allocated towards the UMLs.

There are userspace solutions to this problem: allowing the uncompressed core dump to spin out to disk and then coming in afterwards and doing the compression, or maybe even a compressed filesystem where the core dumps land, but I just thought I'd throw this out there since it seems it would be a useful feature :)

Thanks,
-Chris
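The "|application" form of core_pattern that comes up in the replies hands the dump to a userspace helper on its standard input, so the compression asked for here can happen before anything reaches disk. The dumper.pl used in the thread is never shown; what follows is a minimal Python sketch of a comparable helper. The names `compress_stream` and `main` are hypothetical, and taking the output path as an argument is an assumption: whether a piped helper receives arguments depends on the kernel version.

```python
import gzip
import shutil
import sys


def compress_stream(src, dest_path, chunk_size=1 << 20):
    """Copy a binary stream into a gzip file, one chunk at a time.

    Streaming keeps memory use flat even for a core of 1GB or more.
    """
    with gzip.open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


def main(argv):
    """Entry point for a pipe helper: argv[1] is the (assumed) output path."""
    if len(argv) != 2:
        return 2  # usage error: expected exactly one output path
    # The kernel writes the core image to the helper's stdin.
    compress_stream(sys.stdin.buffer, argv[1])
    return 0
```

A wrapper script would finish with `sys.exit(main(sys.argv))`. The result can be checked the same way as in the thread, after decompressing, with `file`.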
Re: ebtables problems on 2.6.19.1 *and* 2.6.16.36
Christopher S. Aker wrote:
> Patrick McHardy wrote:
>> I'm trying to reproduce this (without success so far), please send
>> your kernel config and your ebtables script. You could try if 2.6.19
>> works, there were some ebtables changes in 2.6.19.1 that touched
>> this code.
>
> We're hitting this too, on both 2.6.16.36 and 2.6.19.1.
>
> BUG: unable to handle kernel paging request at virtual address f8cec008
>  printing eip:
> c0462272
> *pde =
> Oops: [#1]
> SMP
> Modules linked in: e1000
> CPU:    1
> EIP:    0060:[<c0462272>]    Not tainted VLI
> EFLAGS: 00010286   (2.6.19.1-1-bigmem #1)
> EIP is at translate_table+0x2b3/0xddf
> eax: f8ce2000 ebx: 0004 ecx: f6d53e90 edx: f8ce2000
> esi: f8cebfa0 edi: 000e ebp: esp: f6d53e08
> ds: 007b es: 007b ss: 0068
> Process ebtables (pid: 4788, ti=f6d52000 task=f6d51550 task.ti=f6d52000)
> Stack: f6d53e40 c0540440 0007 f6d53ebc 0001 0028 0004 0fa0 0fd0 f8d38000 f8ce2000 f6d53e90 8000 0004 0014 0014 0600
> Call Trace:
>  [<c0462f5f>] do_replace+0x113/0x6da
>  [<c0142267>] get_page_from_freelist+0x8c/0xa8
>  [<c0463f4c>] do_ebt_set_ctl+0x2d/0x2e
>  [<c03efbc2>] nf_sockopt+0xfa/0xfc
>  [<c03efbe7>] nf_setsockopt+0x23/0x2b
>  [<c03fac35>] ip_setsockopt+0x86/0x91
>  [<c03d54ef>] sock_common_setsockopt+0x23/0x2f
>  [<c03d2d69>] sys_setsockopt+0x61/0xac
>  [<c03d33f3>] sys_socketcall+0x1e9/0x249
>  [<c0114348>] do_page_fault+0x0/0x664
>  [<c0102bc5>] sysenter_past_esp+0x56/0x79
>  [<c047007b>] svc_recv+0x9c/0x3f5
>  ===
> Code: 30 3b 28 0f 83 5c 02 00 00 8b 54 24 30 8b 74 24 24 8b 4c 24 34 8b 5c 24 4c 03 72 24 8b 79 20 89 5c 24 20 c7 44 24 1c 00 00 00 00 <8b> 56 68 8b 46 6c 29 d0 31 d2 89 44 24 14 8b 06 85 c0 0f 84 f7
> EIP: [<c0462272>] translate_table+0x2b3/0xddf SS:ESP 0068:f6d53e08
>
> Unable to handle kernel paging request at virtual address f8a3b00c
>  printing eip:
> c03cce45
> *pde =
> Oops: [#13]
> SMP
> Modules linked in: e1000
> CPU:    1
> EIP:    0060:[<c03cce45>]    Not tainted VLI
> EFLAGS: 00010246   (2.6.16.36-1-bigmem #1)
> EIP is at translate_table+0x47b/0xfc2
> eax: d8fbbc3c ebx: 0098 ecx: c049b780 edx:
> esi: f8a3afa0 edi: 000e ebp: 0001 esp: d8fbbb7c
> ds: 007b es: 007b ss: 0068
> Process ebtables (pid: 7917, threadinfo=d8fba000 task=e7892550)
> Stack: <0>c049b75c f8a3af78 c04468f8 d8fbbbcc c049b740 0007 d8fbbc68 d30f4260 00d2 d8fba000 d30f4240 d8fba000 0028 0004 0004 0fa0 0fd0 f8a8e000 f8a38000
> Call Trace:
>  [<c03cdbd0>] do_replace+0x16b/0x887
>  [<c03ced74>] copy_everything_to_user+0x21a/0x35c
>  [<c03ceef6>] do_ebt_set_ctl+0x40/0x42
>  [<c0354ee0>] nf_sockopt+0x11f/0x121
>  [<c0354f19>] nf_setsockopt+0x37/0x3b
>  [<c0360b14>] ip_setsockopt+0x3f9/0xb0e
>  [<c0354e6e>] nf_sockopt+0xad/0x121
>  [<c0354f54>] nf_getsockopt+0x37/0x3b
>  [<c03617e6>] ip_getsockopt+0x5bd/0x62b
>  [<c012360e>] current_fs_time+0x5d/0x78
>  [<c0178813>] touch_atime+0x7d/0xcd
>  [<c014b366>] zap_pte_range+0xf1/0x316
>  [<c014b68e>] unmap_page_range+0x103/0x174
>  [<c02228a7>] prio_tree_remove+0x77/0xe7
>  [<c014358c>] buffered_rmqueue+0x155/0x209
>  [<c014358c>] buffered_rmqueue+0x155/0x209
>  [<c014376e>] get_page_from_freelist+0x8c/0xa6
>  [<c014376e>] get_page_from_freelist+0x8c/0xa6
>  [<c01437de>] __alloc_pages+0x56/0x309
>  [<c015274c>] page_add_file_rmap+0x2a/0x2c
>  [<c014d48d>] do_anonymous_page+0x122/0x22a
>  [<c014dabd>] __handle_mm_fault+0x138/0x326
>  [<c03391e6>] sock_common_setsockopt+0x33/0x37
>  [<c0336c88>] sys_setsockopt+0x6c/0xb2
>  [<c033739a>] sys_socketcall+0x1f4/0x254
>  [<c01160e5>] do_page_fault+0x0/0x630
>  [<c0102c7f>] sysenter_past_esp+0x54/0x75
> Code: 24 8b bc 24 8c 00 00 00 8b 84 24 88 00 00 00 8b 54 24 64 8b 74 24 44 03 77 24 8b 78 20 c7 44 24 38 00 00 00 00 89 54 24 3c 31 d2 <8b> 4e 6c 8b 5e 68 29 d9 89 4c 24 30 8b 06 85 c0 0f 84 14 02 00
>
> It seems to happen when flushing a user-defined ebtable, or removing a
> rule -- but not every time. It leaves the ebtable userspace process in
> D state on 2.6.19.1 but not on 2.6.16.36 (?).
>
> Considering I've never had these problems before, and that both stable
> (2.6.16.36) and current (2.6.19.1) exhibit this issue, I'd venture to
> guess that it's something that went into both of them very recently.

Just a follow-up -- this doesn't happen with 2.6.19.

-Chris
Re: ebtables problems on 2.6.19.1 *and* 2.6.16.36
Patrick McHardy wrote:
> I'm trying to reproduce this (without success so far), please send
> your kernel config and your ebtables script. You could try if 2.6.19
> works, there were some ebtables changes in 2.6.19.1 that touched
> this code.

We're hitting this too, on both 2.6.16.36 and 2.6.19.1.

BUG: unable to handle kernel paging request at virtual address f8cec008
 printing eip:
c0462272
*pde =
Oops: [#1]
SMP
Modules linked in: e1000
CPU:    1
EIP:    0060:[<c0462272>]    Not tainted VLI
EFLAGS: 00010286   (2.6.19.1-1-bigmem #1)
EIP is at translate_table+0x2b3/0xddf
eax: f8ce2000 ebx: 0004 ecx: f6d53e90 edx: f8ce2000
esi: f8cebfa0 edi: 000e ebp: esp: f6d53e08
ds: 007b es: 007b ss: 0068
Process ebtables (pid: 4788, ti=f6d52000 task=f6d51550 task.ti=f6d52000)
Stack: f6d53e40 c0540440 0007 f6d53ebc 0001 0028 0004 0fa0 0fd0 f8d38000 f8ce2000 f6d53e90 8000 0004 0014 0014 0600
Call Trace:
 [<c0462f5f>] do_replace+0x113/0x6da
 [<c0142267>] get_page_from_freelist+0x8c/0xa8
 [<c0463f4c>] do_ebt_set_ctl+0x2d/0x2e
 [<c03efbc2>] nf_sockopt+0xfa/0xfc
 [<c03efbe7>] nf_setsockopt+0x23/0x2b
 [<c03fac35>] ip_setsockopt+0x86/0x91
 [<c03d54ef>] sock_common_setsockopt+0x23/0x2f
 [<c03d2d69>] sys_setsockopt+0x61/0xac
 [<c03d33f3>] sys_socketcall+0x1e9/0x249
 [<c0114348>] do_page_fault+0x0/0x664
 [<c0102bc5>] sysenter_past_esp+0x56/0x79
 [<c047007b>] svc_recv+0x9c/0x3f5
 ===
Code: 30 3b 28 0f 83 5c 02 00 00 8b 54 24 30 8b 74 24 24 8b 4c 24 34 8b 5c 24 4c 03 72 24 8b 79 20 89 5c 24 20 c7 44 24 1c 00 00 00 00 <8b> 56 68 8b 46 6c 29 d0 31 d2 89 44 24 14 8b 06 85 c0 0f 84 f7
EIP: [<c0462272>] translate_table+0x2b3/0xddf SS:ESP 0068:f6d53e08

Unable to handle kernel paging request at virtual address f8a3b00c
 printing eip:
c03cce45
*pde =
Oops: [#13]
SMP
Modules linked in: e1000
CPU:    1
EIP:    0060:[<c03cce45>]    Not tainted VLI
EFLAGS: 00010246   (2.6.16.36-1-bigmem #1)
EIP is at translate_table+0x47b/0xfc2
eax: d8fbbc3c ebx: 0098 ecx: c049b780 edx:
esi: f8a3afa0 edi: 000e ebp: 0001 esp: d8fbbb7c
ds: 007b es: 007b ss: 0068
Process ebtables (pid: 7917, threadinfo=d8fba000 task=e7892550)
Stack: <0>c049b75c f8a3af78 c04468f8 d8fbbbcc c049b740 0007 d8fbbc68 d30f4260 00d2 d8fba000 d30f4240 d8fba000 0028 0004 0004 0fa0 0fd0 f8a8e000 f8a38000
Call Trace:
 [<c03cdbd0>] do_replace+0x16b/0x887
 [<c03ced74>] copy_everything_to_user+0x21a/0x35c
 [<c03ceef6>] do_ebt_set_ctl+0x40/0x42
 [<c0354ee0>] nf_sockopt+0x11f/0x121
 [<c0354f19>] nf_setsockopt+0x37/0x3b
 [<c0360b14>] ip_setsockopt+0x3f9/0xb0e
 [<c0354e6e>] nf_sockopt+0xad/0x121
 [<c0354f54>] nf_getsockopt+0x37/0x3b
 [<c03617e6>] ip_getsockopt+0x5bd/0x62b
 [<c012360e>] current_fs_time+0x5d/0x78
 [<c0178813>] touch_atime+0x7d/0xcd
 [<c014b366>] zap_pte_range+0xf1/0x316
 [<c014b68e>] unmap_page_range+0x103/0x174
 [<c02228a7>] prio_tree_remove+0x77/0xe7
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c01437de>] __alloc_pages+0x56/0x309
 [<c015274c>] page_add_file_rmap+0x2a/0x2c
 [<c014d48d>] do_anonymous_page+0x122/0x22a
 [<c014dabd>] __handle_mm_fault+0x138/0x326
 [<c03391e6>] sock_common_setsockopt+0x33/0x37
 [<c0336c88>] sys_setsockopt+0x6c/0xb2
 [<c033739a>] sys_socketcall+0x1f4/0x254
 [<c01160e5>] do_page_fault+0x0/0x630
 [<c0102c7f>] sysenter_past_esp+0x54/0x75
Code: 24 8b bc 24 8c 00 00 00 8b 84 24 88 00 00 00 8b 54 24 64 8b 74 24 44 03 77 24 8b 78 20 c7 44 24 38 00 00 00 00 89 54 24 3c 31 d2 <8b> 4e 6c 8b 5e 68 29 d9 89 4c 24 30 8b 06 85 c0 0f 84 14 02 00

It seems to happen when flushing a user-defined ebtable, or removing a rule -- but not every time. It leaves the ebtable userspace process in D state on 2.6.19.1 but not on 2.6.16.36 (?).

Considering I've never had these problems before, and that both stable (2.6.16.36) and current (2.6.19.1) exhibit this issue, I'd venture to guess that it's something that went into both of them very recently.

-Chris
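As a reading aid for the two oopses, the faulting instruction marked `<8b>` in each Code: line can be decoded by hand and checked against the reported fault address. The sketch below handles only the single encoding that appears here (`8b /r` with an 8-bit displacement, i.e. a 32-bit load from register plus offset). The helper names are hypothetical, and the page-size remark assumes the usual 4 KiB i386 pages. It shows where the bad read lands; it does not establish a root cause.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(code):
    """Decode `8b /r disp8` (mov r32, [r32 + disp8]); other forms are rejected."""
    opcode, modrm, disp = code[0], code[1], code[2]
    # mod must be 01 (disp8 follows) and rm must not be 100 (SIB byte follows).
    if opcode != 0x8B or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("unsupported encoding")
    dest = REG32[(modrm >> 3) & 7]
    base = REG32[modrm & 7]
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return dest, base, disp


def fault_address(code, regs):
    """Effective address read by the decoded instruction, wrapped to 32 bits."""
    _, base, disp = decode_mov_disp8(code)
    return (regs[base] + disp) & 0xFFFFFFFF


# Register values are taken from the two register dumps in the report.
addr_2_6_19 = fault_address(bytes.fromhex("8b5668"), {"esi": 0xF8CEBFA0})
addr_2_6_16 = fault_address(bytes.fromhex("8b4e6c"), {"esi": 0xF8A3AFA0})
print(hex(addr_2_6_19), hex(addr_2_6_16))
```

Both computed addresses match the "virtual address" lines in the oopses. In each case esi sits at offset 0xfa0 within its page, so a read 0x68 or 0x6c bytes further on crosses into the following page.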