Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`

2016-11-26 Thread Christopher S. Aker

> On Nov 24, 2016, at 5:15 AM, Michal Hocko  wrote:
> 
>> * No rcu_* warnings on that machine with 4.7.2, but with 4.8.4 , 4.8.6 ,
>> 4.8.8 and now 4.9.0-rc5+Pauls patch
> 
> I assume you haven't tried the Linus 4.8 kernel without any further
> stable patches? Just to be sure we are not talking about some later
> regression which found its way to the stable tree.

We are also seeing this frequently on our fleet since moving from 4.7.x to 4.8. 
This is from a machine running vanilla 4.8.6 just a few moments ago:

INFO: rcu_sched detected stalls on CPUs/tasks:
13-...: (420 ticks this GP) idle=ce1/140/0 
softirq=225550784/225550904 fqs=87105 
(detected by 26, t=600030 jiffies, g=68185325, c=68185324, q=344996)
Task dump for CPU 13:
kswapd1 R  running task12200  1840  2 0x0808
 0001 0034 012b 3139
 8b643fffb000 8b028cee7cf8 8b028cee7cf8 8b028cee7d08
 8b028cee7d08 8b028cee7d18 8b028cee7d18 8b02
Call Trace:
 [] ? shrink_node+0xcd/0x2f0
 [] ? kswapd+0x304/0x710
 [] ? mem_cgroup_shrink_node+0x160/0x160
 [] ? kthread+0xc4/0xe0
 [] ? ret_from_fork+0x1f/0x40
 [] ? kthread_worker_fn+0x140/0x140

The machine will lag terribly during these occurrences .. some will eventually 
recover, some will spiral down and require a reboot.

-Chris



Re: [Feature Request?] Inline compression of process core dumps

2007-04-13 Thread Christopher S. Aker

Alan Cox wrote:
>> Looking at the code, it seems to me that format_corename() is appending
>> .pid, regardless if !core_uses_pid and corename[0]=='|', in which case
>> it creates an invalid path for call_usermodehelper_pipe().
>>
>> Bug in the code, or bug in my methods?
>
> This looks somewhat better and might do the trick. Also fixes a very very
> obscure security corner case. If you change core pattern to start with
> the program name then the user can run a program called "|myevilhack" as
> it stands. The patch checks for "|" in the pattern not the output and
> doesn't nail a pid on to a piped name.

<snip>

Works great now.  Queue this sucker up!

# cat /proc/sys/kernel/core_pattern
|/home/caker/bin/dumper.pl
# ./linux
<blah blah>
Segmentation fault (core dumped)
# file /tmp/dumper.out
/tmp/dumper.out: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style


Thanks for everyone's help.

-Chris
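
The contents of dumper.pl are not shown in the thread. As a minimal sketch of what such a pipe helper could look like, assuming the kernel hands the core image to the helper on stdin (which is how the "|" form of core_pattern is described above), the following Python stand-in gzips the stream before it reaches disk. The function names and the /tmp/dumper.out.gz path are hypothetical, chosen only to mirror /tmp/dumper.out.

```python
#!/usr/bin/env python3
"""Hypothetical core_pattern pipe helper: gzip a core image arriving on stdin."""
import gzip
import shutil
import sys


def compress_stream(src, dst, chunk_size=1 << 20):
    """Copy src into dst through gzip, one chunk at a time.

    The core image is never held in memory as a whole, which matters
    for dumps in the 1GB range. dst is left open for the caller.
    """
    with gzip.GzipFile(fileobj=dst, mode="wb") as gz:
        shutil.copyfileobj(src, gz, chunk_size)


def main(out_path="/tmp/dumper.out.gz"):  # assumed output path
    """Read the core image from stdin and write it compressed to out_path."""
    with open(out_path, "wb") as dst:
        compress_stream(sys.stdin.buffer, dst)
```

To use something like this as a helper, the script would end with a call to main(), be made executable, and be named in core_pattern after the leading "|", in the same way as dumper.pl above.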

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Feature Request?] Inline compression of process core dumps

2007-04-12 Thread Christopher S. Aker

Randy Dunlap wrote:

On Thu, 12 Apr 2007 22:22:18 -0400 Christopher S. Aker wrote:


Alan Cox wrote:
 > Indeed. So useful that in current kernels you can set the core dump
 > path to be
 >
 >   "|application"

Cool stuff!  However, it's not working (2.6.20.6):

Core dump to |/home/caker/bin/dumper.pl.4442 pipe failed

even though...

# cat /proc/sys/kernel/core_uses_pid
0
# cat /proc/sys/kernel/core_pattern
|/home/caker/bin/dumper.pl

Looking at the code, it seems to me that format_corename() is appending 
.pid, regardless if !core_uses_pid and corename[0]=='|', in which case 
it creates an invalid path for call_usermodehelper_pipe().


Bug in the code, or bug in my methods?


What are you trying to dump?  is it a multi-thread group app,
not a "simple" app?  I ask because of this (I'm looking at 2.6.21-rc6)
mm_users reference (not that I know what that is):

	if (!pid_in_pattern
	    && (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)) {
		rc = snprintf(out_ptr, out_end - out_ptr,
			      ".%d", current->tgid);
		if (rc > out_end - out_ptr)
			goto out;
		out_ptr += rc;
	}


I saw that too, and unfortunately I don't know what that condition 
represents, either.  It's the only other element in that if statement 
that could make it take that path, so I'm assuming that's part of the 
problem.


The process is a UML instance (skas mode, so at least a kernel, 
userspace, and io thread), which will generate a single, usable, core 
file just fine with a non-pipe core_pattern...


-Chris
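
The condition Randy quotes can be modeled in user space to see how a ".4442" suffix could appear even with core_uses_pid set to 0. The sketch below is a toy Python model, not the kernel's format_corename(); the name is reused only for readability. It assumes that mm_users is greater than 1 for a UML instance (the value 3 is a guess based on the "kernel, userspace, and io thread" description), and the skip_pid_for_pipe flag is a hypothetical stand-in for the behaviour Alan Cox describes for his patch in the 2007-04-13 message (check for "|" in the pattern and do not append a pid to a piped name). The patch itself is not shown in the thread.

```python
def format_corename(pattern, tgid, core_uses_pid, mm_users, skip_pid_for_pipe):
    """Toy model of the pid-suffix decision; only the %p specifier is handled."""
    pid_in_pattern = "%p" in pattern
    name = pattern.replace("%p", str(tgid))

    # Assumed shape of the fix: a pattern that starts with "|" names a
    # helper program, so no ".pid" is ever appended to it.
    if skip_pid_for_pipe and pattern.startswith("|"):
        return name

    # The condition from the quoted 2.6.21-rc6 code: append ".<tgid>" when
    # the pattern has no %p and either core_uses_pid is set or the address
    # space has more than one user.
    if not pid_in_pattern and (core_uses_pid or mm_users != 1):
        name += ".%d" % tgid
    return name


PATTERN = "|/home/caker/bin/dumper.pl"

# core_uses_pid is 0, but mm_users (assumed to be 3 here) still triggers the suffix.
before = format_corename(PATTERN, 4442, core_uses_pid=False, mm_users=3,
                         skip_pid_for_pipe=False)
after = format_corename(PATTERN, 4442, core_uses_pid=False, mm_users=3,
                        skip_pid_for_pipe=True)
print(before)
print(after)
```

Under these assumptions the first call reproduces the path from the "pipe failed" message, and the second leaves the helper path untouched.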



Re: [Feature Request?] Inline compression of process core dumps

2007-04-12 Thread Christopher S. Aker

Alan Cox wrote:
> Indeed. So useful that in current kernels you can set the core dump
> path to be
>
>"|application"

Cool stuff!  However, it's not working (2.6.20.6):

Core dump to |/home/caker/bin/dumper.pl.4442 pipe failed

even though...

# cat /proc/sys/kernel/core_uses_pid
0
# cat /proc/sys/kernel/core_pattern
|/home/caker/bin/dumper.pl

Looking at the code, it seems to me that format_corename() is appending 
.pid, regardless if !core_uses_pid and corename[0]=='|', in which case 
it creates an invalid path for call_usermodehelper_pipe().


Bug in the code, or bug in my methods?

-Chris


[Feature Request?] Inline compression of process core dumps

2007-04-12 Thread Christopher S. Aker
I've been trying to find a method for compressing process core dumps 
before they hit disk.


I ask because we've got some fairly large UML processes (1GB for some), 
and we're trying to capture dumps to help Jeff debug an evasive bug. 
Our systems use a small root partition and most of the other disk 
resources on the host are allocated towards the UMLs.


There are userspace solutions to this problem:  allowing the 
uncompressed core dump to spin out to disk and then coming in afterwards 
and doing the compression, or maybe even a compressed filesystem where 
the core dumps land, but I just thought I'd throw this out there since 
it seems it would be a useful feature :)


Thanks,
-Chris



Re: ebtables problems on 2.6.19.1 *and* 2.6.16.36

2006-12-24 Thread Christopher S. Aker

Christopher S. Aker wrote:

Patrick McHardy wrote:

I'm trying to reproduce this (without success so far), please send your
kernel config and your ebtables script.

You could try if 2.6.19 works, there were some ebtables changes in
2.6.19.1 that touched this code.


We're hitting this too, on both 2.6.16.36 and 2.6.19.1.

BUG: unable to handle kernel paging request at virtual address f8cec008
 printing eip:
c0462272
*pde = 
Oops:  [#1]
SMP
Modules linked in: e1000
CPU:1
EIP:0060:[<c0462272>]Not tainted VLI
EFLAGS: 00010286   (2.6.19.1-1-bigmem #1)
EIP is at translate_table+0x2b3/0xddf
eax: f8ce2000   ebx: 0004   ecx: f6d53e90   edx: f8ce2000
esi: f8cebfa0   edi: 000e   ebp:    esp: f6d53e08
ds: 007b   es: 007b   ss: 0068
Process ebtables (pid: 4788, ti=f6d52000 task=f6d51550 task.ti=f6d52000)
Stack: f6d53e40 c0540440 0007 f6d53ebc 0001 0028  

   0004 0fa0 0fd0 f8d38000 f8ce2000 f6d53e90  
8000
      0004 0014  0014 
0600

Call Trace:
 [<c0462f5f>] do_replace+0x113/0x6da
 [<c0142267>] get_page_from_freelist+0x8c/0xa8
 [<c0463f4c>] do_ebt_set_ctl+0x2d/0x2e
 [<c03efbc2>] nf_sockopt+0xfa/0xfc
 [<c03efbe7>] nf_setsockopt+0x23/0x2b
 [<c03fac35>] ip_setsockopt+0x86/0x91
 [<c03d54ef>] sock_common_setsockopt+0x23/0x2f
 [<c03d2d69>] sys_setsockopt+0x61/0xac
 [<c03d33f3>] sys_socketcall+0x1e9/0x249
 [<c0114348>] do_page_fault+0x0/0x664
 [<c0102bc5>] sysenter_past_esp+0x56/0x79
 [<c047007b>] svc_recv+0x9c/0x3f5
 ===
Code: 30 3b 28 0f 83 5c 02 00 00 8b 54 24 30 8b 74 24 24 8b 4c 24 34 8b 
5c 24 4c 03 72 24 8b 79 20 89 5c 24 20 c7 44 24 1c 00 00 00 00 <8b> 56 
68 8b 46 6c 29 d0 31 d2 89 44 24 14 8b 06 85 c0 0f 84 f7

EIP: [<c0462272>] translate_table+0x2b3/0xddf SS:ESP 0068:f6d53e08


Unable to handle kernel paging request at virtual address f8a3b00c
 printing eip:
c03cce45
*pde = 
Oops:  [#13]
SMP
Modules linked in: e1000
CPU:1
EIP:0060:[<c03cce45>]Not tainted VLI
EFLAGS: 00010246   (2.6.16.36-1-bigmem #1)
EIP is at translate_table+0x47b/0xfc2
eax: d8fbbc3c   ebx: 0098   ecx: c049b780   edx: 
esi: f8a3afa0   edi: 000e   ebp: 0001   esp: d8fbbb7c
ds: 007b   es: 007b   ss: 0068
Process ebtables (pid: 7917, threadinfo=d8fba000 task=e7892550)
Stack: <0>c049b75c f8a3af78 c04468f8 d8fbbbcc c049b740 0007 d8fbbc68 
d30f4260
   00d2 d8fba000 d30f4240 d8fba000 0028 0004  
0004
    0fa0 0fd0 f8a8e000  f8a38000  


Call Trace:
 [<c03cdbd0>] do_replace+0x16b/0x887
 [<c03ced74>] copy_everything_to_user+0x21a/0x35c
 [<c03ceef6>] do_ebt_set_ctl+0x40/0x42
 [<c0354ee0>] nf_sockopt+0x11f/0x121
 [<c0354f19>] nf_setsockopt+0x37/0x3b
 [<c0360b14>] ip_setsockopt+0x3f9/0xb0e
 [<c0354e6e>] nf_sockopt+0xad/0x121
 [<c0354f54>] nf_getsockopt+0x37/0x3b
 [<c03617e6>] ip_getsockopt+0x5bd/0x62b
 [<c012360e>] current_fs_time+0x5d/0x78
 [<c0178813>] touch_atime+0x7d/0xcd
 [<c014b366>] zap_pte_range+0xf1/0x316
 [<c014b68e>] unmap_page_range+0x103/0x174
 [<c02228a7>] prio_tree_remove+0x77/0xe7
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c01437de>] __alloc_pages+0x56/0x309
 [<c015274c>] page_add_file_rmap+0x2a/0x2c
 [<c014d48d>] do_anonymous_page+0x122/0x22a
 [<c014dabd>] __handle_mm_fault+0x138/0x326
 [<c03391e6>] sock_common_setsockopt+0x33/0x37
 [<c0336c88>] sys_setsockopt+0x6c/0xb2
 [<c033739a>] sys_socketcall+0x1f4/0x254
 [<c01160e5>] do_page_fault+0x0/0x630
 [<c0102c7f>] sysenter_past_esp+0x54/0x75
Code: 24 8b bc 24 8c 00 00 00 8b 84 24 88 00 00 00 8b 54 24 64 8b 74 24 
44 03 77 24 8b 78 20 c7 44 24 38 00 00 00 00 89 54 24 3c 31 d2 <8b> 4e 
6c 8b 5e 68 29 d9 89 4c 24 30 8b 06 85 c0 0f 84 14 02 00



It seems to happen when flushing a user-defined ebtable, or removing a 
rule -- but not every time. It leaves the ebtable userspace process in D 
state on 2.6.19.1 but not on 2.6.16.36 (?).


Considering I've never had these problems before, and that both stable 
(2.6.16.36) and current (2.6.19.1) exhibit this issue, I'd venture to 
guess that it's something that went into both of them very recently.


Just a follow-up -- this doesn't happen with 2.6.19.

-Chris
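
One thing the two dumps above have in common can be checked with a little arithmetic. Reading the byte marked <8b> in each Code: line as the start of the faulting instruction (the usual meaning of the angle brackets in an oops), both decode as a 32-bit load relative to %esi. In both cases esi plus the displacement equals the reported fault address and lands just past a 4 KiB page boundary. That would be consistent with translate_table() reading an entry that extends past the end of its mapped table, but that interpretation is an assumption here, not something the thread confirms. The sketch below, with hypothetical helper names, only verifies the arithmetic:

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
PAGE_SHIFT = 12  # 4 KiB pages on i386


def decode_mov_load(modrm, disp8):
    """Decode `8b <modrm> <disp8>` as mov disp8(%base), %dest (32-bit load)."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the plain base + 8-bit displacement form is handled")
    disp = disp8 - 0x100 if disp8 >= 0x80 else disp8  # disp8 is sign-extended
    return REGS[reg], REGS[rm], disp


def check(regs, insn, reported):
    """Return (dest, base, computed address, matches report, crosses page)."""
    dest, base, disp = decode_mov_load(*insn)
    addr = (regs[base] + disp) & 0xFFFFFFFF
    crosses_page = (addr >> PAGE_SHIFT) != (regs[base] >> PAGE_SHIFT)
    return dest, base, addr, addr == reported, crosses_page


# (kernel, registers from the dump, bytes following <8b>, reported fault address)
OOPSES = [
    ("2.6.19.1", {"esi": 0xF8CEBFA0}, (0x56, 0x68), 0xF8CEC008),
    ("2.6.16.36", {"esi": 0xF8A3AFA0}, (0x4E, 0x6C), 0xF8A3B00C),
]

for kernel, regs, insn, reported in OOPSES:
    dest, base, addr, matches, crosses = check(regs, insn, reported)
    print(f"{kernel}: mov {insn[1]:#x}(%{base}),%{dest} -> {addr:08x} "
          f"matches_report={matches} past_page_boundary={crosses}")
```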




Re: ebtables problems on 2.6.19.1 *and* 2.6.16.36

2006-12-23 Thread Christopher S. Aker

Patrick McHardy wrote:

I'm trying to reproduce this (without success so far), please send your
kernel config and your ebtables script.

You could try if 2.6.19 works, there were some ebtables changes in
2.6.19.1 that touched this code.


We're hitting this too, on both 2.6.16.36 and 2.6.19.1.

BUG: unable to handle kernel paging request at virtual address f8cec008
 printing eip:
c0462272
*pde = 
Oops:  [#1]
SMP
Modules linked in: e1000
CPU:1
EIP:0060:[<c0462272>]Not tainted VLI
EFLAGS: 00010286   (2.6.19.1-1-bigmem #1)
EIP is at translate_table+0x2b3/0xddf
eax: f8ce2000   ebx: 0004   ecx: f6d53e90   edx: f8ce2000
esi: f8cebfa0   edi: 000e   ebp:    esp: f6d53e08
ds: 007b   es: 007b   ss: 0068
Process ebtables (pid: 4788, ti=f6d52000 task=f6d51550 task.ti=f6d52000)
Stack: f6d53e40 c0540440 0007 f6d53ebc 0001 0028  

   0004 0fa0 0fd0 f8d38000 f8ce2000 f6d53e90  
8000
      0004 0014  0014 
0600

Call Trace:
 [<c0462f5f>] do_replace+0x113/0x6da
 [<c0142267>] get_page_from_freelist+0x8c/0xa8
 [<c0463f4c>] do_ebt_set_ctl+0x2d/0x2e
 [<c03efbc2>] nf_sockopt+0xfa/0xfc
 [<c03efbe7>] nf_setsockopt+0x23/0x2b
 [<c03fac35>] ip_setsockopt+0x86/0x91
 [<c03d54ef>] sock_common_setsockopt+0x23/0x2f
 [<c03d2d69>] sys_setsockopt+0x61/0xac
 [<c03d33f3>] sys_socketcall+0x1e9/0x249
 [<c0114348>] do_page_fault+0x0/0x664
 [<c0102bc5>] sysenter_past_esp+0x56/0x79
 [<c047007b>] svc_recv+0x9c/0x3f5
 ===
Code: 30 3b 28 0f 83 5c 02 00 00 8b 54 24 30 8b 74 24 24 8b 4c 24 34 8b 
5c 24 4c 03 72 24 8b 79 20 89 5c 24 20 c7 44 24 1c 00 00 00 00 <8b> 56 
68 8b 46 6c 29 d0 31 d2 89 44 24 14 8b 06 85 c0 0f 84 f7

EIP: [<c0462272>] translate_table+0x2b3/0xddf SS:ESP 0068:f6d53e08


Unable to handle kernel paging request at virtual address f8a3b00c
 printing eip:
c03cce45
*pde = 
Oops:  [#13]
SMP
Modules linked in: e1000
CPU:1
EIP:0060:[<c03cce45>]Not tainted VLI
EFLAGS: 00010246   (2.6.16.36-1-bigmem #1)
EIP is at translate_table+0x47b/0xfc2
eax: d8fbbc3c   ebx: 0098   ecx: c049b780   edx: 
esi: f8a3afa0   edi: 000e   ebp: 0001   esp: d8fbbb7c
ds: 007b   es: 007b   ss: 0068
Process ebtables (pid: 7917, threadinfo=d8fba000 task=e7892550)
Stack: <0>c049b75c f8a3af78 c04468f8 d8fbbbcc c049b740 0007 d8fbbc68 
d30f4260
   00d2 d8fba000 d30f4240 d8fba000 0028 0004  
0004
    0fa0 0fd0 f8a8e000  f8a38000  


Call Trace:
 [<c03cdbd0>] do_replace+0x16b/0x887
 [<c03ced74>] copy_everything_to_user+0x21a/0x35c
 [<c03ceef6>] do_ebt_set_ctl+0x40/0x42
 [<c0354ee0>] nf_sockopt+0x11f/0x121
 [<c0354f19>] nf_setsockopt+0x37/0x3b
 [<c0360b14>] ip_setsockopt+0x3f9/0xb0e
 [<c0354e6e>] nf_sockopt+0xad/0x121
 [<c0354f54>] nf_getsockopt+0x37/0x3b
 [<c03617e6>] ip_getsockopt+0x5bd/0x62b
 [<c012360e>] current_fs_time+0x5d/0x78
 [<c0178813>] touch_atime+0x7d/0xcd
 [<c014b366>] zap_pte_range+0xf1/0x316
 [<c014b68e>] unmap_page_range+0x103/0x174
 [<c02228a7>] prio_tree_remove+0x77/0xe7
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014358c>] buffered_rmqueue+0x155/0x209
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c014376e>] get_page_from_freelist+0x8c/0xa6
 [<c01437de>] __alloc_pages+0x56/0x309
 [<c015274c>] page_add_file_rmap+0x2a/0x2c
 [<c014d48d>] do_anonymous_page+0x122/0x22a
 [<c014dabd>] __handle_mm_fault+0x138/0x326
 [<c03391e6>] sock_common_setsockopt+0x33/0x37
 [<c0336c88>] sys_setsockopt+0x6c/0xb2
 [<c033739a>] sys_socketcall+0x1f4/0x254
 [<c01160e5>] do_page_fault+0x0/0x630
 [<c0102c7f>] sysenter_past_esp+0x54/0x75
Code: 24 8b bc 24 8c 00 00 00 8b 84 24 88 00 00 00 8b 54 24 64 8b 74 24 
44 03 77 24 8b 78 20 c7 44 24 38 00 00 00 00 89 54 24 3c 31 d2 <8b> 4e 
6c 8b 5e 68 29 d9 89 4c 24 30 8b 06 85 c0 0f 84 14 02 00



It seems to happen when flushing a user-defined ebtable, or removing a 
rule -- but not every time. It leaves the ebtable userspace process in D 
state on 2.6.19.1 but not on 2.6.16.36 (?).


Considering I've never had these problems before, and that both stable 
(2.6.16.36) and current (2.6.19.1) exhibit this issue, I'd venture to 
guess that it's something that went into both of them very recently.


-Chris


