I stumbled across a BUG in the 2.4.0.t10 kernel. It occurs after
some time and after some days of number crunching and disk usage. It is
triggered most frequently by tripwire runs after upgrades, which actually
stress both the disk and the CPU (calculating signatures).

Here is the output of ver_linux:
Linux serverlx 2.4.0-test10 #1 Fri Nov 3 13:52:55 CET 2000 i586 unknown
Kernel modules         2.3.17
Gnu C                  2.95.2
Gnu Make               3.79.1
Binutils               2.9.5.0.37
Linux C Library        2.1.3
Dynamic linker         ldd: version 1.9.11
Procps                 2.0.6
Mount                  2.10f
Net-tools              2.05
Console-tools          0.2.3
Sh-utils               2.0
Modules Loaded         nfsd soundcore autofs rtc ipt_REJECT iptable_filter
serial isa-pnp usb-ohci usbcore ip_conntrack_ftp ip_conntrack ip_tables
ipx nfs lockd sunrpc de4x5

Here is the relevant snippet of dmesg:
nfs warning: mount version older than kernel
kernel BUG at page_alloc.c:214!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c012a0da>]
EFLAGS: 00010286
eax: 00000020   ebx: c1361a04   ecx: c27c0000   edx: c0360160
esi: 00000001   edi: c0270f80   ebp: 00000000   esp: c27c1ed0
ds: 0018   es: 0018   ss: 0018
Process tripwire (pid: 29734, stackpage=c27c1000)
Stack: c02190ea c0219398 000000d6 c0270f58 c0271130 00000002 00000000
0000bbad 
       00000282 00000000 c0270f58 c012a18a 00000000 c0271138 00000000
00000000 
       c012a2cc c027112c 00000000 00000002 00000001 00000000 00000000
c49e19fc 
Call Trace: [<c02190ea>] [<c0219398>] [<c012a18a>] [<c012a2cc>]
[<c01232b3>] [<c0123521>] [<c0123460>] 
       [<c012d925>] [<c010a2f3>] 
Code: 0f 0b 83 c4 0c 90 89 d8 eb 14 47 83 c6 0c 83 ff 09 0f 86 ef 
kernel BUG at page_alloc.c:84!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0129bc9>]
EFLAGS: 00010282
eax: 0000001f   ebx: c1361a04   ecx: c15e8000   edx: 00000000
esi: c1361a20   edi: 00005561   ebp: 00000000   esp: c15e9f80
ds: 0018   es: 0018   ss: 0018
Process kflushd (pid: 5, stackpage=c15e9000)
Stack: c02190ea c0219398 00000054 c1361a04 c1361a20 00005561 00000007
00005561 
       00000006 00000000 00000001 c0128d75 c012a4db c0128f3b 00000000
00000001 
       c15e8000 0008e000 00000000 00000000 00000000 00000000 c013127e
00000003 
Call Trace: [<c02190ea>] [<c0219398>] [<c0128d75>] [<c012a4db>]
[<c0128f3b>] [<c013127e>] [<c01089cc>] 
Code: 0f 0b 83 c4 0c 89 f6 89 d8 2b 05 b8 6e 2e c0 69 c0 f1 f0 f0 
kernel BUG at vmscan.c:532!
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0128a9d>]
EFLAGS: 00010286
eax: 0000001c   ebx: c1361a20   ecx: cb88c000   edx: c0360160
esi: c1361a04   edi: c1daccdc   ebp: 000001dd   esp: cb88de18
ds: 0018   es: 0018   ss: 0018
Process tripwire (pid: 30247, stackpage=cb88d000)
Stack: c021896a c0218c09 00000214 c0270f58 c0271158 00000001 00000000
c012a178 
       c0270f58 00000000 c0271160 00000000 00000007 c012a2e9 c0271154
00000000 
       00000001 00000001 00000040 00000007 c144c0c0 00000007 00000007
00000001 
Call Trace: [<c021896a>] [<c0218c09>] [<c012a178>] [<c012a2e9>]
[<c012a490>] [<c012722a>] [<c01273c0>] 
       [<c013fca1>] [<c013ff66>] [<c014fd71>] [<c0137083>] [<c0137767>]
[<c0137d4a>] [<c0134bb1>] [<c012d677>] 
       [<c010a2f3>] 
Code: 0f 0b 83 c4 0c 31 c0 0f b3 46 18 8d 4e 28 8d 46 2c 39 46 2c 


And here is the output of ksymoops:

ksymoops 2.3.4 on i586 2.4.0-test10.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.0-test10/ (default)
     -m /boot/System.map-2.4.0-test10 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

invalid operand: 0000
CPU:    0
EIP:    0010:[<c012a0da>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 00000020   ebx: c1361a04   ecx: c27c0000   edx: c0360160
esi: 00000001   edi: c0270f80   ebp: 00000000   esp: c27c1ed0
ds: 0018   es: 0018   ss: 0018
Process tripwire (pid: 29734, stackpage=c27c1000)
Stack: c02190ea c0219398 000000d6 c0270f58 c0271130 00000002 00000000 0000bbad 
       00000282 00000000 c0270f58 c012a18a 00000000 c0271138 00000000 00000000 
       c012a2cc c027112c 00000000 00000002 00000001 00000000 00000000 c49e19fc 
Call Trace: [<c02190ea>] [<c0219398>] [<c012a18a>] [<c012a2cc>] [<c01232b3>] 
[<c0123521>] [<c0123460>] 
       [<c012d925>] [<c010a2f3>] 
Code: 0f 0b 83 c4 0c 90 89 d8 eb 14 47 83 c6 0c 83 ff 09 0f 86 ef 

>>EIP; c012a0da <rmqueue+222/248>   <=====
Trace; c02190ea <tvecs+2a9e/ebdc>
Trace; c0219398 <tvecs+2d4c/ebdc>
Trace; c012a18a <__alloc_pages_limit+8a/ac>
Trace; c012a2cc <__alloc_pages+120/2d0>
Trace; c01232b3 <do_generic_file_read+35b/508>
Trace; c0123521 <generic_file_read+59/74>
Trace; c0123460 <file_read_actor+0/68>
Trace; c012d925 <sys_read+95/cc>
Trace; c010a2f3 <system_call+33/40>
Code;  c012a0da <rmqueue+222/248>
00000000 <_EIP>:
Code;  c012a0da <rmqueue+222/248>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c012a0dc <rmqueue+224/248>
   2:   83 c4 0c                  add    $0xc,%esp
Code;  c012a0df <rmqueue+227/248>
   5:   90                        nop    
Code;  c012a0e0 <rmqueue+228/248>
   6:   89 d8                     mov    %ebx,%eax
Code;  c012a0e2 <rmqueue+22a/248>
   8:   eb 14                     jmp    1e <_EIP+0x1e> c012a0f8 <rmqueue+240/248>
Code;  c012a0e4 <rmqueue+22c/248>
   a:   47                        inc    %edi
Code;  c012a0e5 <rmqueue+22d/248>
   b:   83 c6 0c                  add    $0xc,%esi
Code;  c012a0e8 <rmqueue+230/248>
   e:   83 ff 09                  cmp    $0x9,%edi
Code;  c012a0eb <rmqueue+233/248>
  11:   0f 86 ef 00 00 00         jbe    106 <_EIP+0x106> c012a1e0 
<__alloc_pages+34/2d0>

invalid operand: 0000
CPU:    0
EIP:    0010:[<c0129bc9>]
EFLAGS: 00010282
eax: 0000001f   ebx: c1361a04   ecx: c15e8000   edx: 00000000
esi: c1361a20   edi: 00005561   ebp: 00000000   esp: c15e9f80
ds: 0018   es: 0018   ss: 0018
Process kflushd (pid: 5, stackpage=c15e9000)
Stack: c02190ea c0219398 00000054 c1361a04 c1361a20 00005561 00000007 00005561 
       00000006 00000000 00000001 c0128d75 c012a4db c0128f3b 00000000 00000001 
       c15e8000 0008e000 00000000 00000000 00000000 00000000 c013127e 00000003 
Call Trace: [<c02190ea>] [<c0219398>] [<c0128d75>] [<c012a4db>] [<c0128f3b>] 
[<c013127e>] [<c01089cc>] 
Code: 0f 0b 83 c4 0c 89 f6 89 d8 2b 05 b8 6e 2e c0 69 c0 f1 f0 f0 

>>EIP; c0129bc9 <__free_pages_ok+49/338>   <=====
Trace; c02190ea <tvecs+2a9e/ebdc>
Trace; c0219398 <tvecs+2d4c/ebdc>
Trace; c0128d75 <page_launder+285/700>
Trace; c012a4db <__free_pages+13/14>
Trace; c0128f3b <page_launder+44b/700>
Trace; c013127e <bdflush+86/10c>
Trace; c01089cc <kernel_thread+28/38>
Code;  c0129bc9 <__free_pages_ok+49/338>
00000000 <_EIP>:
Code;  c0129bc9 <__free_pages_ok+49/338>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c0129bcb <__free_pages_ok+4b/338>
   2:   83 c4 0c                  add    $0xc,%esp
Code;  c0129bce <__free_pages_ok+4e/338>
   5:   89 f6                     mov    %esi,%esi
Code;  c0129bd0 <__free_pages_ok+50/338>
   7:   89 d8                     mov    %ebx,%eax
Code;  c0129bd2 <__free_pages_ok+52/338>
   9:   2b 05 b8 6e 2e c0         sub    0xc02e6eb8,%eax
Code;  c0129bd8 <__free_pages_ok+58/338>
   f:   69 c0 f1 f0 f0 00         imul   $0xf0f0f1,%eax,%eax

invalid operand: 0000
CPU:    0
EIP:    0010:[<c0128a9d>]
EFLAGS: 00010286
eax: 0000001c   ebx: c1361a20   ecx: cb88c000   edx: c0360160
esi: c1361a04   edi: c1daccdc   ebp: 000001dd   esp: cb88de18
ds: 0018   es: 0018   ss: 0018
Process tripwire (pid: 30247, stackpage=cb88d000)
Stack: c021896a c0218c09 00000214 c0270f58 c0271158 00000001 00000000 c012a178 
       c0270f58 00000000 c0271160 00000000 00000007 c012a2e9 c0271154 00000000 
       00000001 00000001 00000040 00000007 c144c0c0 00000007 00000007 00000001 
Call Trace: [<c021896a>] [<c0218c09>] [<c012a178>] [<c012a2e9>] [<c012a490>] 
[<c012722a>] [<c01273c0>] 
       [<c013fca1>] [<c013ff66>] [<c014fd71>] [<c0137083>] [<c0137767>] [<c0137d4a>] 
[<c0134bb1>] [<c012d677>] 
       [<c010a2f3>] 
Code: 0f 0b 83 c4 0c 31 c0 0f b3 46 18 8d 4e 28 8d 46 2c 39 46 2c 

>>EIP; c0128a9d <reclaim_page+355/3a8>   <=====
Trace; c021896a <tvecs+231e/ebdc>
Trace; c0218c09 <tvecs+25bd/ebdc>
Trace; c012a178 <__alloc_pages_limit+78/ac>
Trace; c012a2e9 <__alloc_pages+13d/2d0>
Trace; c012a490 <__get_free_pages+14/20>
Trace; c012722a <kmem_cache_grow+b6/208>
Trace; c01273c0 <kmem_cache_alloc+44/54>
Trace; c013fca1 <get_new_inode+19/140>
Trace; c013ff66 <iget4+c2/d4>
Trace; c014fd71 <ext2_lookup+59/80>
Trace; c0137083 <real_lookup+4f/bc>
Trace; c0137767 <path_walk+53f/778>
Trace; c0137d4a <__user_walk+3a/54>
Trace; c0134bb1 <sys_newlstat+15/70>
Trace; c012d677 <sys_close+43/54>
Trace; c010a2f3 <system_call+33/40>
Code;  c0128a9d <reclaim_page+355/3a8>
00000000 <_EIP>:
Code;  c0128a9d <reclaim_page+355/3a8>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c0128a9f <reclaim_page+357/3a8>
   2:   83 c4 0c                  add    $0xc,%esp
Code;  c0128aa2 <reclaim_page+35a/3a8>
   5:   31 c0                     xor    %eax,%eax
Code;  c0128aa4 <reclaim_page+35c/3a8>
   7:   0f b3 46 18               btr    %eax,0x18(%esi)
Code;  c0128aa8 <reclaim_page+360/3a8>
   b:   8d 4e 28                  lea    0x28(%esi),%ecx
Code;  c0128aab <reclaim_page+363/3a8>
   e:   8d 46 2c                  lea    0x2c(%esi),%eax
Code;  c0128aae <reclaim_page+366/3a8>
  11:   39 46 2c                  cmp    %eax,0x2c(%esi)


1 warning issued.  Results may not be reliable.

I hope I gave enough information to track down the problem. Let me know if
any more information/test are wanted to squash this bug, I will gladly
help. I am a developer, although not a kernel developer.

Thanks in advance

Giacomo Mulas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Reply via email to