oops with kernel 2.4.5

2001-06-08 Thread szonyi calin

Hi,
we found an oops in our logs; here is the output from
ksymoops (2.4.1):

Unable to handle kernel NULL pointer dereference at
virtual address 0004
c012db89
*pde = 
Oops: 0002
CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax:    ebx: c00725c0   ecx: c00725c0   edx:
0004
esi: c00725c0   edi: c00725c0   ebp:    esp:
c10a9f70
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 3, stackpage=c10a9000)
Stack: c0130092 c00725c0 c1076e94 c1076e78 c025eb90
0027 c00725c0 0003 
   c0126cb1 c1076e78   0004
 0008e000  
    0004  003c c0127551
0004  c10a8000 
Call Trace: [] [] []
[] [] [] 
Code: 89 02 c7 41 30 00 00 00 00 31 c0 66 8b 41 0a 50
51 e8 0d ff
>>EIP; c012db89 <__remove_from_queues+19/34>   <=
Trace; c0130092 
Trace; c0126cb1 
Trace; c0127551 
Trace; c01275df 
Trace; c0105000 
Code;  c012db89 <__remove_from_queues+19/34>
 <_EIP>:
Code;  c012db89 <__remove_from_queues+19/34>   <=
   0:   89 02                   movl   %eax,(%edx)   <=
Code;  c012db8b <__remove_from_queues+1b/34>
   2:   c7 41 30 00 00 00 00    movl   $0x0,0x30(%ecx)
Code;  c012db92 <__remove_from_queues+22/34>
   9:   31 c0                   xorl   %eax,%eax
Code;  c012db94 <__remove_from_queues+24/34>
   b:   66 8b 41 0a             movw   0xa(%ecx),%ax
Code;  c012db98 <__remove_from_queues+28/34>
   f:   50                      pushl  %eax
Code;  c012db99 <__remove_from_queues+29/34>
  10:   51                      pushl  %ecx
Code;  c012db9a <__remove_from_queues+2a/34>
  11:   e8 0d ff 00 00          call   ff23
<_EIP+0xff23> c013daac 
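
For readers decoding the report above: on i386 the number printed after
"Oops:" is the page-fault error code in hexadecimal, and its low three
bits describe the access that faulted. The Python sketch below decodes
those bits; the function name is made up for illustration and is not part
of ksymoops. For "Oops: 0002" it reports a kernel-mode write to a page
that is not present, which fits the faulting store movl %eax,(%edx) with
edx shown as 0004.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


# "Oops: 0002" from the report; the field is printed in hexadecimal.
print(decode_oops_error_code(int("0002", 16)))
```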

Well ?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: oops with kernel 2.4.5

2001-06-08 Thread Adam


Well, my guess is that the compiler is miscompiling your kernel.

Still, _contrary_ to the REPORTING-BUGS file, you did not give any info
about your system.

Some useful things you should email (adjust them to your setup):

a)

cd /usr/src/linux
rm fs/buffer.o
make fs/buffer.o

Email the output of the make, then find out which gcc was used (gcc, kgcc
etc.) and email its version, i.e.

b)

gcc -v

Then run the following:

c)

gdb vmlinux

disassemble __remove_from_queues

Run the disassemble command inside gdb and email the output of all three
steps above; then people on LKML might be able to help you better.
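
The three steps above could be collected into one report with a small
helper. This is only a sketch under assumptions: the function names are
hypothetical, the compiler is assumed to be called gcc (substitute kgcc
or whatever the build actually used), and the gdb invocation assumes a
gdb recent enough to support the -batch and -ex options.

```python
import subprocess


def build_report_commands(symbol="__remove_from_queues", obj="fs/buffer.o"):
    """Return the argv lists for steps a), b) and c), in order."""
    return [
        ["rm", obj],        # a) force a rebuild of the object file
        ["make", obj],      # a) its output shows the compiler command line
        ["gcc", "-v"],      # b) compiler version (assumed to be plain gcc)
        # c) non-interactive equivalent of typing the command in gdb
        ["gdb", "-batch", "-ex", f"disassemble {symbol}", "vmlinux"],
    ]


def collect_report(src_dir="/usr/src/linux"):
    """Run each command in the kernel source tree and gather its output."""
    chunks = []
    for argv in build_report_commands():
        result = subprocess.run(argv, cwd=src_dir, capture_output=True, text=True)
        # gcc -v writes to stderr, so keep both streams.
        chunks.append("$ " + " ".join(argv) + "\n" + result.stdout + result.stderr)
    return "\n".join(chunks)
```

Calling collect_report() from a shell with write access to the source
tree returns one string that can be pasted into the bug report.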

On Fri, 8 Jun 2001, szonyi calin wrote:

> [...]

-- 
Adam
http://www.eax.com  The Supreme Headquarters of the 32 bit registers



Oops with kernel 2.4.5 on heavy disk traffic

2001-06-10 Thread Alois Treindl


I run kernel 2.4.5 on a Dell PowerEdge 2450 with 1.5 GB RAM,
an onboard Adaptec SCSI controller, dual Pentium III 933 MHz,
and 3 disks (160 MB/s transfer rate, 36 GB each).

Today I put the system under heavy load (load level 15, about 20
httpd processes, and three concurrent copies of large file trees between
the various SCSI disks). There was also considerable NFS traffic
from/to 3 clients.

I am not subscribed to the kernel mailing list, so please
cc [EMAIL PROTECTED] on questions or replies.

Under this load I got a kernel oops message:

--- begin oops ---
ksymoops 2.4.1 on i686 2.4.5.  Options used
Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) not 
found in System.map.  Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 001a
c014e1e2
*pde = 
Oops: 0002
CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 000a   ebx: f293dba0   ecx: c02376bc   edx: e8cca000
esi: c02377c0   edi: 000b   ebp: 0004   esp: eb2d1e8c
ds: 0018   es: 0018   ss: 0018
Process top (pid: 4066, stackpage=eb2d1000)
Stack: c014aaaf f293dba0 e8cca000 c299c000 c014f409 f293dba0 c0237b00 f3016444 
   f293d7e0 c014f660 c299c000 e8cca000 000b fff4 eb2d f30163e0 
   f293d7e0 c02007aa e8cca000 ffea c013f983 f293d7e0 f30163e0  
Call Trace: [] [] [] [] [] 
[] [] 
   [] [] [] 
Code: f0 ff 48 10 8b 42 24 80 48 14 08 52 e8 ed fe ff ff 83 c4 04 

>>EIP; c014e1e2<=
Trace; c014aaaf 
Trace; c014f409 
Trace; c014f660 
Trace; c013f983 
Trace; c014019b 
Trace; c0140abb 
Trace; c0133d83 
Trace; c0134090 
Trace; c0106efb 
Trace; c010002b 
Code;  c014e1e2 
 <_EIP>:
Code;  c014e1e2   <=
   0:   f0 ff 48 10       lock decl 0x10(%eax)   <=
Code;  c014e1e6 
   4:   8b 42 24          mov    0x24(%edx),%eax
Code;  c014e1e9 
   7:   80 48 14 08       orb    $0x8,0x14(%eax)
Code;  c014e1ed 
   b:   52                push   %edx
Code;  c014e1ee 
   c:   e8 ed fe ff ff    call   fefe <_EIP+0xfefe> c014e0e0 
Code;  c014e1f3 
  11:   83 c4 04          add    $0x4,%esp

--- end oops ---
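
As a cross-check of the trace above, the faulting instruction and the
register dump should agree with the reported fault address. The Python
sketch below computes the address of a simple disp(%reg) operand; the
helper name is made up for illustration, and the register value assumes
that the "000a" shown for eax means 0xa. With lock decl 0x10(%eax) this
gives 0x1a, matching "virtual address 001a".

```python
import re


def effective_address(operand: str, registers: dict) -> int:
    """Compute the address of a simple AT&T-syntax operand: disp(%base)."""
    match = re.fullmatch(r"(-?0x[0-9a-fA-F]+)?\(%(\w+)\)", operand)
    if match is None:
        raise ValueError(f"unsupported operand: {operand!r}")
    displacement = int(match.group(1), 16) if match.group(1) else 0
    # Wrap to 32 bits, as address arithmetic on i386 does.
    return (registers[match.group(2)] + displacement) & 0xFFFFFFFF


# eax as displayed in the register dump (assumed to mean 0xa).
print(hex(effective_address("0x10(%eax)", {"eax": 0x0A})))
```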

/proc/modules has:
nfsd                   71152  20 (autoclean)
lockd                  50048   1 (autoclean) [nfsd]
sunrpc                 64272   1 (autoclean) [nfsd lockd]
autofs                 11168   1 (autoclean)
st                     28016   0 (unused)
aic7xxx               110256   4

Here is my kernel config file:
--- begin .config ---

#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_TOSHIBA is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_HAVE_DEC_LOCK=y

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
# CONFIG_PM is not set
# CONFIG_ACPI is not set
# CONFIG_APM is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play configuration
#
# CONFIG_PNP is not set
# CONFIG_ISAPNP is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONF

Oops with kernel 2.4.5 on heavy disk traffic - reproduce

2001-06-10 Thread Alois Treindl

I reported a kernel oops earlier.

I have now observed the same oops, with the same stack trace,
on a Dell PowerEdge 1550 with dual CPUs, 1 GB RAM, only
one disk, and little disk usage (most file activity is via
NFS, where this system is a client).

The kernel is identical to the one reported before,
and the stack trace is identical too.

The oops occurred in the process 'top'.

I have not yet found a way to reproduce the crash systematically.
I am now running the kernel on 4 different machines
under heavy load and will try to reproduce it.



[PATCH] Re: Oops with kernel 2.4.5 on heavy disk traffic

2001-06-10 Thread Alexander Viro

Please apply. What's happening here is simple - we set i_ino by
PID and get something out of range of per-process inodes. Confusion
follows... Fix: move initializing ->u.proc_i.task past the check.
Then proc_delete_inode() will be happy with it.
Alois, Bryce - that ought to fix the oopsen you see.

--- linux/fs/proc/base.c.old	Sun Jun 10 11:15:55 2001
+++ linux/fs/proc/base.c	Sun Jun 10 11:21:51 2001
@@ -635,15 +635,14 @@
 	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
 	inode->i_ino = fake_ino(task->pid, ino);
 
-	inode->u.proc_i.file = NULL;
+	if (!task->pid)
+		goto out_unlock;
+
 	/*
 	 * grab the reference to task.
 	 */
-	inode->u.proc_i.task = task;
 	get_task_struct(task);
-	if (!task->pid)
-		goto out_unlock;
-
+	inode->u.proc_i.task = task;
 	inode->i_uid = 0;
 	inode->i_gid = 0;
 	if (ino == PROC_PID_INO || task->dumpable) {
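
The ordering issue described above can be illustrated with a user-space
toy model. The Python sketch below is not kernel code: the class names,
the slot field standing in for the inode's union, the (pid << 16) | ino
layout of fake_ino(), and the behaviour of the cleanup routine are all
assumptions made for illustration. It only shows the general shape of
the problem: if the task pointer is stored before the pid check, an
inode whose number falls outside the per-process range is torn down
while still holding a pointer that the cleanup path does not expect.

```python
PID_SHIFT = 16  # assumed: fake_ino() keeps the pid in the upper bits


class Task:
    def __init__(self, pid):
        self.pid = pid  # 0 models a task that has already exited


class Inode:
    def __init__(self, ino):
        self.ino = ino
        self.slot = None  # stands in for the union (task or directory entry)


def fake_ino(pid, ino):
    return (pid << PID_SHIFT) | ino


def delete_inode(inode):
    """Toy cleanup: only inodes with pid bits may hold a Task in the slot."""
    per_process = (inode.ino >> PID_SHIFT) != 0
    if not per_process and isinstance(inode.slot, Task):
        # In the kernel this kind of confusion would show up as an oops.
        raise RuntimeError("task pointer misread as a directory entry")


def make_inode(task, ino, patched):
    inode = Inode(fake_ino(task.pid, ino))
    if not patched:
        inode.slot = task  # old order: stored before the pid check
    if task.pid == 0:
        delete_inode(inode)  # error path drops the half-built inode
        return None
    if patched:
        inode.slot = task  # new order: stored only after the check passed
    return inode


def lookup_crashes(patched):
    """Return True if looking up an exited task trips the cleanup path."""
    try:
        make_inode(Task(pid=0), ino=2, patched=patched)
    except RuntimeError:
        return True
    return False


print(lookup_crashes(patched=False), lookup_crashes(patched=True))
```

In this model only the unpatched ordering fails, and a live task (pid
non-zero) gets a normal inode either way.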





Re: [PATCH] Re: Oops with kernel 2.4.5 on heavy disk traffic

2001-06-10 Thread Alois Treindl

On Sun, 10 Jun 2001, Alexander Viro wrote:

>   Please apply. What's happening here is simple - we set i_ino by
> PID and get something out of range of per-process inodes. Confusion
> follows... Fix: move initializing ->u.proc_i.task past the check.
> Then proc_delete_inode() will be happy with it.
>   Alois, Bryce - that ought to fix the oopsen you see.

Alexander,

do I read this right: this is not a very critical bug?
In my case, it was 'top' which crashed twice (after the original two
cases, I was unable to reproduce this despite trying hard for the last
4 hours).

Are any processes other than those which, like top or ps, read
the /proc file system likely to be affected by the bug?

I am a bit worried about applying 'unauthorized' kernel patches to my
server. This has created problems for me in the past.

So, if the bug is non-critical, I would rather accept the occasional
crash of 'top' or 'ps' than play around with kernel code.

Please comment.
 
Alois
