oops with kernel 2.4.5
Hi,

We found an oops in our logs. Here are the results from ksymoops (2.4.1):

Unable to handle kernel NULL pointer dereference at virtual address 0004
c012db89
*pde =
Oops: 0002
CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: ebx: c00725c0 ecx: c00725c0 edx: 0004
esi: c00725c0 edi: c00725c0 ebp: esp: c10a9f70
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 3, stackpage=c10a9000)
Stack: c0130092 c00725c0 c1076e94 c1076e78 c025eb90 0027 c00725c0 0003
       c0126cb1 c1076e78 0004 0008e000
       0004 003c c0127551 0004 c10a8000
Call Trace: [] [] [] [] [] []
Code: 89 02 c7 41 30 00 00 00 00 31 c0 66 8b 41 0a 50 51 e8 0d ff

>>EIP; c012db89 <__remove_from_queues+19/34> <=
Trace; c0130092
Trace; c0126cb1
Trace; c0127551
Trace; c01275df
Trace; c0105000
Code; c012db89 <__remove_from_queues+19/34>
<_EIP>:
Code; c012db89 <__remove_from_queues+19/34> <=
   0: 89 02                   movl %eax,(%edx) <=
Code; c012db8b <__remove_from_queues+1b/34>
   2: c7 41 30 00 00 00 00    movl $0x0,0x30(%ecx)
Code; c012db92 <__remove_from_queues+22/34>
   9: 31 c0                   xorl %eax,%eax
Code; c012db94 <__remove_from_queues+24/34>
   b: 66 8b 41 0a             movw 0xa(%ecx),%ax
Code; c012db98 <__remove_from_queues+28/34>
   f: 50                      pushl %eax
Code; c012db99 <__remove_from_queues+29/34>
  10: 51                      pushl %ecx
Code; c012db9a <__remove_from_queues+2a/34>
  11: e8 0d ff 00 00          call ff23 <_EIP+0xff23> c013daac

Well?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
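A note on reading the dump above: the faulting instruction is a store through %edx (movl %eax,(%edx)), and the register dump shows edx = 0004, which is exactly the reported faulting address. One plausible reading, which nothing in this thread confirms, is a pointer-to-pointer list unlink that ran with a corrupted back-pointer. The sketch below shows that kind of unlink in isolation. The struct and function names (node, chain_insert, chain_unlink) are hypothetical and are not the kernel's buffer_head code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node; field names are illustrative only. */
struct node {
    struct node *next;
    struct node **pprev;   /* address of the pointer that points at this node */
};

/* Push n onto the front of the chain rooted at *head. */
static void chain_insert(struct node **head, struct node *n)
{
    assert(head && n);
    n->next = *head;
    if (*head)
        (*head)->pprev = &n->next;
    *head = n;
    n->pprev = head;
}

/* Remove n from whatever chain it is on; a no-op if it is not linked. */
static void chain_unlink(struct node *n)
{
    assert(n);
    if (n->pprev) {
        if (n->next)
            n->next->pprev = n->pprev;
        /* Store through the back-pointer. This is the kind of statement
         * that compiles to "movl %eax,(%edx)". */
        *n->pprev = n->next;
        /* Mark the node as unlinked, comparable to the following
         * "movl $0x0,0x30(%ecx)" in the dump. */
        n->pprev = NULL;
    }
}
```

The NULL test in chain_unlink() only protects against a back-pointer that is exactly NULL. A small non-NULL garbage value such as 0x4 passes the test and faults on the store, which would match the address in the oops.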
Re: oops with kernel 2.4.5
Well, my guess is that the compiler is miscompiling your kernel. Still,
contrary to what the REPORTING-BUGS file asks for, you did not give any
information about your system. Some useful things to send (adjust to
your setup):

a) Rebuild fs/buffer.o and send the output of make:

	cd /usr/src/linux
	rm fs/buffer.o
	make fs/buffer.o

b) Find out which gcc was used (gcc, kgcc, etc.) and send its version:

	gcc -v

c) Start gdb on the kernel image:

	gdb vmlinux

   and run this command at the gdb prompt:

	disassemble __remove_from_queues

Send the output of all three of the above; then people on LKML might be
able to help you better.

On Fri, 8 Jun 2001, szonyi calin wrote:

> Hi
> we found in logs a oops and here are the results from
> ksymoops (2.4.1)
>
> Unable to handle kernel NULL pointer dereference at
> virtual address 0004
> c012db89
[...]
> Process kswapd (pid: 3, stackpage=c10a9000)
[...]
> >>EIP; c012db89 <__remove_from_queues+19/34> <=
[...]
> Well ?

--
Adam
http://www.eax.com	The Supreme Headquarters of the 32 bit registers
Oops with kernel 2.4.5 on heavy disk traffic
I run kernel 2.4.5 on a Dell PowerEdge 2450 with 1.5 GB RAM, an onboard
Adaptec disk controller, dual Pentium III 933 MHz CPUs, and 3 disks
(160 MB/s transfer rate, 36 GB each).

Today I put the system under heavy load (load level 15, about 20 httpd
processes, and three concurrent copies of large file trees between the
various SCSI disks) while considerable NFS traffic from/to 3 clients was
going on, and I got a kernel oops message.

I am not subscribed to the kernel mailing list, so please cc
[EMAIL PROTECTED] with questions or replies.

--- begin oops ---
ksymoops 2.4.1 on i686 2.4.5. Options used

Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) not found in System.map. Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 001a
c014e1e2
*pde =
Oops: 0002
CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 000a ebx: f293dba0 ecx: c02376bc edx: e8cca000
esi: c02377c0 edi: 000b ebp: 0004 esp: eb2d1e8c
ds: 0018 es: 0018 ss: 0018
Process top (pid: 4066, stackpage=eb2d1000)
Stack: c014aaaf f293dba0 e8cca000 c299c000 c014f409 f293dba0 c0237b00 f3016444
       f293d7e0 c014f660 c299c000 e8cca000 000b fff4 eb2d f30163e0
       f293d7e0 c02007aa e8cca000 ffea c013f983 f293d7e0 f30163e0
Call Trace: [] [] [] [] [] [] [] [] [] []
Code: f0 ff 48 10 8b 42 24 80 48 14 08 52 e8 ed fe ff ff 83 c4 04

>>EIP; c014e1e2 <=
Trace; c014aaaf
Trace; c014f409
Trace; c014f660
Trace; c013f983
Trace; c014019b
Trace; c0140abb
Trace; c0133d83
Trace; c0134090
Trace; c0106efb
Trace; c010002b
Code; c014e1e2
<_EIP>:
Code; c014e1e2 <=
   0: f0 ff 48 10       lock decl 0x10(%eax) <=
Code; c014e1e6
   4: 8b 42 24          mov 0x24(%edx),%eax
Code; c014e1e9
   7: 80 48 14 08       orb $0x8,0x14(%eax)
Code; c014e1ed
   b: 52                push %edx
Code; c014e1ee
   c: e8 ed fe ff ff    call fefe <_EIP+0xfefe> c014e0e0
Code; c014e1f3
  11: 83 c4 04          add $0x4,%esp
--- end oops ---

/proc/modules has:

nfsd      71152  20 (autoclean)
lockd     50048   1 (autoclean) [nfsd]
sunrpc    64272   1 (autoclean) [nfsd lockd]
autofs    11168   1 (autoclean)
st        28016   0 (unused)
aic7xxx  110256   4

Here is my kernel config file:

--- begin .config ---
#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_TOSHIBA is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_SMP=y
CONFIG_HAVE_DEC_LOCK=y

#
# General setup
#
CONFIG_NET=y
# CONFIG_VISWS is not set
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_HOTPLUG is not set
# CONFIG_PCMCIA is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
# CONFIG_KCORE_AOUT is not set
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m
# CONFIG_PM is not set
# CONFIG_ACPI is not set
# CONFIG_APM is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play configuration
#
# CONFIG_PNP is not set
# CONFIG_ISAPNP is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONF
Oops with kernel 2.4.5 on heavy disk traffic - reproduce
I reported a kernel oops earlier. I have now observed the same oops, with
the same stack trace, on a Dell PowerEdge 1550 with dual CPUs, 1 GB RAM,
only one disk, and little disk usage (most file activity goes via NFS,
where this system is a client).

The kernel is identical to the one reported before, and the stack trace
is identical too. The oops occurred in the process 'top'.

I have not yet found a way to reproduce the crash systematically. I am
now running the kernel on 4 different machines under heavy load and will
try to reproduce it.
[PATCH] Re: Oops with kernel 2.4.5 on heavy disk traffic
	Please, apply. What's happening here is simple - we set i_ino by
PID and get something out of the range of per-process inodes. Confusion
follows... Fix: move initializing ->u.proc_i.task past the check.
Then proc_delete_inode() will be happy with it.

	Alois, Bryce - that ought to fix the oopsen you see.

--- linux/fs/proc/base.c.old	Sun Jun 10 11:15:55 2001
+++ linux/fs/proc/base.c	Sun Jun 10 11:21:51 2001
@@ -635,15 +635,14 @@
 	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
 	inode->i_ino = fake_ino(task->pid, ino);
 
-	inode->u.proc_i.file = NULL;
+	if (!task->pid)
+		goto out_unlock;
+
 	/*
 	 * grab the reference to task.
 	 */
-	inode->u.proc_i.task = task;
 	get_task_struct(task);
-	if (!task->pid)
-		goto out_unlock;
-
+	inode->u.proc_i.task = task;
 	inode->i_uid = 0;
 	inode->i_gid = 0;
 	if (ino == PROC_PID_INO || task->dumpable) {
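A minimal sketch of why the ordering in the patch matters; it is a user-space model, not the kernel code. It assumes that fake_ino() packs the PID into the bits above the low 16 and that the delete path decides "per-process or not" by testing those upper bits. Neither definition appears in this thread, so both are assumptions, and every name below (sketch_fake_ino, inode_sketch, setup_old_order, setup_new_order, sketch_delete) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct task_sketch {
    int pid;
    int refs;
};

struct inode_sketch {
    unsigned long i_ino;
    void *u;   /* stands in for the kernel's inode->u union */
};

/* Assumed encoding: PID in the upper bits, entry number in the low 16. */
static unsigned long sketch_fake_ino(int pid, unsigned long ino)
{
    return ((unsigned long)pid << 16) | ino;
}

/* Assumed range test: per-process only if some upper bit is set. */
static int sketch_is_per_process(unsigned long i_ino)
{
    return (i_ino & ~0xffffUL) != 0;
}

enum delete_path {
    DELETE_PER_PROCESS,      /* u is treated as a task reference */
    DELETE_GENERIC_EMPTY,    /* generic path, nothing stored in u */
    DELETE_GENERIC_MISREAD   /* generic path, but u holds a task pointer */
};

/* Report which cleanup path a delete routine would take for this inode. */
static enum delete_path sketch_delete(const struct inode_sketch *inode)
{
    assert(inode);
    if (sketch_is_per_process(inode->i_ino))
        return DELETE_PER_PROCESS;
    return inode->u ? DELETE_GENERIC_MISREAD : DELETE_GENERIC_EMPTY;
}

/* Order before the patch: store the task, take the reference, then check. */
static int setup_old_order(struct inode_sketch *inode,
                           struct task_sketch *t, unsigned long ino)
{
    assert(inode && t);
    inode->i_ino = sketch_fake_ino(t->pid, ino);
    inode->u = t;
    t->refs++;
    if (!t->pid)
        return -1;   /* bails out with u already set */
    return 0;
}

/* Order after the patch: check first, then take the reference and store. */
static int setup_new_order(struct inode_sketch *inode,
                           struct task_sketch *t, unsigned long ino)
{
    assert(inode && t);
    inode->i_ino = sketch_fake_ino(t->pid, ino);
    if (!t->pid)
        return -1;   /* bails out with u still empty */
    t->refs++;
    inode->u = t;
    return 0;
}
```

With a task whose pid is 0, setup_old_order() leaves an inode whose number looks generic while u points at a task, so sketch_delete() reports DELETE_GENERIC_MISREAD. setup_new_order() returns before touching u, and the same inode is reported as DELETE_GENERIC_EMPTY.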
Re: [PATCH] Re: Oops with kernel 2.4.5 on heavy disk traffic
On Sun, 10 Jun 2001, Alexander Viro wrote:

> Please, apply. What's happening here is simple - we set i_ino by
> PID and get something out of range of per-process inode. Confusion
> follows... Fix: move initializing ->u.proc_i.task past the check.
> Then proc_delete_inode() will be happy with it.
> Alois, Bryce - that ought to fix the oopsen you see.

Alexander, do I read this right that this is not a very critical bug? In
my case it was 'top' which crashed twice. (I was unable to reproduce this
while trying hard over the last 4 hours, after the original two cases.)
Are processes that do not read the /proc file system (unlike top or ps)
likely to be affected by the bug?

I am a bit worried about applying 'unauthorized' kernel patches to my
server. This has created problems for me in the past. So, if the bug is
not critical, I would rather accept the occasional crash of 'top' or 'ps'
than play around with kernel code.

Please comment.

Alois