Repeated similar panics on -STABLE
Hello!

We have three machines under relatively high load. They are running -STABLE on the same hardware with 2 processors (and SMP kernel). Periodically (approximately once a week) they panic with similar symptoms:

# gdb -k kernel.debug vmcore.2
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as i386-unknown-freebsd...
Deprecated bfd_read called at /mnt/se3/releng_4/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /mnt/se3/releng_4/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
SMP 2 cpus
IdlePTD at phsyical address 0x0034f000
initial pcb at physical address 0x002bd6a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0102; cpuid = 1; lapic.id =
fault virtual address = 0x5cdd8000
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc015daff
stack pointer = 0x10:0xeb278e44
frame pointer = 0x10:0xeb278e68
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65648 (cronolog)
interrupt mask = net tty bio cam - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0102; cpuid = 1; lapic.id =
boot() called on cpu#1

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 0103; cpuid = 1; lapic.id =
fault virtual address = 0x5cdd8000
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc015daff
stack pointer = 0x10:0xeb278b68
frame pointer = 0x10:0xeb278b8c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65648 (cronolog)
interrupt mask = net tty bio cam - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0103; cpuid = 1; lapic.id =
boot() called on cpu#1
Uptime: 5d0h48m54s

dumping to dev #da/0x20001, offset 2097280
dump 1023 1022 1021 1020 1019 1018 1017 1016 1015 1014 1013 1012 1011 1010 1009 <snip>
---
#0 dumpsys () at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) bt
#0 dumpsys () at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:487
#1 0xc01620c6 in boot (howto=260)
    at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:316
#2 0xc0162549 in panic (fmt=0xc028e3b9 "%s")
    at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:595
#3 0xc0251b1a in trap_fatal (frame=0xeb278b28, eva=1558020096)
    at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:974
#4 0xc0251775 in trap_pfault (frame=0xeb278b28, usermode=0, eva=1558020096)
    at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:867
#5 0xc02512b7 in trap (frame={tf_fs = -65512, tf_es = -941031408, tf_ds = -942997488, tf_edi = -1070937504, tf_esi = -730301488, tf_ebp = -349729908, tf_isp = -349729964, tf_ebx = -1070870564, tf_edx = 1558020096, tf_ecx = 7, tf_eax = 128, tf_trapno = 12, tf_err = 0, tf_eip = -1072309505, tf_cs = 8, tf_eflags = 66054, tf_esp = 33281, tf_ss = -730301488})
    at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:466
#6 0xc015daff in malloc (size=128, type=0xc02aca60, flags=2)
    at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243
#7 0xc02085e1 in initiate_write_inodeblock (inodedep=0xc8e69400, bp=0xd4787bd0)
    at /mnt/se3/releng_4/src/sys/ufs/ffs/ffs_softdep.c:3091
#8 0xc02083b3 in softdep_disk_io_initiation (bp=0xd4787bd0)
    at /mnt/se3/releng_4/src/sys/ufs/ffs/ffs_softdep.c:2965
#9 0xc019d51a in spec_strategy (ap=0xeb278c0c)
    at /mnt/se3/releng_4/src/sys/miscfs/specfs/spec_vnops.c:453
#10 0xc0188cab in bwrite (bp=0xd4787bd0) at vnode_if.h:944
#11 0xc018e98f in vop_stdbwrite (ap=0xeb278c6c)
    at /mnt/se3/releng_4/src/sys/kern/vfs_default.c:344
#12 0xc018e791 in vop_defaultop (ap=0xeb278c6c)
    at /mnt/se3/releng_4/src/sys/kern/vfs_default.c:152
#13 0xc0189ce5 in vfs_bio_awrite (bp=0xd4787bd0) at vnode_if.h:1193
#14 0xc019d33f in spec_fsync (ap=0xeb278cd4)
    at /mnt/se3/releng_4/src/sys/miscfs/specfs/spec_vnops.c:391
#15 0xc020ca4d in ffs_sync (mp=0xc7ea1a00, waitfor=2, cred=0xc1c6e900, p=0xc02d25e0)
    at vnode_if.h:558
#16 0xc01941b7 in sync (p=0xc02d25e0, uap=0x0)
    at /mnt/se3/releng_4/src/sys/kern/vfs_syscalls.c:576
#17 0xc0161e7c in boot (howto=256)
    at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:235
#18 0xc0162549 in
Re: Repeated similar panics on -STABLE
Dmitry Sivachenko wrote:
> We have three machines under relatively high load. They are running -STABLE on the same hardware with 2 processors (and SMP kernel). Periodically (approximately once a week) they panic with similar symptoms:

[ ... ]

Panic.

> #18 0xc0162549 in panic (fmt=0xc028e3b9 "%s")
>     at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:595
> #19 0xc0251b1a in trap_fatal (frame=0xeb278e04, eva=1558020096)
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:974
> #20 0xc0251775 in trap_pfault (frame=0xeb278e04, usermode=0, eva=1558020096)
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:867
> #21 0xc02512b7 in trap (frame={tf_fs = -107238, tf_es = -361627632, tf_ds = 16, tf_edi = -1070989600, tf_esi = -349729108, tf_ebp = -349729176, tf_isp = -349729232, tf_ebx = -1070870564, tf_edx = 1558020096, tf_ecx = 7, tf_eax = 128, tf_trapno = 12, tf_err = 0, tf_eip = -1072309505, tf_cs = 8, tf_eflags = 66054, tf_esp = 0, tf_ss = -349729108})
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:466

Page not present error.

> #22 0xc015daff in malloc (size=72, type=0xc029fee0, flags=0)
>     at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243

The source code did not check for allocation failure here; probably the kbp list had just been refilled, and while you were calling the failing malloc, the list was emptied again.

What this generally means is that KVA was exhausted, and the caller did not expect that.

To work around: don't exhaust the KVA space; probably you have tuned some kernel parameter way too high.

To fix: at line 243, you need to check whether va is NULL; if it is, you need to check for M_WAITOK, and if it is set, restart the allocation. This has to be done before the next line, where va is dereferenced.

Maybe something like:

Change:

    va = kbp->kb_next;
    kbp->kb_next = ((struct freelist *)va)->next;

To:

    va = kbp->kb_next;
    if (va == NULL) {
        if (flags & M_NOWAIT) {
            splx(s);
            return ((void *) NULL);
        }
        goto restart; /* put this label above the while */
    }
    kbp->kb_next = ((struct freelist *)va)->next;

Working around the problem is easier (IMO): just change your tuning parameters to avoid running out of KVA. Probably your mbufs or mbuf clusters are way too large for your amount of physical RAM; remember that, except in very special circumstances, kernel memory is non-pageable.

> #23 0xc015a3fe in exit1 (p=0xea726820, rv=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_exit.c:166

It was trying to allocate a zombie structure.

> #24 0xc0164011 in sigexit (p=0xea726820, sig=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_sig.c:1503

For a process someone sent a SIGTERM to, to kill it.

> #25 0xc0163d9c in postsig (sig=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_sig.c:1406
> #26 0xc0251fc5 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 174, tf_esi = 1049187701, tf_ebp = -1077936960, tf_isp = -349728812, tf_ebx = 1, tf_edx = 3, tf_ecx = -1078002496, tf_eax = 3, tf_trapno = 7, tf_err = 2, tf_eip = 672039098, tf_cs = 31, tf_eflags = 659, tf_esp = -1078069180, tf_ss = 47})
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:174

Looks like you caused a floating point exception, and died when the exit1 failed to create a zombie structure for the process.

-- Terry
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
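The check-before-dereference idea in the suggested patch can be tried outside the kernel. What follows is only a userland sketch under stated assumptions, not the kern_malloc.c code: struct bucket, SK_NOWAIT, bucket_alloc(), demo_refill() and the bounded retry count are all hypothetical names invented for illustration (the real code would sleep or jump back to a restart label rather than count retries).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-ins for the kernel's freelist and bucket. */
struct freelist {
    struct freelist *next;
};

struct bucket {
    struct freelist *kb_next;   /* head of the free list */
    int refills;                /* refill attempts, kept for illustration */
};

#define SK_NOWAIT 0x1           /* stand-in for M_NOWAIT; value is an assumption */

/* Hypothetical refill hook: returns nonzero if it put a block on the list. */
typedef int (*refill_fn)(struct bucket *);

/*
 * Pop one block from the bucket. The head pointer is tested before it is
 * dereferenced. On an empty list, SK_NOWAIT callers get NULL; other
 * callers trigger a refill and retry, at most max_retries times.
 */
static void *
bucket_alloc(struct bucket *kbp, int flags, refill_fn refill, int max_retries)
{
    struct freelist *va;
    int tries = 0;

    assert(kbp != NULL);
    for (;;) {
        va = kbp->kb_next;
        if (va != NULL) {
            kbp->kb_next = va->next;    /* safe: va was checked above */
            return va;
        }
        if ((flags & SK_NOWAIT) || refill == NULL || tries++ >= max_retries)
            return NULL;
        kbp->refills++;
        if (!refill(kbp))
            return NULL;
    }
}

/* Demo refill: hands out a single static block, once. */
static struct freelist demo_block;
static int demo_used;

static int
demo_refill(struct bucket *kbp)
{
    if (demo_used)
        return 0;
    demo_used = 1;
    demo_block.next = NULL;
    kbp->kb_next = &demo_block;
    return 1;
}
```

Calling bucket_alloc(&b, SK_NOWAIT, demo_refill, 3) on an empty bucket returns NULL without touching the list, while the same call with flags 0 refills once and returns the demo block. Note that a test for NULL only covers the empty-list case; it says nothing about a list head that holds some other invalid value.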
Re: Repeated similar panics on -STABLE
On Wed, Apr 02, 2003 at 05:44:28PM +0400, Dmitry Sivachenko wrote:
> Hello!
>
> <snip>
>
> Fatal trap 12: page fault while in kernel mode
> mp_lock = 0102; cpuid = 1; lapic.id =
> fault virtual address = 0x5cdd8000
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xc015daff

BTW,

(kgdb) list *0xc015daff
0xc015daff is in malloc (/mnt/se3/releng_4/src/sys/kern/kern_malloc.c:244).
239 freep->next = savedlist;
240 if (kbp->kb_last == NULL)
241 kbp->kb_last = (caddr_t)freep;
242 }
243 va = kbp->kb_next;
244 kbp->kb_next = ((struct freelist *)va)->next;
245 #ifdef INVARIANTS
246 freep = (struct freelist *)va;
247 savedtype = (const char *) freep->type->ks_shortdesc;
248 #if BYTE_ORDER == BIG_ENDIAN
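gdb prints the trap-frame fields and eva in decimal, some of them as signed 32-bit values, which makes them hard to match against the hex addresses in the panic message. A small sketch of the conversion follows; as_u32(), is_kernel_addr() and ASSUMED_KERNBASE are hypothetical helpers, and the 0xc0000000 kernel base is an assumption taken from the fact that every kernel address in the backtrace is at or above it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of kernel addresses on this i386 system (an assumption). */
#define ASSUMED_KERNBASE 0xc0000000u

/*
 * Convert a value as gdb prints it (decimal, possibly negative because the
 * field is a signed 32-bit int) to the unsigned 32-bit address it encodes.
 */
static uint32_t
as_u32(int64_t printed)
{
    assert(printed >= INT32_MIN && printed <= (int64_t)UINT32_MAX);
    return (uint32_t)printed;   /* conversion is modulo 2^32 */
}

/* Nonzero if addr lies at or above the assumed kernel base. */
static int
is_kernel_addr(uint32_t addr)
{
    return addr >= ASSUMED_KERNBASE;
}
```

With the values from the trace, as_u32(1558020096) gives 0x5cdd8000 (the eva and tf_edx fields, matching the fault virtual address) and as_u32(-1072309505) gives 0xc015daff (tf_eip, matching the instruction pointer). The decoded fault address is nonzero and lies below the assumed kernel base; what that implies for the diagnosis is left to the thread.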
ports and /var/db/pkg
hi all,

is there some 'easy' way to resync /var/db/pkg from /usr/local (after some rm's on it)? i guess i could write a script that would try to match the info in /var/db/pkg against what is installed, and if a file is not where it's supposed to be, would remove the info, but if there is a command ...

thanks,
danny
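The script described above could start from the packing list that each package keeps in /var/db/pkg/<name>/+CONTENTS. Below is a minimal sketch of the matching step only, assuming a simplified reading of that format: "@cwd DIR" sets the current prefix, every other "@" directive is skipped, and any remaining non-empty line is a file name relative to the prefix. plist_line_to_path() and count_missing() are hypothetical names, and the sketch only reports missing files; it does not remove anything from the database.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Process one packing-list line.
 *  - "@cwd DIR" updates cwd and yields no path.
 *  - Other "@..." directives and empty lines yield no path.
 *  - Anything else is taken as a file name relative to cwd.
 * Returns 1 if out now holds a path to check, 0 otherwise.
 */
static int
plist_line_to_path(const char *line, char *cwd, size_t cwdsz,
    char *out, size_t outsz)
{
    size_t len;

    assert(line != NULL && cwd != NULL && out != NULL);
    len = strcspn(line, "\r\n");        /* length without the line ending */
    if (len == 0)
        return 0;
    if (line[0] == '@') {
        if (len > 5 && strncmp(line, "@cwd ", 5) == 0)
            snprintf(cwd, cwdsz, "%.*s", (int)(len - 5), line + 5);
        return 0;
    }
    snprintf(out, outsz, "%s/%.*s", cwd, (int)len, line);
    return 1;
}

/*
 * Count files named in a +CONTENTS file that no longer exist on disk.
 * Returns -1 if the packing list itself cannot be opened.
 */
static int
count_missing(const char *contents_path)
{
    char line[1024], cwd[1024] = "", path[2048];
    int missing = 0;
    FILE *fp = fopen(contents_path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (plist_line_to_path(line, cwd, sizeof cwd, path, sizeof path) &&
            access(path, F_OK) != 0)
            missing++;
    }
    fclose(fp);
    return missing;
}
```

A caller would loop over the directories in /var/db/pkg and treat any package for which count_missing() returns a positive number as a candidate for cleanup or reinstallation.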
Re: Repeated similar panics on -STABLE
Dmitry Sivachenko wrote:
> On Wed, Apr 02, 2003 at 05:44:28PM +0400, Dmitry Sivachenko wrote:
> > Hello!
> >
> > <snip>
> >
> > Fatal trap 12: page fault while in kernel mode
> > mp_lock = 0102; cpuid = 1; lapic.id =
> > fault virtual address = 0x5cdd8000
> > fault code = supervisor read, page not present
> > instruction pointer = 0x8:0xc015daff
>
> BTW,
>
> (kgdb) list *0xc015daff
> 0xc015daff is in malloc (/mnt/se3/releng_4/src/sys/kern/kern_malloc.c:244).
> 243 va = kbp->kb_next;
> 244 kbp->kb_next = ((struct freelist *)va)->next;

Yes, I know. See analysis and patch and workaround.

-- Terry
le0 - DE203 kernel config problem
I tried to get a DE203 NIC (ISA) working with 5.0R.

Took the GENERIC config file and put

device le 1
options COMPAT_OLDISA

in it. Since I forgot how the card was programmed I tried and got it probed at io=0x200, so I put the following in /boot/device.hints:

hint.le.0.at=isa
hint.le.0.disabled=0
hint.le.0.port=0x200
hint.le.0.irq=10
hint.le.0.maddr=0xd

and first got an error during probe, something like

le0: lemac expected IRQ 0x400 found 0x20

Then I changed irq to 5 and got

le0: lemac expected iomem at 0xd found 0x8

So I changed maddr to 0x8. But then I got a kernel panic.

Any clues how to proceed to get this card working?

--
Chris Christoph P. U. Kukulies [EMAIL PROTECTED]
Re: le0 - DE203 kernel config problem
On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> I tried to get a DE203 NIC (ISA) working with 5.0R.
...
> Since I forgot how the card was programmed I tried and got it probed at io=0x200
...
> Any clues how to proceed to get this card working?

Find out what the configuration settings on the card are and use them. Trial and error is likely to produce the results you mentioned.

--
| Matthew N. Dodd | '78 Datsun 280Z | '75 Volvo 164E | FreeBSD/NetBSD |
| [EMAIL PROTECTED] | 2 x '84 Volvo 245DL | ix86,sparc,pmax |
| http://www.jurai.net/~winter | For Great Justice! | ISO8802.5 4ever |
Re: le0 - DE203 kernel config problem
On Wed, Apr 02, 2003 at 01:27:43PM -0500, Matthew N. Dodd wrote:
> On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> > I tried to get a DE203 NIC (ISA) working with 5.0R.
> ...
> > Since I forgot how the card was programmed I tried and got it probed at io=0x200
> ...
> > Any clues how to proceed to get this card working?
>
> Find out what the configuration settings on the card are and use them. Trial and error is likely to produce the results you mentioned.

Problem is that I probably don't have a configuration disk anymore.

--
Chris Christoph P. U. Kukulies [EMAIL PROTECTED]
Re: le0 - DE203 kernel config problem
Christoph Kukulies wrote:
> On Wed, Apr 02, 2003 at 01:27:43PM -0500, Matthew N. Dodd wrote:
> > On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> > > I tried to get a DE203 NIC (ISA) working with 5.0R.
> > ...
> > > Since I forgot how the card was programmed I tried and got it probed at io=0x200
> > ...
> > > Any clues how to proceed to get this card working?
> >
> > Find out what the configuration settings on the card are and use them. Trial and error is likely to produce the results you mentioned.
>
> Problem is that I probably don't have a configuration disk anymore.

What would you give for one? :-)

Robert Swindells
Re: ports and /var/db/pkg
On Wed, Apr 02, 2003 at 06:18:24PM +0300, Danny Braniss wrote:
> hi all,
> is there some 'easy' way to resync /var/db/pkg from /usr/local (after some rm's on it?), i guess i could write a script to would try and match the info in /var/db/pkg, and if it's not where it's supposed to be would remove the info, but if there is a command ...

Not really... you can go the other way though; see the example in pkg_which(1).

Kris
Re: ports and /var/db/pkg
Danny,

If you built your packages from ports, you could always reinstall them. You just have to check for

/usr/ports/group/port/work/.install_done.*

It's not perfect, but you could use it to generate a quick list of ports to selectively reinstall.

Good luck,
-brian

--- Kris Kennaway [EMAIL PROTECTED] wrote:
> On Wed, Apr 02, 2003 at 06:18:24PM +0300, Danny Braniss wrote:
> > hi all,
> > is there some 'easy' way to resync /var/db/pkg from /usr/local (after some rm's on it?), i guess i could write a script to would try and match the info in /var/db/pkg, and if it's not where it's supposed to be would remove the info, but if there is a command ...
>
> Not really..you can go the other way though; see the example in pkg_which(1).
>
> Kris
Re: ports and /var/db/pkg
On Wed, Apr 02, 2003 at 02:58:43PM -0800, Brian O'Shea wrote:
> Danny,
>
> If you built your packages from ports, you could always reinstall them. You just have to check for
>
> /usr/ports/group/port/work/.install_done.*

Only if you've never run 'make clean' (unlikely, if he's following directions).

Kris