date:20030402

Repeated similar panics on -STABLE

2003-04-02 Thread Dmitry Sivachenko

Hello!

We have three machines under relatively high load.  They are running -STABLE
on the same hardware with 2 processors (and SMP kernel).
Periodically (approximately once a week) they panic with similar symptoms:

# gdb -k kernel.debug vmcore.2
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-unknown-freebsd...Deprecated bfd_read called at /mnt/se3/releng_4/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /mnt/se3/releng_4/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
SMP 2 cpus
IdlePTD at phsyical address 0x0034f000
initial pcb at physical address 0x002bd6a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0102; cpuid = 1; lapic.id = 
fault virtual address   = 0x5cdd8000
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc015daff
stack pointer   = 0x10:0xeb278e44
frame pointer   = 0x10:0xeb278e68
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65648 (cronolog)
interrupt mask  = net tty bio cam  - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0102; cpuid = 1; lapic.id = 
boot() called on cpu#1
syncing disks...
Fatal trap 12: page fault while in kernel mode
mp_lock = 0103; cpuid = 1; lapic.id = 
fault virtual address   = 0x5cdd8000
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc015daff
stack pointer   = 0x10:0xeb278b68
frame pointer   = 0x10:0xeb278b8c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65648 (cronolog)
interrupt mask  = net tty bio cam  - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0103; cpuid = 1; lapic.id = 
boot() called on cpu#1
Uptime: 5d0h48m54s
dumping to dev #da/0x20001, offset 2097280
dump 1023 1022 1021 1020 1019 1018 1017 1016 1015 1014 1013 1012 1011 1010 1009
snip
---
#0  dumpsys () at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) bt
#0  dumpsys () at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:487
#1  0xc01620c6 in boot (howto=260)
at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:316
#2  0xc0162549 in panic (fmt=0xc028e3b9 %s)
at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:595
#3  0xc0251b1a in trap_fatal (frame=0xeb278b28, eva=1558020096)
at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:974
#4  0xc0251775 in trap_pfault (frame=0xeb278b28, usermode=0, eva=1558020096)
at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:867
#5  0xc02512b7 in trap (frame={tf_fs = -65512, tf_es = -941031408,
  tf_ds = -942997488, tf_edi = -1070937504, tf_esi = -730301488,
  tf_ebp = -349729908, tf_isp = -349729964, tf_ebx = -1070870564,
  tf_edx = 1558020096, tf_ecx = 7, tf_eax = 128, tf_trapno = 12,
  tf_err = 0, tf_eip = -1072309505, tf_cs = 8, tf_eflags = 66054,
  tf_esp = 33281, tf_ss = -730301488})
at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:466
#6  0xc015daff in malloc (size=128, type=0xc02aca60, flags=2)
at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243
#7  0xc02085e1 in initiate_write_inodeblock (inodedep=0xc8e69400,
bp=0xd4787bd0) at /mnt/se3/releng_4/src/sys/ufs/ffs/ffs_softdep.c:3091
#8  0xc02083b3 in softdep_disk_io_initiation (bp=0xd4787bd0)
at /mnt/se3/releng_4/src/sys/ufs/ffs/ffs_softdep.c:2965
#9  0xc019d51a in spec_strategy (ap=0xeb278c0c)
at /mnt/se3/releng_4/src/sys/miscfs/specfs/spec_vnops.c:453
#10 0xc0188cab in bwrite (bp=0xd4787bd0) at vnode_if.h:944
#11 0xc018e98f in vop_stdbwrite (ap=0xeb278c6c)
at /mnt/se3/releng_4/src/sys/kern/vfs_default.c:344
#12 0xc018e791 in vop_defaultop (ap=0xeb278c6c)
at /mnt/se3/releng_4/src/sys/kern/vfs_default.c:152
#13 0xc0189ce5 in vfs_bio_awrite (bp=0xd4787bd0) at vnode_if.h:1193
#14 0xc019d33f in spec_fsync (ap=0xeb278cd4)
at /mnt/se3/releng_4/src/sys/miscfs/specfs/spec_vnops.c:391
#15 0xc020ca4d in ffs_sync (mp=0xc7ea1a00, waitfor=2, cred=0xc1c6e900,
p=0xc02d25e0) at vnode_if.h:558
#16 0xc01941b7 in sync (p=0xc02d25e0, uap=0x0)
at /mnt/se3/releng_4/src/sys/kern/vfs_syscalls.c:576
#17 0xc0161e7c in boot (howto=256)
at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:235
#18 0xc0162549 in

Re: Repeated similar panics on -STABLE

2003-04-02 Thread Terry Lambert

Dmitry Sivachenko wrote:
> We have three machines under relatively high load.  They are running -STABLE
> on the same hardware with 2 processors (and SMP kernel).
> Periodically (approximately once a week) they panic with similar symptoms:

[ ... ]

Panic.

> #18 0xc0162549 in panic (fmt=0xc028e3b9 %s)
>     at /mnt/se3/releng_4/src/sys/kern/kern_shutdown.c:595
> #19 0xc0251b1a in trap_fatal (frame=0xeb278e04, eva=1558020096)
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:974
> #20 0xc0251775 in trap_pfault (frame=0xeb278e04, usermode=0, eva=1558020096)
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:867
> #21 0xc02512b7 in trap (frame={tf_fs = -107238, tf_es = -361627632,
>       tf_ds = 16, tf_edi = -1070989600, tf_esi = -349729108,
>       tf_ebp = -349729176, tf_isp = -349729232, tf_ebx = -1070870564,
>       tf_edx = 1558020096, tf_ecx = 7, tf_eax = 128, tf_trapno = 12,
>       tf_err = 0, tf_eip = -1072309505, tf_cs = 8, tf_eflags = 66054,
>       tf_esp = 0, tf_ss = -349729108})
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:466

Page not present error.
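
The trap frame above prints its fields as signed decimals, which makes them hard to compare with the hex values in the panic message. As a small sketch (the helper names as_address and print_field are made up for illustration), reinterpreting the values as unsigned 32-bit numbers shows that tf_eip = -1072309505 is 0xc015daff, the reported instruction pointer, and that tf_edx and eva = 1558020096 are 0x5cdd8000, the reported fault virtual address:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Reinterpret a trap frame field, printed by gdb as a signed 32-bit
 * decimal, as the unsigned 32-bit address it represents on i386.
 */
static uint32_t
as_address(int32_t field)
{
	return (uint32_t)field;
}

/* Print one field in the hex form used by the panic message. */
static void
print_field(const char *name, int32_t field)
{
	printf("%s = 0x%08x\n", name, (unsigned)as_address(field));
}
```

Calling print_field("tf_eip", -1072309505) and print_field("tf_edx", 1558020096) prints the two values in the same form as the "instruction pointer" and "fault virtual address" lines.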


> #22 0xc015daff in malloc (size=72, type=0xc029fee0, flags=0)
>     at /mnt/se3/releng_4/src/sys/kern/kern_malloc.c:243

Malloc failure was not checked for return value by source code;
probably the kbp list was just refreshed, and while you were
calling the failing malloc, the list was reemptied.

What this generally means is that KVA was exhausted, and the
caller did not expect that.

To work around: don't exhaust the KVA space; probably you have tuned
some kernel parameter way too high.

To fix: at line 243, you need to check if va is NULL; if it is,
you need to check for M_WAITOK, and if set, restart the allocation.
This has to be done before the next line, where va is dereferenced.

Maybe something like:

Change:

	va = kbp->kb_next;
	kbp->kb_next = ((struct freelist *)va)->next;

To:

	va = kbp->kb_next;
	if (va == NULL) {
		if (flags & M_NOWAIT) {
			splx(s);
			return ((void *) NULL);
		}
		goto restart;   /* put this label above the while */
	}
	kbp->kb_next = ((struct freelist *)va)->next;
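
The control flow of that suggestion can be tried outside the kernel. The following is only a user-space sketch, not the kern_malloc.c code: struct bucket, SK_NOWAIT, enum pop_status and freelist_pop are hypothetical names, and the "restart" case is reported to the caller instead of being a goto. It shows the three outcomes: a block is popped, the list is empty and the caller asked not to wait, or the list is empty and the allocation should be retried.

```c
#include <stddef.h>

/* A free block stores the pointer to the next free block in itself. */
struct freelist {
	struct freelist *next;
};

/* Hypothetical stand-in for the per-size bucket (kbp in the thread). */
struct bucket {
	struct freelist *kb_next;	/* head of the free list, NULL if empty */
};

#define SK_NOWAIT 0x1			/* hypothetical flag, plays the role of M_NOWAIT */

enum pop_status { POP_OK, POP_FAIL, POP_RESTART };

/*
 * Take the first block off the free list, checking for an empty list
 * before the head pointer is dereferenced.
 */
static enum pop_status
freelist_pop(struct bucket *kbp, int flags, void **out)
{
	struct freelist *va = kbp->kb_next;

	if (va == NULL) {
		*out = NULL;
		/* Non-blocking callers fail; others should refill and retry. */
		return (flags & SK_NOWAIT) ? POP_FAIL : POP_RESTART;
	}
	kbp->kb_next = va->next;	/* safe: va is known to be non-NULL */
	*out = va;
	return POP_OK;
}
```

A caller that gets POP_RESTART would refill the bucket and call freelist_pop again, which corresponds to the "goto restart" in the patch above.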

Working around the problem is easier (IMO): just change your tuning
parameters to avoid running out of KVA.  Probably your mbufs or
mbuf clusters are way too large for your amount of physical RAM;
remember that, except in very special circumstances, kernel memory
is non-pageable.


> #23 0xc015a3fe in exit1 (p=0xea726820, rv=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_exit.c:166

It was trying to allocate a zombie structure.


> #24 0xc0164011 in sigexit (p=0xea726820, sig=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_sig.c:1503

For a process someone sent a SIGTERM to, to kill it.


> #25 0xc0163d9c in postsig (sig=15)
>     at /mnt/se3/releng_4/src/sys/kern/kern_sig.c:1406
> #26 0xc0251fc5 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
>       tf_edi = 174, tf_esi = 1049187701, tf_ebp = -1077936960,
>       tf_isp = -349728812, tf_ebx = 1, tf_edx = 3, tf_ecx = -1078002496,
>       tf_eax = 3, tf_trapno = 7, tf_err = 2, tf_eip = 672039098, tf_cs = 31,
>       tf_eflags = 659, tf_esp = -1078069180, tf_ss = 47})
>     at /mnt/se3/releng_4/src/sys/i386/i386/trap.c:174

Looks like you caused a floating point exception, and died when
the exit1 failed to create a zombie structure for the process.

-- Terry
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: Repeated similar panics on -STABLE

2003-04-02 Thread Dmitry Sivachenko

On Wed, Apr 02, 2003 at 05:44:28PM +0400, Dmitry Sivachenko wrote:
> Hello!
> 
snip

> Fatal trap 12: page fault while in kernel mode
> mp_lock = 0102; cpuid = 1; lapic.id = 
> fault virtual address   = 0x5cdd8000
> fault code  = supervisor read, page not present
> instruction pointer = 0x8:0xc015daff


BTW,

(kgdb) list *0xc015daff
0xc015daff is in malloc (/mnt/se3/releng_4/src/sys/kern/kern_malloc.c:244).
239			freep->next = savedlist;
240			if (kbp->kb_last == NULL)
241				kbp->kb_last = (caddr_t)freep;
242		}
243		va = kbp->kb_next;
244		kbp->kb_next = ((struct freelist *)va)->next;
245	#ifdef INVARIANTS
246		freep = (struct freelist *)va;
247		savedtype = (const char *) freep->type->ks_shortdesc;
248	#if BYTE_ORDER == BIG_ENDIAN

ports and /var/db/pkg

2003-04-02 Thread Danny Braniss

hi all,
is there some 'easy' way to resync /var/db/pkg from /usr/local
(after some rm's on it?), i guess i could write a script that would try and
match the info in /var/db/pkg, and if it's not where it's supposed to be
would remove the info, but if there is a command ...

thanks,
danny
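
The kind of check described above could look roughly like the sketch below. It rests on stated assumptions: each package directory under /var/db/pkg holds a +CONTENTS packing list in which "@cwd DIR" lines set the base directory, other "@" lines are directives, and every remaining line is a file name relative to the current base. The default base of /usr/local and the function names are illustrative, and the sketch only counts missing files; it does not remove anything.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Handle one line of a packing list.  "@cwd DIR" updates cwd, other "@"
 * lines are skipped, and anything else is treated as a file under cwd.
 * Returns 1 and fills path when the line names a file, 0 otherwise.
 */
static int
contents_line_to_path(const char *line, char *cwd, size_t cwdlen,
    char *path, size_t pathlen)
{
	size_t n = strcspn(line, "\r\n");	/* length without the newline */

	if (n == 0)
		return 0;
	if (line[0] == '@') {
		if (strncmp(line, "@cwd ", 5) == 0 && n > 5)
			snprintf(cwd, cwdlen, "%.*s", (int)(n - 5), line + 5);
		return 0;
	}
	snprintf(path, pathlen, "%s/%.*s", cwd, (int)n, line);
	return 1;
}

/* Count the files listed in an open packing list that are gone from disk. */
static int
count_missing(FILE *contents)
{
	char line[1024], path[2048];
	char cwd[1024] = "/usr/local";	/* assumed default prefix */
	int missing = 0;

	while (fgets(line, sizeof(line), contents) != NULL) {
		if (contents_line_to_path(line, cwd, sizeof(cwd),
		    path, sizeof(path)) && access(path, F_OK) != 0)
			missing++;
	}
	return missing;
}
```

A driver would open each /var/db/pkg/*/+CONTENTS file, call count_missing on it, and report the packages with a non-zero count as candidates for reinstalling or for removing from the database.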



Re: Repeated similar panics on -STABLE

2003-04-02 Thread Terry Lambert

Dmitry Sivachenko wrote:
> On Wed, Apr 02, 2003 at 05:44:28PM +0400, Dmitry Sivachenko wrote:
> > Hello!
> 
> snip
> 
> > Fatal trap 12: page fault while in kernel mode
> > mp_lock = 0102; cpuid = 1; lapic.id = 
> > fault virtual address   = 0x5cdd8000
> > fault code  = supervisor read, page not present
> > instruction pointer = 0x8:0xc015daff
> 
> BTW,
> 
> (kgdb) list *0xc015daff
> 0xc015daff is in malloc (/mnt/se3/releng_4/src/sys/kern/kern_malloc.c:244).
> 243 va = kbp->kb_next;
> 244 kbp->kb_next = ((struct freelist *)va)->next;

Yes, I know.  See analysis and patch and workaround.

-- Terry

le0 - DE203 kernel config problem

2003-04-02 Thread Christoph Kukulies

I tried to get a DE203 NIC (ISA) working with 5.0R.

Took the GENERIC config file
and put 

device le  1

options COMPAT_OLDISA 

in it.

Since I forgot how the card was programmed I tried and got it 
probed at io=0x200

so I put the following in /boot/device.hints

hint.le.0.at=isa 
hint.le.0.disabled=0
hint.le.0.port=0x200  
hint.le.0.irq=10
hint.le.0.maddr=0xd

and first got an error during probe, something like

le0: lemac expected IRQ 0x400 found 0x20

Then I changed irq to 5 and got

le0: lemac expected iomem at 0xd found 0x8

So I changed maddr to 0x8.

But then I got a kernel panic.


Any clues how to proceed to get this card working?
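
One reading of the first probe message, offered as an assumption rather than something confirmed from the driver source, is that the two numbers are IRQ bitmasks and not IRQ numbers: 0x400 has only bit 10 set (the irq given in the hints) and 0x20 has only bit 5 set, which fits the message going away once irq was changed to 5. A small sketch of that conversion, with a made-up helper name:

```c
/*
 * Convert a one-bit IRQ mask such as 0x400 to its IRQ number.
 * Returns -1 if the mask does not have exactly one bit set.
 */
static int
irq_from_mask(unsigned mask)
{
	int irq = 0;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;
	while ((mask & 1u) == 0) {
		mask >>= 1;
		irq++;
	}
	return irq;
}
```

Under that assumption, irq_from_mask(0x400) gives the hinted IRQ and irq_from_mask(0x20) gives the IRQ the card reported.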

--
Chris Christoph P. U. Kukulies [EMAIL PROTECTED]

Re: le0 - DE203 kernel config problem

2003-04-02 Thread Matthew N. Dodd

On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> I tried to get a DE203 NIC (ISA) working with 5.0R.
...
> Since I forgot how the card was programmed I tried and got it
> probed at io=0x200
...
> Any clues how to proceed to get this card working?

Find out what the configuration settings on the card are and use them.

Trial and error is likely to produce the results you mentioned.

-- 
| Matthew N. Dodd  | '78 Datsun 280Z | '75 Volvo 164E | FreeBSD/NetBSD  |
| [EMAIL PROTECTED] |   2 x '84 Volvo 245DL| ix86,sparc,pmax |
| http://www.jurai.net/~winter |  For Great Justice!  | ISO8802.5 4ever |

Re: le0 - DE203 kernel config problem

2003-04-02 Thread Christoph P. Kukulies

On Wed, Apr 02, 2003 at 01:27:43PM -0500, Matthew N. Dodd wrote:
> On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> > I tried to get a DE203 NIC (ISA) working with 5.0R.
> ...
> > Since I forgot how the card was programmed I tried and got it
> > probed at io=0x200
> ...
> > Any clues how to proceed to get this card working?
> 
> Find out what the configuration settings on the card are and use them.
> 
> Trial and error is likely to produce the results you mentioned.

Problem is that I probably don't have a configuration disk anymore.

--
Chris Christoph P. U. Kukulies [EMAIL PROTECTED]

Re: le0 - DE203 kernel config problem

2003-04-02 Thread Robert Swindells


Christoph Kukulies wrote:
> On Wed, Apr 02, 2003 at 01:27:43PM -0500, Matthew N. Dodd wrote:
> > On Wed, 2 Apr 2003, Christoph Kukulies wrote:
> > > I tried to get a DE203 NIC (ISA) working with 5.0R.
> > ...
> > > Since I forgot how the card was programmed I tried and got it
> > > probed at io=0x200
> > ...
> > > Any clues how to proceed to get this card working?
> > 
> > Find out what the configuration settings on the card are and use them.
> > 
> > Trial and error is likely to produce the results you mentioned.
>
> Problem is that I probably don't have a configuration disk anymore.

What would you give for one ? :-)

Robert Swindells

Re: ports and /var/db/pkg

2003-04-02 Thread Kris Kennaway

On Wed, Apr 02, 2003 at 06:18:24PM +0300, Danny Braniss wrote:
> hi all,
>   is there some 'easy' way to resync /var/db/pkg from /usr/local
> (after some rm's on it?), i guess i could write a script that would try and
> match the info in /var/db/pkg, and if it's not where it's supposed to be
> would remove the info, but if there is a command ...

Not really..you can go the other way though; see the example in
pkg_which(1).

Kris



Re: ports and /var/db/pkg

2003-04-02 Thread Brian O'Shea

Danny,

If you built your packages from ports, you could always reinstall them.
You just have to check for /usr/ports/group/port/work/.install_done.*

It's not perfect, but you could use it to generate a quick list of ports
to selectively re-install.

Good luck,
-brian

--- Kris Kennaway [EMAIL PROTECTED] wrote:
> On Wed, Apr 02, 2003 at 06:18:24PM +0300, Danny Braniss wrote:
> > hi all,
> > is there some 'easy' way to resync /var/db/pkg from /usr/local
> > (after some rm's on it?), i guess i could write a script that would try and
> > match the info in /var/db/pkg, and if it's not where it's supposed to be
> > would remove the info, but if there is a command ...
> 
> Not really..you can go the other way though; see the example in
> pkg_which(1).
> 
> Kris
 


Re: ports and /var/db/pkg

2003-04-02 Thread Kris Kennaway

On Wed, Apr 02, 2003 at 02:58:43PM -0800, Brian O'Shea wrote:
> Danny,
> 
> If you built your packages from ports, you could always reinstall them.
> You just have to check for /usr/ports/group/port/work/.install_done.*

Only if you've never run 'make clean' (unlikely, if he's following
directions).

Kris


