Re: 6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-20 Thread M. Warner Losh
In message: <[EMAIL PROTECTED]>
Kris Kennaway <[EMAIL PROTECTED]> writes:
: On Sun, Mar 19, 2006 at 04:39:19PM -0500, Matt Emmerton wrote:
: > On Sun, Mar 19, 2006 at 11:28:45AM -0500, Matt Emmerton wrote:
: OK, now you can post about your other panic :-)

Yes.  Please.  I'm interested in the ed0 panic, since this is the
first report I've had of problems with ed in a long time.

Warner
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: 6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-19 Thread Kris Kennaway
On Sun, Mar 19, 2006 at 04:39:19PM -0500, Matt Emmerton wrote:
> On Sun, Mar 19, 2006 at 11:28:45AM -0500, Matt Emmerton wrote:
> > [ Asked on -questions on Friday; re-asking now on -stable without
> > cross-post]
> >
> > I recently upgraded a 4.11-REL machine to 6.0-REL and have run into some
> > snags.  While the installation from CD went fine, after configuring and
> > enabling my ed0 NIC, bad things start to happen.
> >
> > FWIW, this machine is an ancient (hardware circa 1991, BIOS circa 1994)
> > dual-Pentium 133 MHz machine, with EISA/PCI and onboard SCSI.
> >
> > So far I can reliably reproduce two panics: one appears to be an ed driver
> > bug (based on reports of similar panics with different NICs, notably nge)
> > and the other is a filesystem corruption problem.
> >
> > Here's the process that I go through to reliably reproduce both problems.
> > 1) Boot machine in multi-user mode
> > 2) After ifconfig ed0, machine panics with a trap 12 in ithread_loop.
> > 3) In debugger, reset (or panic to get vmcore)
> > 4) Reboot in multi-user mode, but set "hint.ed.0.disabled=1" in the boot
> > loader (to avoid ifconfig panic)
> > 5) Root filesystem is fsckd; all other filesystems are scheduled for
> > background fsck
> > 6) Encounter panic "ffs_valloc: dup alloc"
> 
> I think this part is because you have filesystem corruption from your
> previous panic.  Force a fsck in foreground mode and it should clear
> it up.
> 
> --
> 
> That prevents the FFS panic from occurring.  I had forgotten that fsck in
> multi-user mode runs as "fsck -p", which only catches a limited subset of
> filesystem errors.

OK, now you can post about your other panic :-)

Kris


Re: 6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-19 Thread Matt Emmerton
On Sun, Mar 19, 2006 at 11:28:45AM -0500, Matt Emmerton wrote:
> [ Asked on -questions on Friday; re-asking now on -stable without
> cross-post]
>
> I recently upgraded a 4.11-REL machine to 6.0-REL and have run into some
> snags.  While the installation from CD went fine, after configuring and
> enabling my ed0 NIC, bad things start to happen.
>
> FWIW, this machine is an ancient (hardware circa 1991, BIOS circa 1994)
> dual-Pentium 133 MHz machine, with EISA/PCI and onboard SCSI.
>
> So far I can reliably reproduce two panics: one appears to be an ed driver
> bug (based on reports of similar panics with different NICs, notably nge)
> and the other is a filesystem corruption problem.
>
> Here's the process that I go through to reliably reproduce both problems.
> 1) Boot machine in multi-user mode
> 2) After ifconfig ed0, machine panics with a trap 12 in ithread_loop.
> 3) In debugger, reset (or panic to get vmcore)
> 4) Reboot in multi-user mode, but set "hint.ed.0.disabled=1" in the boot
> loader (to avoid ifconfig panic)
> 5) Root filesystem is fsckd; all other filesystems are scheduled for
> background fsck
> 6) Encounter panic "ffs_valloc: dup alloc"

I think this part is because you have filesystem corruption from your
previous panic.  Force a fsck in foreground mode and it should clear
it up.

--

That prevents the FFS panic from occurring.  I had forgotten that fsck in
multi-user mode runs as "fsck -p", which only catches a limited subset of
filesystem errors.

--
Matt Emmerton


Re: 6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-19 Thread Kris Kennaway
On Sun, Mar 19, 2006 at 11:28:45AM -0500, Matt Emmerton wrote:
> [ Asked on -questions on Friday; re-asking now on -stable without
> cross-post]
> 
> I recently upgraded a 4.11-REL machine to 6.0-REL and have run into some
> snags.  While the installation from CD went fine, after configuring and
> enabling my ed0 NIC, bad things start to happen.
> 
> FWIW, this machine is an ancient (hardware circa 1991, BIOS circa 1994)
> dual-Pentium 133 MHz machine, with EISA/PCI and onboard SCSI.
> 
> So far I can reliably reproduce two panics: one appears to be an ed driver
> bug (based on reports of similar panics with different NICs, notably nge)
> and the other is a filesystem corruption problem.
> 
> Here's the process that I go through to reliably reproduce both problems.
> 1) Boot machine in multi-user mode
> 2) After ifconfig ed0, machine panics with a trap 12 in ithread_loop.
> 3) In debugger, reset (or panic to get vmcore)
> 4) Reboot in multi-user mode, but set "hint.ed.0.disabled=1" in the boot
> loader (to avoid ifconfig panic)
> 5) Root filesystem is fsckd; all other filesystems are scheduled for
> background fsck
> 6) Encounter panic "ffs_valloc: dup alloc"

I think this part is because you have filesystem corruption from your
previous panic.  Force a fsck in foreground mode and it should clear
it up.

Kris
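
For readers following the advice above, here is a minimal sketch of /etc/rc.conf
settings that make the boot-time checks run in the foreground. Both variables
are standard rc.conf(5) knobs; whether they are appropriate for this particular
machine is an assumption, not something stated in the thread.

```sh
# /etc/rc.conf fragment (sketch; not taken from the poster's machine)

# Run boot-time file system checks in the foreground instead of
# deferring them to a background fsck once the system is multi-user.
background_fsck="NO"

# If the preen pass ("fsck -p") gives up, let rc retry with "fsck -y"
# rather than stopping the boot and dropping to a single-user shell.
fsck_y_enable="YES"
```

The manual equivalent would be to boot to single-user mode (for example with
"boot -s" at the loader prompt) and run "fsck -y" by hand before continuing to
multi-user.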


Re: 6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-19 Thread Fabian Keil
"Matt Emmerton" <[EMAIL PROTECTED]> wrote:

> I recently upgraded a 4.11-REL machine to 6.0-REL and have run into
> some snags.  While the installation from CD went fine, after
> configuring and enabling my ed0 NIC, bad things start to happen.
> 
> FWIW, this machine is an ancient (hardware circa 1991, BIOS circa
> 1994) dual-Pentium 133 MHz machine, with EISA/PCI and onboard SCSI.

At least it got "lots" of memory, last week I installed FreeBSD
6.1-PRERELEASE on a P90 with 16MB RAM.

> So far I can reliably reproduce two panics: one appears to be an ed
> driver bug (based on reports of similar panics with different NICs,
> notably nge) and the other is a filesystem corruption problem.
> 
> Here's the process that I go through to reliably reproduce both
> problems.
> 1) Boot machine in multi-user mode
> 2) After ifconfig ed0, machine panics with a trap 12 in ithread_loop.
> 3) In debugger, reset (or panic to get vmcore)
> 4) Reboot in multi-user mode, but set "hint.ed.0.disabled=1" in the
> boot loader (to avoid ifconfig panic)
> 5) Root filesystem is fsckd; all other filesystems are scheduled for
> background fsck
> 6) Encounter panic "ffs_valloc: dup alloc"
> 7) In debugger, reset (or panic to get vmcore)

Did you try to do a foreground fsck in single user mode?

Fabian
-- 
http://www.fabiankeil.de/


6.0-REL problems with ISA ed0, FFS corruption and ancient hardware

2006-03-19 Thread Matt Emmerton
[ Asked on -questions on Friday; re-asking now on -stable without
cross-post]

I recently upgraded a 4.11-REL machine to 6.0-REL and have run into some
snags.  While the installation from CD went fine, after configuring and
enabling my ed0 NIC, bad things start to happen.

FWIW, this machine is an ancient (hardware circa 1991, BIOS circa 1994)
dual-Pentium 133 MHz machine, with EISA/PCI and onboard SCSI.

So far I can reliably reproduce two panics: one appears to be an ed driver
bug (based on reports of similar panics with different NICs, notably nge)
and the other is a filesystem corruption problem.

Here's the process that I go through to reliably reproduce both problems.
1) Boot machine in multi-user mode
2) After ifconfig ed0, machine panics with a trap 12 in ithread_loop.
3) In debugger, reset (or panic to get vmcore)
4) Reboot in multi-user mode, but set "hint.ed.0.disabled=1" in the boot
loader (to avoid ifconfig panic)
5) Root filesystem is fsckd; all other filesystems are scheduled for
background fsck
6) Encounter panic "ffs_valloc: dup alloc"
7) In debugger, reset (or panic to get vmcore)

Attached is the full dmesg and stacktrace output from kgdb for the *second*
panic, since I figure this is the more critical issue.

--
Matt Emmerton
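
Steps 3 and 7 above depend on a crash dump being written at panic time and
saved at the next boot. A minimal sketch of the /etc/rc.conf settings that
would enable this, assuming the swap partition reported in the dmesg output
(/dev/da0s1b) is the dump device and that dumps are saved to /var/crash (the
conventional location, an assumption here):

```sh
# /etc/rc.conf fragment (sketch; not taken from the poster's machine)

# Write kernel crash dumps to the swap partition named in the dmesg.
dumpdev="/dev/da0s1b"

# Directory where savecore(8) stores vmcore.N at the next boot
# (assumed; /var/crash is the conventional location).
dumpdir="/var/crash"
```

The saved vmcore.0 can then be opened with
"kgdb /boot/kernel/kernel.debug vmcore.0", as in the session that follows.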
Script started on Sat Mar 18 12:58:13 2006
[EMAIL PROTECTED] kgdb /boot/kernel/kernel.debug vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.0-RELEASE #0: Sat Mar 18 12:00:50 EST 2006
[EMAIL PROTECTED]:/usr2/obj/usr2/src/sys/GABBY.20060316.01
MPTable: 
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Pentium/P54C (133.16-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x52c  Stepping = 12
  Features=0x3bf
real memory  = 50331648 (48 MB)
avail memory = 43941888 (41 MB)
Intel Pentium detected, installing workaround for F00F bug
ioapic0: Changing APIC ID to 2
ioapic0  irqs 0-15 on motherboard
npx0: [FAST]
npx0:  on motherboard
npx0: INT 16 interface
cpu0 on motherboard
pcib0:  pcibus 0 on motherboard
pci0:  on pcib0
eisab0:  at device 2.0 on pci0
eisa0:  on eisab0
mainboard0:  on eisa0 slot 0
isa0:  on eisab0
ahc0:  port 0xf800-0xf8ff mem 0xffbef000-0xffbe irq 11 at device 11.0 on pci0
ahc0: [GIANT-LOCKED]
aic7870: Wide Channel A, SCSI Id=7, 16/253 SCBs
orm0:  at iomem 0xc-0xc7fff,0xc8000-0xca7ff on isa0
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  irq 1 on atkbdc0
atkbd0: [GIANT-LOCKED]
psm0:  irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0:  at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppbus0:  on ppc0
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown:  can't assign resources (irq)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (irq)
unknown:  can't assign resources (port)
Timecounter "TSC" frequency 133160146 Hz quality 800
Timecounters tick every 1.000 msec
Waiting 10 seconds for SCSI devices to settle
cd0 at ahc0 bus 0 target 4 lun 0
cd0:  Removable CD-ROM SCSI-2 device 
cd0: 10.000MB/s transfers (10.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
da1 at ahc0 bus 0 target 5 lun 0
da1:  Fixed Direct Access SCSI-2 device 
da1: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled
da1: 2049MB (4197405 512 byte sectors: 64H 32S/T 2049C)
da0 at ahc0 bus 0 target 0 lun 0
da0:  Fixed Direct Access SCSI-2 device 
da0: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled
da0: 2049MB (4197405 512 byte sectors: 64H 32S/T 2049C)
Trying to mount root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
<118>Loading configuration files.
<118>kernel dumps on /dev/da0s1b
<118>Entropy harvesting:
<118>.
<118>swapon: adding /dev/da0s1b as swap device
<118>Starting file system checks:
<118>/dev/da0s1a: 1012 files,