Re: more on supermicro 6010H hang

2001-07-18 Thread Dave Cornejo

Well, damn.

I've run it through a couple of reboots with a fresh current SMP
kernel and it boots like a champ... 

Where were you a couple of weeks ago when I started trying to solve
this problem? ;-)  At least I learned a lot about debugging kernels...

thanks for your guess!

dave c


> This is probably not it, but it's worth a peek.  Check your BIOS
> settings and see if there's one that controls whether the USB
> interrupt is enabled.  Make sure that this interrupt is enabled.  If
> it's not, I know you can get hangs at exactly the point where the
> "Waiting 15 seconds.." message comes out.

-- 
Dave Cornejo @ Dogwood Media, Fremont, California (also [EMAIL PROTECTED])
  "There aren't any monkeys chasing us..." - Xochi

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: more on supermicro 6010H hang

2001-07-18 Thread John Polstra

In article <[EMAIL PROTECTED]>,
Dave Cornejo  <[EMAIL PROTECTED]> wrote:
> I have isolated the point at which current no longer runs as Jan 31 -
> Feb 1 of this year.  Prior versions work fine; in Feb & Mar I get
> either "Kernel trap 9 with interrupts disabled" or I think the same
> thing with trap 26 (really not sure on that one).
> 
> Next I took a brand new current from this evening and tried it - it
> still hangs, but a keypress on the keyboard pretty much always breaks
> it out of the hang and into a normal boot.
> 
> Now, I finally got the equipment and time together to remote gdb the
> bad kernel and here's what I get:
> 
> I set a breakpoint at cam_xpt.c::xpt_config() - this is where the
> "Waiting 15 seconds.." message is from and stepped down through it.  I
> get through the first xpt_for_all_busses (xptconfigbuscountfunc,...)
> and then I hit the second one (~line 6749 of cam_xpt.c).  I pass through
> several things, including the xptconfigfunc() and end up in
> subr_autoconf.c::run_interrupt_driven_config_hooks().  At the bottom
> of this function there is a tsleep that gets called - this is
> apparently where it hangs.  If I hit a key on the keyboard it will
> continue on past this point and all seems to work fine from then on.

This is probably not it, but it's worth a peek.  Check your BIOS
settings and see if there's one that controls whether the USB
interrupt is enabled.  Make sure that this interrupt is enabled.  If
it's not, I know you can get hangs at exactly the point where the
"Waiting 15 seconds.." message comes out.

John
-- 
  John Polstra   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa





more on supermicro 6010H hang

2001-07-16 Thread Dave Cornejo

I have isolated the point at which current no longer runs as Jan 31 -
Feb 1 of this year.  Prior versions work fine; in Feb & Mar I get
either "Kernel trap 9 with interrupts disabled" or I think the same
thing with trap 26 (really not sure on that one).

Next I took a brand new current from this evening and tried it - it
still hangs, but a keypress on the keyboard pretty much always breaks
it out of the hang and into a normal boot.

Now, I finally got the equipment and time together to remote gdb the
bad kernel and here's what I get:

I set a breakpoint at cam_xpt.c::xpt_config() - this is where the
"Waiting 15 seconds.." message is from and stepped down through it.  I
get through the first xpt_for_all_busses (xptconfigbuscountfunc,...)
and then I hit the second one (~line 6749 of cam_xpt.c).  I pass through
several things, including the xptconfigfunc() and end up in
subr_autoconf.c::run_interrupt_driven_config_hooks().  At the bottom
of this function there is a tsleep that gets called - this is
apparently where it hangs.  If I hit a key on the keyboard it will
continue on past this point and all seems to work fine from then on.
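
For anyone following along, here is a rough userspace model of the wait
described above.  It is only a sketch of the general pattern as I
understand it (drivers register a hook, the boot path sleeps until every
hook has been removed, and a hook is normally removed once the device's
interrupt shows up); it is not the code from subr_autoconf.c.  All the
names in it (hook_establish, hook_disestablish, wait_for_hooks, irq_tick)
are made up for illustration, and "ticks" stand in for wakeups from
tsleep.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HOOKS 8

/* Hypothetical stand-in for one driver's interrupt-driven config hook. */
struct hook {
    const char *name;
    bool pending;           /* true until the driver removes its hook */
};

static struct hook hooks[MAX_HOOKS];
static int nhooks;

/* Register a hook; returns its index. */
static int hook_establish(const char *name)
{
    assert(nhooks < MAX_HOOKS);
    hooks[nhooks].name = name;
    hooks[nhooks].pending = true;
    return nhooks++;
}

/* A driver calls this once its probe has completed. */
static void hook_disestablish(int id)
{
    hooks[id].pending = false;
}

/* Number of hooks that have not been removed yet. */
static int hooks_pending(void)
{
    int n = 0;
    for (int i = 0; i < nhooks; i++)
        if (hooks[i].pending)
            n++;
    return n;
}

/* Forget all hooks so the model can be run again. */
static void hooks_reset(void)
{
    nhooks = 0;
}

/*
 * Model of the boot-time wait.  irq_tick[i] is the tick at which the
 * interrupt for hook i arrives; a negative value means it never does.
 * Returns the tick at which the last hook was removed, or -1 if some
 * hook is still pending after max_ticks (the "hang" case).
 */
static int wait_for_hooks(const int irq_tick[], int max_ticks)
{
    for (int tick = 0; tick < max_ticks; tick++) {
        for (int i = 0; i < nhooks; i++)
            if (hooks[i].pending && irq_tick[i] == tick)
                hook_disestablish(i);
        if (hooks_pending() == 0)
            return tick;
        /* the kernel would go back to sleep here until woken again */
    }
    return -1;
}
```

With two hooks whose interrupts arrive at ticks 2 and 5, wait_for_hooks
returns 5.  If the second interrupt never arrives (irq_tick of -1), it
returns -1 with one hook still pending, which is the shape of the hang
above.  The model does not try to explain why a keypress gets things
moving again.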

This is my first time this deep into the kernel - can you suggest a
further plan of attack?

thanks!
dave c


here's the dmesg output for this system if this helps any:


Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Mon Jul 16 22:32:23 PDT 2001
[EMAIL PROTECTED]:/usr/src/sys/i386/compile/SMP
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (999.53-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
  Features=0x383fbff
real memory  = 1073676288 (1048512K bytes)
avail memory = 1040248832 (1015868K bytes)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  4, version: 0x000f0011, at 0xfec0
 io1 (APIC): apic id:  5, version: 0x000f0011, at 0xfec01000
Preloaded elf kernel "kernel" at 0xc0527000.
Pentium Pro MTRR support enabled
WARNING: Driver mistake: destroy_dev on 154/0
Using $PIR table, 7 entries at 0xc00f5370
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  at pcibus 0 on motherboard
IOAPIC #1 intpin 12 -> irq 2
IOAPIC #1 intpin 10 -> irq 5
IOAPIC #1 intpin 11 -> irq 7
IOAPIC #1 intpin 15 -> irq 9
pci0:  on pcib0
pcib1:  at device 0.1 on pci0
IOAPIC #1 intpin 14 -> irq 11
pci1:  on pcib1
pci1:  at 0.0 (no driver attached)
fxp0:  port 0xc800-0xc83f mem 0xfe80-0xfe8f,0xfeafb000-0xfeafbfff irq 2 at device 4.0 on pci0
fxp0: Ethernet address 00:30:48:11:69:84
inphy0:  on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0:  port 0xd000-0xd0ff mem 0xfeafc000-0xfeafcfff irq 5 at device 5.0 on pci0
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1:  port 0xd800-0xd8ff mem 0xfeaff000-0xfeaf irq 7 at device 5.1 on pci0
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/255 SCBs
fxp1:  port 0xd400-0xd43f mem 0xfe90-0xfe9f,0xfeafd000-0xfeafdfff irq 9 at device 6.0 on pci0
fxp1: Ethernet address 00:30:48:11:6e:27
inphy1:  on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0:  port 0x580-0x58f at device 15.0 on pci0
isa0:  on isab0
atapci0:  port 0xffa0-0xffaf at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
ohci0:  mem 0xfeafe000-0xfeafefff irq 10 at device 15.2 on pci0
usb0: OHCI version 1.0, legacy support
usb0:  on ohci0
usb0: USB revision 1.0
uhub0: (unknown) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcib2:  at pcibus 2 on motherboard
pci2:  on pcib2
orm0:  at iomem 0xc-0xc7fff,0xc8000-0xc8fff,0xc9000-0xcefff,0xcf000-0xc on isa0
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: parallel port not found.
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x80 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
acd0: CDROM  at ata1-master PIO4
Waiting 2 seconds for SCSI devices to settle
da0 at ahc0 bus 0 target 0 lun 0
da0:  Fi