Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems

2012-02-23 Thread YongHyeon PYUN
On Thu, Feb 23, 2012 at 07:41:25AM +0100, Attila Nagy wrote:
> On 02/23/12 21:44, YongHyeon PYUN wrote:
> >I have to ask more information for the controller to Broadcom.
> >Not sure whether I can get some hint at this moment though. :-(
> Is there anything I can do? I ask this because I have to give back this 
> server very soon.
> >
> >Given that you also have USB related errors, could you completely
> >remove bge(4) in your kernel and see whether it can successfully
> >boot up?
> >I think you can add the following entries to /boot/device.hints
> >without rebuilding kernel.
> >
> >hint.bge.0.disabled="1"
> >hint.bge.1.disabled="1"
> >hint.bge.2.disabled="1"
> >hint.bge.3.disabled="1"
> This does not help.
> Removing bge makes it stop here:
> da0 at ciss0 bus 0 scbus0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-5 device
> da0: 135.168MB/s transfers
> da0: Command Queueing enabled
> da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
> panic: bootpc_init: no eligible interfaces
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x187
> bootpc_init() at bootpc_init+0x1205
> mi_startup() at mi_startup+0x77
> btext() at btext+0x2c
> KDB: enter: panic
> [ thread pid 0 tid 10 ]
> Stopped at  kdb_enter+0x3b: movq    $0,0x976972(%rip)
> db>
> 
> Which is completely OK, because there are really no interfaces to boot 
> from. Note that there is no NMI either (maybe because it would happen 
> later in the initialization process).
> Sadly, I can't boot from disk, but I assume it would work.

OK, I guess you're seeing an issue similar to the one Sean reported.
I'll let you know when I have an experimental patch.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems

2012-02-22 Thread Attila Nagy

On 02/23/12 21:44, YongHyeon PYUN wrote:
> I have to ask more information for the controller to Broadcom.
> Not sure whether I can get some hint at this moment though. :-(

Is there anything I can do? I ask this because I have to give back this
server very soon.

> Given that you also have USB related errors, could you completely
> remove bge(4) in your kernel and see whether it can successfully
> boot up?
> I think you can add the following entries to /boot/device.hints
> without rebuilding kernel.
>
> hint.bge.0.disabled="1"
> hint.bge.1.disabled="1"
> hint.bge.2.disabled="1"
> hint.bge.3.disabled="1"

This does not help.
Removing bge makes it stop here:
da0 at ciss0 bus 0 scbus0 target 0 lun 0
da0:  Fixed Direct Access SCSI-5 device
da0: 135.168MB/s transfers
da0: Command Queueing enabled
da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
panic: bootpc_init: no eligible interfaces
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
bootpc_init() at bootpc_init+0x1205
mi_startup() at mi_startup+0x77
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 10 ]
Stopped at  kdb_enter+0x3b: movq    $0,0x976972(%rip)
db>

Which is completely OK, because there are really no interfaces to boot
from. Note that there is no NMI either (maybe because it would happen
later in the initialization process).

Sadly, I can't boot from disk, but I assume it would work.


Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems

2012-02-22 Thread YongHyeon PYUN
On Wed, Feb 22, 2012 at 03:43:54PM +0100, Attila Nagy wrote:
> On 02/23/12 05:15, YongHyeon PYUN wrote:
> >bge0:  mem
> >0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32
> >at device 0.0 on pci3
> >bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> > ^^
> >
> >This controller is new one. Probably BCM5719 A1 but not sure.
> Yes, it's in a new machine.
> 
> >
> >>bge0: Try again
> >This message indicates your controller has ASF/IPMI firmware.
> >Try disabling ASF and see whether it makes any difference.
> >(Change hw.bge.allow_asf tunable to 0).
> Oh, I always forget that (on the other machines this is set).
> This is what I get with
> machdep.panic_on_nmi: 0
> machdep.kdb_on_nmi: 0
> hw.bge.allow_asf: 0
> 

I have to ask Broadcom for more information about the controller.
I'm not sure whether I can get any hints at this point, though. :-(
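
For reference, the three numbers in the quoted attach line appear to be related by simple shifts, which is how a revision can be read off the CHIP ID. A minimal sketch, assuming the ASIC revision is the chip ID shifted right by 12 bits and the chip revision is the chip ID shifted right by 8 bits (this matches the values printed in the log; the helper names are made up for illustration and are not driver functions):

```python
def asic_rev(chip_id: int) -> int:
    """ASIC revision: chip ID with the low 12 bits dropped (assumed)."""
    return chip_id >> 12


def chip_rev(chip_id: int) -> int:
    """Chip revision: chip ID with the low 8 bits dropped (assumed)."""
    return chip_id >> 8


chip_id = 0x05719001  # value from the bge0 attach line
print(f"ASIC REV {asic_rev(chip_id):#x}; CHIP REV {chip_rev(chip_id):#x}")
```

The low bits that are shifted away presumably encode the stepping, which would be consistent with the A1 guess, though that is unconfirmed.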

Given that you also have USB-related errors, could you completely
remove bge(4) from your kernel and see whether it boots up
successfully?
I think you can add the following entries to /boot/device.hints
without rebuilding the kernel.

hint.bge.0.disabled="1"
hint.bge.1.disabled="1"
hint.bge.2.disabled="1"
hint.bge.3.disabled="1"
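
The entries follow one pattern per unit, so they can be generated for a machine with a different number of ports. A minimal sketch (the function name `bge_disable_hints` is hypothetical, and the output is meant to be appended to /boot/device.hints by hand):

```python
def bge_disable_hints(units: int) -> list[str]:
    """Return one 'disabled' hint line per bge unit, numbered from 0."""
    return [f'hint.bge.{n}.disabled="1"' for n in range(units)]


# Four units, as on the quad-port controller in this thread.
for line in bge_disable_hints(4):
    print(line)
```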

> bge0:  mem 
> 0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 
> at device 0.0 on pci3
> bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> bge0: Try again
> miibus0:  on bge0
> ukphy0:  PHY 1 on miibus0
> ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 
> 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, 
> auto-flow
> bge0: Ethernet address: 3c:4a:92:b2:3c:08
> pci0:3:0:1: failed to read VPD data.
> bge1:  mem 
> 0xf6bc-0xf6bc,0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 
> at device 0.1 on pci3
> bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus1:  on bge1
> brgphy0:  PHY 2 on miibus1
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge1: Ethernet address: 3c:4a:92:b2:3c:09
> pci0:3:0:2: failed to read VPD data.
> bge2:  mem 
> 0xf6b9-0xf6b9,0xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 
> at device 0.2 on pci3
> bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus2:  on bge2
> brgphy1:  PHY 3 on miibus2
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge2: Ethernet address: 3c:4a:92:b2:3c:0a
> pci0:3:0:3: failed to read VPD data.
> bge3:  mem 
> 0xf6b6-0xf6b6,0xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 
> at device 0.3 on pci3
> bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus3:  on bge3
> brgphy2:  PHY 4 on miibus3
> brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge3: Ethernet address: 3c:4a:92:b2:3c:0b
> [...]
> da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
> NMI ISA 60, EISA ff
> I/O channel check, likely hardware failure.Sending DHCP Discover packet 
> from interface bge0 (3c:4a:92:b2:3c:08)
> cd0 at ata3 bus 0 scbus3 target 0 lun 0
> cd0:  Removable CD-ROM SCSI-0 device
> cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
> cd0: Attempt to query device size failed: NOT READY, Medium not present 
> - tray closed
> bge0: 11 link states coalesced
> bge0: link state changed to DOWN
> ugen0.2:  at usbus0
> uhub3:  
> on usbus0
> bge1: 5 link states coalesced
> bge1: link state changed to DOWN
> bge2: link state changed to DOWN
> bge3: link state changed to DOWN
> bge0: ugen2.2:  at usbus2
> uhub4:  
> on usbus2
> 2 link states coalesced
> bge0: link state changed to DOWN
> bge1: 4 link states coalesced
> bge1: link state changed to DOWN
> bge0: 4 link states coalesced
> bge0: link state changed to DOWN
> Sending DHCP Discover packet from interface bge1 (3c:4a:92:b2:3c:09)
> uhub3: 6 ports with 6 removable, self powered
> bge0: usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored)
> 6 link states coalesced
> bge0: link state changed to DOWN
> bge1: 2 link states coalesced
> bge1: link state changed to DOWN
> Sending DHCP Discover packet from interface bge2 (3c:4a:92:b2:3c:0a)
> bge0: 2 link states coalesced
> bge0: link state changed to DOWN
> bge1: usbd_setup_device_desc: getting device descriptor at addr 2 
> failed, USB_ERR_TIMEOUT
> 10 link states coalesced
> bge1: link state changed to DOWN
> uhub4: 8 ports with 8 removable, self powered
> bge0: 4 link states coalesced
> bge0: link state changed to DOWN
> bge1: 2 link states coalesced
> bge1: link state changed to DOWN
> Sending DHCP Discover packet from interface bge3 (3c:4a:92:b2:3c:0b)
> bge0: 2 link states coalesced
> bge0: link state changed to DOWN
> bge1: 2 link states coalesced
> bge1: li
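
The quoted console output above repeats the same two bge messages many times, so a tally per interface can make the flapping easier to see. A minimal sketch, assuming the log is available as a list of text lines (the function name `tally_link_events` is made up for illustration; only the two message formats visible in the log are matched):

```python
import re
from collections import Counter

# Matches lines such as "bge0: link state changed to DOWN", with or
# without a leading "> " quote prefix.
LINK_RE = re.compile(r"(bge\d+): link state changed to (UP|DOWN)")
# Matches lines such as "bge0: 11 link states coalesced".
COALESCED_RE = re.compile(r"(bge\d+): (\d+) link states coalesced")


def tally_link_events(lines):
    """Count reported and coalesced link-state events per interface."""
    changes = Counter()
    coalesced = Counter()
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            changes[(m.group(1), m.group(2))] += 1
            continue
        m = COALESCED_RE.search(line)
        if m:
            coalesced[m.group(1)] += int(m.group(2))
    return changes, coalesced


sample = [
    "> bge0: 11 link states coalesced",
    "> bge0: link state changed to DOWN",
    "> bge1: 5 link states coalesced",
    "> bge1: link state changed to DOWN",
    "> bge2: link state changed to DOWN",
]
changes, coalesced = tally_link_events(sample)
print(changes[("bge0", "DOWN")], coalesced["bge0"])
```

Messages that the console interleaved with USB output (for example a bare "2 link states coalesced" line with no interface prefix) are not attributed to any interface by this sketch.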

Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems

2012-02-22 Thread Attila Nagy

On 02/23/12 05:15, YongHyeon PYUN wrote:
> bge0:  mem
> 0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32
> at device 0.0 on pci3
> bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
>  ^^
>
> This controller is new one. Probably BCM5719 A1 but not sure.

Yes, it's in a new machine.

> > bge0: Try again
> This message indicates your controller has ASF/IPMI firmware.
> Try disabling ASF and see whether it makes any difference.
> (Change hw.bge.allow_asf tunable to 0).

Oh, I always forget that (on the other machines this is set).
This is what I get with
machdep.panic_on_nmi: 0
machdep.kdb_on_nmi: 0
hw.bge.allow_asf: 0

bge0:  mem 
0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 
at device 0.0 on pci3

bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
bge0: Try again
miibus0:  on bge0
ukphy0:  PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 
1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, 
auto-flow

bge0: Ethernet address: 3c:4a:92:b2:3c:08
pci0:3:0:1: failed to read VPD data.
bge1:  mem 
0xf6bc-0xf6bc,0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 
at device 0.1 on pci3

bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1:  on bge1
brgphy0:  PHY 2 on miibus1
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow

bge1: Ethernet address: 3c:4a:92:b2:3c:09
pci0:3:0:2: failed to read VPD data.
bge2:  mem 
0xf6b9-0xf6b9,0xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 
at device 0.2 on pci3

bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2:  on bge2
brgphy1:  PHY 3 on miibus2
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow

bge2: Ethernet address: 3c:4a:92:b2:3c:0a
pci0:3:0:3: failed to read VPD data.
bge3:  mem 
0xf6b6-0xf6b6,0xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 
at device 0.3 on pci3

bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3:  on bge3
brgphy2:  PHY 4 on miibus3
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow

bge3: Ethernet address: 3c:4a:92:b2:3c:0b
[...]
da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
NMI ISA 60, EISA ff
I/O channel check, likely hardware failure.Sending DHCP Discover packet 
from interface bge0 (3c:4a:92:b2:3c:08)

cd0 at ata3 bus 0 scbus3 target 0 lun 0
cd0:  Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present 
- tray closed

bge0: 11 link states coalesced
bge0: link state changed to DOWN
ugen0.2:  at usbus0
uhub3:  
on usbus0

bge1: 5 link states coalesced
bge1: link state changed to DOWN
bge2: link state changed to DOWN
bge3: link state changed to DOWN
bge0: ugen2.2:  at usbus2
uhub4:  
on usbus2

2 link states coalesced
bge0: link state changed to DOWN
bge1: 4 link states coalesced
bge1: link state changed to DOWN
bge0: 4 link states coalesced
bge0: link state changed to DOWN
Sending DHCP Discover packet from interface bge1 (3c:4a:92:b2:3c:09)
uhub3: 6 ports with 6 removable, self powered
bge0: usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored)
6 link states coalesced
bge0: link state changed to DOWN
bge1: 2 link states coalesced
bge1: link state changed to DOWN
Sending DHCP Discover packet from interface bge2 (3c:4a:92:b2:3c:0a)
bge0: 2 link states coalesced
bge0: link state changed to DOWN
bge1: usbd_setup_device_desc: getting device descriptor at addr 2 
failed, USB_ERR_TIMEOUT

10 link states coalesced
bge1: link state changed to DOWN
uhub4: 8 ports with 8 removable, self powered
bge0: 4 link states coalesced
bge0: link state changed to DOWN
bge1: 2 link states coalesced
bge1: link state changed to DOWN
Sending DHCP Discover packet from interface bge3 (3c:4a:92:b2:3c:0b)
bge0: 2 link states coalesced
bge0: link state changed to DOWN
bge1: 2 link states coalesced
bge1: link state changed to DOWN
ugen2.3:  at usbus2
uhub5:  
on usbus2

bge0: watchdog timeout -- resetting
bge0: link state changed to UP
usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, 
ignored)

bge1: 2 link states coalesced
bge1: link state changed to DOWN
uhub5: 2 ports with 1 removable, self powered
bge0: link state changed to DOWN
bge1: watchdog timeout -- resetting
bge1: usbd_setup_device_desc: getting device descriptor at addr 2 
failed, USB_ERR_TIMEOUT

2 link states coalesced
bge1: link state changed to DOWN
bge0: 2 link states coalesced
bge0: link state changed to DOWN
ugen1.2:  at usbus1
ukbd0:  on usbus1
kbd2 at ukbd0
ums0:  on usbus1
bge1: 4 link states coalesced
bge1: link state changed to DOWN
bge0: 
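
The "NMI ISA 60, EISA ff" line earlier in this log can be read as a bit field. A minimal sketch, assuming the conventional PC layout of the NMI status byte in which bit 7 flags a RAM parity error and bit 6 flags an I/O channel check (the constant and function names are illustrative, and the bit positions are an assumption rather than something stated in this thread):

```python
NMI_PARITY = 1 << 7  # RAM parity error (assumed bit position)
NMI_IOCHAN = 1 << 6  # I/O channel check (assumed bit position)


def decode_isa_nmi(status: int) -> list[str]:
    """Return the failure causes flagged in the ISA NMI status byte."""
    causes = []
    if status & NMI_PARITY:
        causes.append("RAM parity error")
    if status & NMI_IOCHAN:
        causes.append("I/O channel check")
    return causes


# 0x60 is the value from this log; 0x70 appears in the first report.
print(decode_isa_nmi(0x60))
```

Under that assumption both 0x60 and 0x70 decode to an I/O channel check only, which agrees with the message the kernel printed next to them.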

Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems

2012-02-22 Thread YongHyeon PYUN
On Wed, Feb 22, 2012 at 08:49:31AM +0100, Attila Nagy wrote:
> Hi,
> 
> I get this on a recent stable/9 system with uhci support removed from 
> the kernel config:
> da0 at ciss0 bus 0 scbus0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-5 device
> da0: 135.168MB/s transfers
> da0: Command Queueing enabled
> da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
> cd0 at ata3 bus 0 scbus3 target 0 lun 0
> cd0:  Removable CD-ROM SCSI-0 device
> cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
> cd0: Attempt to query device size failed: NOT READY, Medium not present 
> - tray closed
> NMI ISA 70, EISA ff
> I/O channel check, likely hardware failure.
> 
> Fatal trap 19: non-maskable interrupt trap while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x20:0x804543fb
> stack pointer       = 0x28:0x81251e40
> frame pointer       = 0x28:0x814cf660
> code segment        = base 0x0, limit 0xf, type 0x1b
>                     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, IOPL = 0
> current process     = 0 (swapper)
> [ thread pid 0 tid 10 ]
> Stopped at  bge_init_locked+0x233b: movl    0x81c(%rsi),%eax
> db>
> 
> and this with a plain GENERIC kernel:
> da0 at ciss0 bus 0 scbus0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-5 device
> da0: 135.168MB/s transfers
> da0: Command Queueing enabled
> da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
> cd0 at ata3 bus 0 scbus3 target 0 lun 0
> cd0:  Removable CD-ROM SCSI-0 device
> cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
> cd0: Attempt to query device size failed: NOT READY, Medium not present 
> - tray closed
> NMI ISA 70, EISA ff
> I/O channel check, likely hardware failure.
> 
> Fatal trap 19: non-maskable interrupt trap while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x20:0x80711dc5
> stack pointer       = 0x28:0x81272040
> frame pointer       = 0x28:0xff907cf44b40
> code segment        = base 0x0, limit 0xf, type 0x1b
>                     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, IOPL = 0
> current process     = 12 (irq16: uhci0)
> [ thread pid 12 tid 100098 ]
> Stopped at  uhci_interrupt+0x65: movzwl  %ax,%eax
> db> KDB: stack backtrace:
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> mi_switch() at mi_switch+0x27a
> turnstile_wait() at turnstile_wait+0x1cb
> _mtx_lock_sleep() at _mtx_lock_sleep+0xb0
> ukbd_poll() at ukbd_poll+0xbe
> kbdmux_poll() at kbdmux_poll+0x3f
> sc_cngetc() at sc_cngetc+0xec
> cncheckc() at cncheckc+0x4a
> cngetc() at cngetc+0x1c
> db_readline() at db_readline+0x77
> db_read_line() at db_read_line+0x15
> db_command_loop() at db_command_loop+0x38
> db_trap() at db_trap+0x89
> kdb_trap() at kdb_trap+0x101
> trap_fatal() at trap_fatal+0x29d
> trap() at trap+0x10a
> nmi_calltrap() at nmi_calltrap+0x8
> --- trap 0x13, rip = 0x80711dc5, rsp = 0x81272040, rbp = 
> 0xff907cf44b40 ---
> uhci_interrupt() at uhci_interrupt+0x65
> intr_event_execute_handlers() at intr_event_execute_handlers+0x104
> ithread_loop() at ithread_loop+0xa4
> fork_exit() at fork_exit+0x11f
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xff907cf44d00, rbp = 0 ---
> db>
> 
> After disabling stopping on NMI (kdb_on_nmi), I still can't boot from 
> bge (this is a PXE booted machine), I get this in an infinite loop:
> bge1: link state changed to DOWN
> DHCP/BOOTP timeout for server 255.255.255.255
> bge1: 3 link states coalesced
> bge1: link state changed to UP
> bge0: 2 link states coalesced
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge1: link state changed to DOWN
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge0: link state changed to DOWN
> bge1: 2 link states coalesced
> bge1: link state changed to DOWN
> bge0: link state changed to UP
> bge0: link state changed to DOWN
> bge0: 2 link states coalesced
> bge0: link state changed to DOWN
> bge1: 2 link states coalesced
> bge1: link state changed to DOWN
> bge0: link state changed to UP
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge0: link state changed to DOWN
> 
> Linux and Windows boot fine on the machine.
> 
> dmesg up to the point where it crashes:
> Copyright (c) 1992-2012 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 9.0-STABLE #3: Tue Feb 21 11:57:33 CET 2012
> r...@boot.lab:/usr/obj/usr/src/sys/BOOTCLNT amd64
> CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (2693.57-MHz K8-class CPU)
>   Origin = "Genui