On Fri, Jan 7, 2011 at 2:36 PM, Mark Saad <nones...@longcount.org> wrote:
> On Fri, Jan 7, 2011 at 5:27 PM, Garrett Cooper <gcoo...@freebsd.org> wrote:
>> On Fri, Jan 7, 2011 at 2:26 PM, Garrett Cooper <gcoo...@freebsd.org> wrote:
>>> On Fri, Jan 7, 2011 at 2:22 PM, Mark Saad <nones...@longcount.org> wrote:
>>>> On Fri, Jan 7, 2011 at 4:56 PM, Garrett Cooper <gcoo...@freebsd.org> wrote:
>>>>> On Fri, Jan 7, 2011 at 1:20 PM, Mark Saad <nones...@longcount.org> wrote:
>>>>>> Hello hackers@,
>>>>>>  I have a question that I can't find an answer for. I believe I
>>>>>> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>>>>>> kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
>>>>>> fault while in kernel mode". The hardware works fine in 7.2-RELEASE
>>>>>> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.
>>>>>>
>>>>>> In 7.3-RELEASE amd64 I cannot boot from CD or PXE correctly using the
>>>>>> stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
>>>>>> issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
>>>>>> kernel using patched sources and tried to boot, and I got the same
>>>>>> crash.
>>>>>>
>>>>>>  Next I rebuilt the kernel with KDB and DDB to see if I could get a
>>>>>> core dump of the system. I also added these lines to loader.conf:
>>>>>>
>>>>>> kernel="kernel.DEBUG"
>>>>>> kern.dumpdev="/dev/da0s1b"
>>>>>>
>>>>>> Next I PXE-booted the box, and the system did not crash on boot; it
>>>>>> easily loads an NFS root and works fine. So I copied my debug
>>>>>> kernel and loader.conf to the local disk and rebooted, and it boots
>>>>>> fine from the local disk.
>>>>>>
>>>>>> After rebooting the server and running off the local disks with the
>>>>>> debug kernel, I can't find any issues.
>>>>>>
>>>>>> Rebooting the box into a GENERIC 7.3-RELEASE-p4 kernel, however, crashes
>>>>>> with this error:
>>>>>>
>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>> cpuid = 0; apic id = 00
>>>>>> fault virtual address   = 0x0
>>>>>> fault code              = supervisor write data, page not present
>>>>>> instruction pointer     = 0x8:0xffffffff800070fa
>>>>>> stack pointer           = 0x10:0xffffffff8153cbe0
>>>>>> frame pointer           = 0x10:0xffffffff8153cc50
>>>>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags        = interrupt enabled, resume, IOPL = 0
>>>>>> current process         = 0 (swapper)
>>>>>> [thread pid 0 tid 100000 ]
>>>>>> Stopped at      bzero+0xa:     repe stosq       %es:(%rdi)
>>>>>>
>>>>>>
>>>>>> What do I do? Has anyone else seen anything like this?
>>>>>
>>>>>    What are the messages before that on the kernel console, and which
>>>>> drivers are loaded on a stable system?
>>>>> Thanks,
>>>>> -Garrett
>>>>>
>>>> Garrett
>>>>  The last 4 lines of the verbose boot up of the generic kernel are
>>>> all from sio1
>>>
>>>    Is sio1 pointing to a generic UART, or is it something more
>>> special like the HP lights-out SOL interface?
>>>    A simple test might be to disable the sio/uart driver in the kernel
>>> and see if things work.
>>
>> Or easier yet, disable the port in the BIOS and comment out all of the
>> sio/uart references in device.hints.
>>
> Garrett
>  Interesting: commenting out the sio lines in device.hints fixes it.
> Did device.hints or the sio driver change somehow from 7.2 to 7.3?

    Given that the messages changed, it's probably a driver bug (my
guess is either isa or sio) that was introduced by accident: a device
fails to probe and then doesn't properly release its resources and/or
doesn't tell the kernel that the driver couldn't attach.
    I'm re-CCing the list, but you might want to ask imp@ for some input.
I might have time to look at this further tonight, but I would check
where `probe failed test(s):' is being printed and, if possible, analyze
the code for missing return codes, or put breakpoints there in ddb so
you can trace the call stack and gather more info. That will help point
you at the culprit.
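
    For reference, the device.hints workaround discussed above might look
roughly like the fragment below. This is only a sketch: it assumes the
stock GENERIC.hints entries for the second serial port, and the port and
irq values are the usual COM2 defaults, not values taken from the DL360
in question. Either comment the sio1 lines out, or leave them in place
and mark the unit disabled so it is not probed at boot:

```
# /boot/device.hints (sketch; entries assume the stock GENERIC.hints)
hint.sio.1.at="isa"
hint.sio.1.port="0x2F8"
hint.sio.1.irq="3"
# Assumed addition: keep sio1 from being probed while the bug is chased.
hint.sio.1.disabled="1"
```

Disabling the unit rather than deleting the lines makes it easy to flip
back once the probe path has been fixed or ruled out.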
Thanks,
-Garrett
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
