Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-10 Thread Jack L.
Here's what happened when I pressed Ctrl+Alt+Del on the console.

Mar 10 03:10:23 jpr1 rc.shutdown: 30 second watchdog timeout expired.
Shutdown terminated.
Mar 10 03:10:23 jpr1 init: /bin/sh on /etc/rc.shutdown terminated
abnormally, going to single user mode
Mar 10 03:10:23 jpr1 syslogd: exiting on signal 15
Mar 10 03:13:26 jpr1 syslogd: kernel boot file is /boot/kernel/kernel
Mar 10 03:13:26 jpr1 kernel: Waiting (max 60 seconds) for system
process `vnlru' to stop...done
Mar 10 03:13:26 jpr1 kernel: Waiting (max 60 seconds) for system
process `bufdaemon' to stop...timed out
Mar 10 03:13:26 jpr1 kernel: Waiting (max 60 seconds) for system
process `syncer' to stop...
Mar 10 03:13:26 jpr1 kernel: Syncing disks, vnodes remaining...0 timed out
Mar 10 03:13:26 jpr1 kernel: All buffers synced.


On Thu, Mar 10, 2011 at 1:50 AM, Jack L. wrote:
> I got the error just now. I'm keeping the machine booted for a
> little longer in case someone wants to ask any questions. If I simply
> reboot the machine, it will just hang forever and I will eventually
> need to power cycle it.

Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-10 Thread Jack L.
I got the error just now. I'm keeping the machine booted for a
little longer in case someone wants to ask any questions. If I simply
reboot the machine, it will just hang forever and I will eventually
need to power cycle it.

Mar  9 04:58:51 jpr1 kernel:
j...@jpr1.prdhost.com:/usr/obj/usr/src/sys/JPR1 amd64
Mar  9 04:58:51 jpr1 kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
Mar  9 04:58:51 jpr1 kernel: CPU: AMD Athlon(tm) 64 X2 Dual Core
Processor 6400+ (3214.84-MHz K8-class CPU)
Mar  9 04:58:51 jpr1 kernel: Origin = "AuthenticAMD"  Id = 0x40f33
Family = f  Model = 43  Stepping = 3
Mar  9 04:58:51 jpr1 kernel:
Features=0x178bfbff
Mar  9 04:58:51 jpr1 kernel: Features2=0x2001
Mar  9 04:58:51 jpr1 kernel: AMD
Features=0xea500800
Mar  9 04:58:51 jpr1 kernel: AMD Features2=0x1f
Mar  9 04:58:51 jpr1 kernel: real memory  = 8589934592 (8192 MB)
Mar  9 04:58:51 jpr1 kernel: avail memory = 8228098048 (7846 MB)
Mar  9 04:58:51 jpr1 kernel: ACPI APIC Table: 
Mar  9 04:58:51 jpr1 kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
Mar  9 04:58:51 jpr1 kernel: FreeBSD/SMP: 1 package(s) x 2 core(s)
Mar  9 04:58:51 jpr1 kernel: cpu0 (BSP): APIC ID:  0
Mar  9 04:58:51 jpr1 kernel: cpu1 (AP): APIC ID:  1
Mar  9 04:58:51 jpr1 kernel: ioapic0: Changing APIC ID to 4
Mar  9 04:58:51 jpr1 kernel: ioapic0  irqs 0-23 on motherboard
Mar  9 04:58:51 jpr1 kernel: kbd1 at kbdmux0
Mar  9 04:58:51 jpr1 kernel: acpi0:  on motherboard
Mar  9 04:58:51 jpr1 kernel: acpi0: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: acpi0: Power Button (fixed)
Mar  9 04:58:51 jpr1 kernel: acpi0: reservation of 0, a (3) failed
Mar  9 04:58:51 jpr1 kernel: acpi0: reservation of 10, cedf (3) failed
Mar  9 04:58:51 jpr1 kernel: Timecounter "ACPI-fast" frequency 3579545
Hz quality 1000
Mar  9 04:58:51 jpr1 kernel: acpi_timer0: <24-bit timer at
3.579545MHz> port 0x4008-0x400b on acpi0
Mar  9 04:58:51 jpr1 kernel: cpu0:  on acpi0
Mar  9 04:58:51 jpr1 kernel: cpu1:  on acpi0
Mar  9 04:58:51 jpr1 kernel: acpi_button0:  on acpi0
Mar  9 04:58:51 jpr1 kernel: pcib0:  port
0xcf8-0xcff on acpi0
Mar  9 04:58:51 jpr1 kernel: pci0:  on pcib0
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.0 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.1 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.2 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.3 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.4 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.5 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.6 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 0.7 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: vgapci0:  mem
0xfc00-0xfcff,0xd000-0xdfff,0xfb00-0xfbff irq
16 at device 5.0 on pci0
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 9.0 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: isab0:  at device 10.0 on pci0
Mar  9 04:58:51 jpr1 kernel: isa0:  on isab0
Mar  9 04:58:51 jpr1 kernel: nfsmb0:  port 0x4c00-0x4c3f,0x4c40-0x4c7f at device 10.1 on pci0
Mar  9 04:58:51 jpr1 kernel: smbus0:  on nfsmb0
Mar  9 04:58:51 jpr1 kernel: smb0:  on smbus0
Mar  9 04:58:51 jpr1 kernel: nfsmb1:  on nfsmb0
Mar  9 04:58:51 jpr1 kernel: smbus1:  on nfsmb1
Mar  9 04:58:51 jpr1 kernel: smb1:  on smbus1
Mar  9 04:58:51 jpr1 kernel: pci0:  at device 10.2 (no
driver attached)
Mar  9 04:58:51 jpr1 kernel: ohci0: 
mem 0xfe02f000-0xfe02 at device 11.0 on pci0
Mar  9 04:58:51 jpr1 kernel: ohci0: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: usbus0:  on ohci0
Mar  9 04:58:51 jpr1 kernel: ehci0:  mem 0xfe02e000-0xfe02e0ff at device 11.1 on pci0
Mar  9 04:58:51 jpr1 kernel: ehci0: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: usbus1: EHCI version 1.0
Mar  9 04:58:51 jpr1 kernel: usbus1:  on ehci0
Mar  9 04:58:51 jpr1 kernel: atapci0:  port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfd00-0xfd0f at
device 13.0 on pci0
Mar  9 04:58:51 jpr1 kernel: ata0:  on atapci0
Mar  9 04:58:51 jpr1 kernel: ata0: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: ata1:  on atapci0
Mar  9 04:58:51 jpr1 kernel: ata1: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: atapci1:  port
0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xf800-0xf80f mem
0xfe02d000-0xfe02dfff irq 20 at device 14.0 on pci0
Mar  9 04:58:51 jpr1 kernel: atapci1: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: ata2:  on atapci1
Mar  9 04:58:51 jpr1 kernel: ata2: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: ata3:  on atapci1
Mar  9 04:58:51 jpr1 kernel: ata3: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: atapci2:  port
0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xf300-0xf30f mem
0xfe02c000-0xfe02cfff irq 21 at device 15.0 on pci0
Mar  9 04:58:51 jpr1 kernel: atapci2: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: ata4:  on atapci2
Mar  9 04:58:51 jpr1 kernel: ata4: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: ata5:  on atapci2
Mar  9 04:58:51 jpr1 kernel: ata5: [ITHREAD]
Mar  9 04:58:51 jpr1 kernel: pcib1:  at device 16.0 on pci0
Mar  9 04:58:51 jp

Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-03 Thread Yanhui Shen
It works fine.
When I grep the dmesg, I can find this message.


-- 
Best regards,
Yanhui
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-03 Thread John Baldwin
On Wednesday, March 02, 2011 7:39:56 pm Yanhui Shen wrote:
> "CPU0: local APIC error 0x40"
> 
> I get this error on my ThinkPad R400 (Intel Core2 T6570).

Do you get a hang or does the machine keep working fine?

-- 
John Baldwin


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-03 Thread Nikolay Denev
On 3 Mar, 2011, at 02:39 , Yanhui Shen wrote:

> "CPU0: local APIC error 0x40"
> 
> I get this error on my ThinkPad R400 (Intel Core2 T6570).
> 
> 
> -- 
> Best regards,
> Yanhui

I saw this error too a few days ago; it was when I plugged a USB keyboard into
an HP EX470 machine.




CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread Yanhui Shen
"CPU0: local APIC error 0x40"

I get this error on my ThinkPad R400 (Intel Core2 T6570).


-- 
Best regards,
Yanhui


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread Jack L.
On Tue, Mar 1, 2011 at 5:50 PM, Mike Tancsa  wrote:
> I had a machine deadlock just now and the only thing on the serial
> console was
>
> CPU0: local APIC error 0x40
> CPU1: local APIC error 0x40
>
> prior to it hanging.  Anyone know what that error is? Googling didn't
> really show anything definitive.  Someone suggested bad hardware? Is there
> a way to narrow that down?

I too get this error all the time on my AMD 6400+ Dual Core with an
ASUS motherboard. I'll post some feedback when it happens again.
The problem is that every time it happens, the system completely locks
up and a power cycle is needed.


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread Mike Tancsa
On 3/2/2011 10:18 AM, John Baldwin wrote:
>>
>> Hi,
>>  No, nothing at all.  I checked the logs again and nothing unusual
>> leading up to it, nor was anything recorded on the serial console other
>> than that error.  Do you think it's just a hardware issue?
> 
> No, was trying to think if there was a scenario where an I/O APIC pin or MSI
> message could specify an illegal vector.
> 
> Can you reproduce this at all?

Not sure. It's the first time I have ever seen this error.  In the past,
the box would crash for other reasons after 2-5 days.  However,
with the fixes from glebius and mlaier, all of that seemed to have been fixed.
The box was up 11 days when it hung with that error. I could not even
break into the debugger. Hence, I was thinking perhaps the hardware is
all of a sudden showing an issue.
The two active NICs are using the legacy interrupts:

interrupt                          total       rate
irq1: atkbd0                           3          0
irq4: uart0                        29974          0
irq6: fdc0                             5          0
irq14: ata0                       125592          2
irq15: ata1                           48          0
irq24: em0                      93380742       1529
irq25: em1                      96506206       1580
cpu0: timer                    117681138       1927
cpu1: timer                    117682130       1927
Total                          425405838       6967

---Mike

> 


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread John Baldwin
On Wednesday, March 02, 2011 8:07:59 am Mike Tancsa wrote:
> On 3/2/2011 7:55 AM, John Baldwin wrote:
> > 
> > Hmm, the interrupt pins on each lapic look fine (they all either have a 
> > legal vector, are using NMI delivery, or are masked).
> > 
> > All of the places that send IPIs have the interrupt vectors hard-coded as 
> > constant values in the code.
> > 
> > Unfortunately there is no register that tells us which illegal vector was 
> > posted.
> > 
> > Were you doing anything related to changing the state of device interrupts 
> > (cpuset -x, kldload, kldunload, etc.) when this happened?
> 
> Hi,
>   No, nothing at all.  I checked the logs again and nothing unusual
> leading up to it, nor was anything recorded on the serial console other
> than that error.  Do you think it's just a hardware issue?

No, was trying to think if there was a scenario where an I/O APIC pin or MSI
message could specify an illegal vector.

Can you reproduce this at all?

-- 
John Baldwin


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread Mike Tancsa
On 3/2/2011 7:55 AM, John Baldwin wrote:
> 
> Hmm, the interrupt pins on each lapic look fine (they all either have a 
> legal vector, are using NMI delivery, or are masked).
> 
> All of the places that send IPIs have the interrupt vectors hard-coded as 
> constant values in the code.
> 
> Unfortunately there is no register that tells us which illegal vector was 
> posted.
> 
> Were you doing anything related to changing the state of device interrupts 
> (cpuset -x, kldload, kldunload, etc.) when this happened?

Hi,
No, nothing at all.  I checked the logs again and nothing unusual
leading up to it, nor was anything recorded on the serial console other
than that error.  Do you think it's just a hardware issue?

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-02 Thread John Baldwin
On Tuesday, March 01, 2011 9:32:58 pm Mike Tancsa wrote:
> On 3/1/2011 9:04 PM, Jeremy Chadwick wrote:
> > On Tue, Mar 01, 2011 at 08:50:17PM -0500, Mike Tancsa wrote:
> >> I had a machine deadlock just now and the only thing on the serial
> >> console was
> >>
> >> CPU0: local APIC error 0x40
> >> CPU1: local APIC error 0x40
> > 
> > The error in question I'm not familiar with, but the code in
> > src/sys/x86/x86/local_apic.c indicates that 0x40 is the contents of the
> > LAPIC ESR (error status register).
> > 
> > Please provide full output from a verbose boot.
> 
> 
> Attached as a .txt file

Hmm, the interrupt pins on each lapic look fine (they all either have a 
legal vector, are using NMI delivery, or are masked).

All of the places that send IPIs have the interrupt vectors hard-coded as 
constant values in the code.

Unfortunately there is no register that tells us which illegal vector was 
posted.

Were you doing anything related to changing the state of device interrupts 
(cpuset -x, kldload, kldunload, etc.) when this happened?
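
For readers following along, the three-way check described above (legal vector, NMI delivery, or masked) can be sketched against the LVT values printed in the verbose boot log. This is only an illustration, not the kernel's code: the field layout (vector in bits 0-7, delivery mode in bits 8-10, mask in bit 16) is taken from the Intel SDM's LVT description, and the names `decode_lvt` and `lvt_looks_fine` are made up for this example.

```python
# Delivery-mode encodings for LVT entries (bits 8-10), per the Intel SDM.
DELIVERY_MODES = {0: "fixed", 2: "SMI", 4: "NMI", 5: "INIT", 7: "ExtINT"}


def decode_lvt(value):
    """Split a 32-bit LVT register value into (vector, delivery mode, masked)."""
    vector = value & 0xFF                # bits 0-7
    mode = (value >> 8) & 0x7            # bits 8-10
    masked = bool(value & (1 << 16))     # bit 16
    return vector, DELIVERY_MODES.get(mode, "reserved"), masked


def lvt_looks_fine(value):
    """Hypothetical helper: masked, NMI delivery, or a vector outside 0-15."""
    vector, mode, masked = decode_lvt(value)
    return masked or mode == "NMI" or vector >= 16


if __name__ == "__main__":
    # Values copied from the "cpu0 BSP:" lines of the verbose boot log.
    for name, value in [("lint0", 0x00010700),
                        ("timer", 0x000100EF),
                        ("pmc", 0x00010400)]:
        print(name, decode_lvt(value), lvt_looks_fine(value))
```

An unmasked, fixed-delivery entry with a vector below 16 is the case that would fail this check; none of the intact values in the posted log do.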

-- 
John Baldwin


Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-01 Thread Jeremy Chadwick
On Tue, Mar 01, 2011 at 09:32:58PM -0500, Mike Tancsa wrote:
> On 3/1/2011 9:04 PM, Jeremy Chadwick wrote:
> > On Tue, Mar 01, 2011 at 08:50:17PM -0500, Mike Tancsa wrote:
> >> I had a machine deadlock just now and the only thing on the serial
> >> console was
> >>
> >> CPU0: local APIC error 0x40
> >> CPU1: local APIC error 0x40
> > 
> > The error in question I'm not familiar with, but the code in
> > src/sys/x86/x86/local_apic.c indicates that 0x40 is the contents of the
> > LAPIC ESR (error status register).
> > 
> > Please provide full output from a verbose boot.
> 
> Attached as a .txt file

Thanks -- this will probably be helpful to other folks, not so much me.
:-)  I lack familiarity with I/O and local APIC configuration.

The error strings in question aren't shown in the attached text file,
strangely enough.  Maybe only visible on VGA console?

Based on what I can find in Intel specifications, bit 6 (0x40) of the
ESR is defined as:

Bit 6: Receive Illegal Vector
Set when the local APIC detects an illegal vector (one in the range 0 to
15) in an interrupt message it receives or in an interrupt generated
locally from the local vector table or via a self IPI. Such interrupts
are not delivered to the processor; the local APIC will never set an
IRR bit in the range 0 to 15.

I got this from Section 10.5.3 of Intel's IA-32 Intel Architecture
Software Developer's Manual, Volume 3A:

http://developer.intel.com/design/processor/manuals/253668.pdf
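
As a rough aid for reading the number in the console message, here is a minimal sketch that maps an ESR value to the names of its set bits. The bit names follow the ESR figure in the same Intel manual as best I can tell (some of the low bits are reserved on newer processors), and `ESR_BITS` and `decode_esr` are illustrative names, not anything from the FreeBSD source.

```python
# Bit position -> meaning for the local APIC Error Status Register,
# following the Intel SDM's ESR layout (assumed here, not read from hardware).
ESR_BITS = {
    0: "Send Checksum Error",
    1: "Receive Checksum Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    4: "Redirectable IPI",
    5: "Send Illegal Vector",
    6: "Receive Illegal Vector",
    7: "Illegal Register Address",
}


def decode_esr(value):
    """Return the names of all ESR bits set in value, lowest bit first."""
    return [name for bit, name in sorted(ESR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # 0x40 is the value from "CPUn: local APIC error 0x40".
    print(decode_esr(0x40))
```

With 0x40 only bit 6 is set, which matches the "Receive Illegal Vector" definition quoted above.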

The motherboard looks like a Supermicro X7SBA or something along those
lines (I can tell from the ACPI string).  A workaround might be to
disable multiprocessor support in the BIOS (specifically Advanced ->
Advanced Processor Options -> Core-Multi-Processing = Disabled).  If
this does work, note that I agree it's not an acceptable permanent
solution.

CC'ing John who might have some ideas about the LAPIC stuff.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-01 Thread Mike Tancsa
On 3/1/2011 9:04 PM, Jeremy Chadwick wrote:
> On Tue, Mar 01, 2011 at 08:50:17PM -0500, Mike Tancsa wrote:
>> I had a machine deadlock just now and the only thing on the serial
>> console was
>>
>> CPU0: local APIC error 0x40
>> CPU1: local APIC error 0x40
> 
> The error in question I'm not familiar with, but the code in
> src/sys/x86/x86/local_apic.c indicates that 0x40 is the contents of the
> LAPIC ESR (error status register).
> 
> Please provide full output from a verbose boot.


Attached as a .txt file

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0af.
Preloaded elf module "/boot/kernel/if_disc.ko" at 0xc0af01b0.
Preloaded elf module "/boot/kernel/coretemp.ko" at 0xc0af025c.
Timecounter "i8254" frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 2128015344 Hz
CPU: Intel(R) Xeon(R) CPU3050  @ 2.13GHz (2128.02-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f6  Family = 6  Model = f  Stepping = 6
  
Features=0xbfebfbff
  Features2=0xe3bd
  AMD Features=0x2010
  AMD Features2=0x1
  TSC: P-state invariant

Instruction TLB: 4 KB Pages, 4-way set associative, 128 entries
2nd-level cache: 2-MB, 8-way set associative, 64-byte line size
1st-level instruction cache: 32 KB, 8-way set associative, 64 byte line size
1st-level data cache: 32 KB, 8-way set associative, 64 byte line size
L2 cache: 2048 kbytes, 8-way associative, 64 bytes/line
real memory  = 4294967296 (4096 MB)
Physical memory chunk(s):
0x1000 - 0x0009bfff, 634880 bytes (155 pages)
0x0010 - 0x003f, 3145728 bytes (768 pages)
0x00c26000 - 0xdbf7, 3677724672 bytes (897882 pages)
avail memory = 3676545024 (3506 MB)
Table 'FACP' at 0xdfee8e51
Table 'MCFG' at 0xdfee8ec5
Table 'APIC' at 0xdfee8f01
APIC: Found table at 0xdfee8f01
MP Configuration Table version 1.4 found at 0xc009d5a1
APIC: Using the MADT enumerator.
MADT: Found CPU APIC ID 0 ACPI ID 0: enabled
SMP: Added CPU 0 (AP)
MADT: Found CPU APIC ID 1 ACPI ID 1: enabled
SMP: Added CPU 1 (AP)
ACPI APIC Table: 
INTR: Adding local APIC 1 as a target
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
bios32: Found BIOS32 Service Directory header at 0xc00f5e90
bios32: Entry = 0xfd470 (c00fd470)  Rev = 0  Len = 1
pcibios: PCI BIOS entry at 0xfd470+0x29e
pnpbios: Found PnP BIOS data at 0xc00f5ee0
pnpbios: Entry = f:b19b  Rev = 1.0
Other BIOS signatures found:
x86bios:   IVT 0x00-0x0004ff at 0xc000
x86bios:  SSEG 0x01-0x01 at 0xc6988000
x86bios:  EBDA 0x09d000-0x09 at 0xc009d000
x86bios:   ROM 0x0a-0x0e at 0xc00a
APIC: CPU 0 has ACPI ID 0
APIC: CPU 1 has ACPI ID 1
ULE: setup cpu 0
ULE: setup cpu 1
ACPI: RSDP 0xf5e40 00014 (v00 PTLTD )
ACPI: RSDT 0xdfee25c4 0003C (v01 PTLTDRSDT   0604  LTP )
ACPI: FACP 0xdfee8e51 00074 (v01 INTEL   0604 PTL  0003)
ACPI: DSDT 0xdfee39ec 05465 (v01  INTEL GLENWOOD 0604 MSFT 010E)
ACPI: FACS 0xdfee9fc0 00040
ACPI: MCFG 0xdfee8ec5 0003C (v01 PTLTDMCFG   0604  LTP )
ACPI: APIC 0xdfee8f01 00074 (v01 PTLTD  ? APIC   0604  LTP )
ACPI: BOOT 0xdfee8f75 00028 (v01 PTLTD  $SBFTBL$ 0604  LTP 0001)
ACPI: ASF! 0xdfee8f9d 00063 (v32   CETP CETP 0604 PTL  0001)
ACPI: SSDT 0xdfee2600 013EC (v01  PmRefCpuPm 3000 INTL 20050228)
MADT: Found IO APIC ID 2, Interrupt 0 at 0xfec0
ioapic0: Routing external 8259A's -> intpin 0
MADT: Found IO APIC ID 3, Interrupt 24 at 0xfec1
MADT: Interrupt override: source 0, irq 2
ioapic0: Routing IRQ 0 -> intpin 2
MADT: Interrupt override: source 9, irq 9
ioapic0: intpin 9 trigger: level
lapic0: Routing NMI -> LINT1
lapic0: LINT1 trigger: edge
lapic0: LINT1 polarity: high
lapic1: Routing NMI -> LINT1
lapic1: LINT1 trigger: edge
lapic1: LINT1 polarity: high
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-47 on motherboard
cpu0 BSP:
 ID: 0x   VER: 0x00050014 LDR: 0x DFR: 0x
  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
  timer: 0x000100ef therm: 0x0200 err: 0x00f0 pmc: 0x00010400
nfslock: pseudo-device
null: 
random: 
io: 
kbd: new array size 4
kbd1 at kbdmux0
mem: 
Pentium Pro MTRR support enabled
acpi0:  on motherboard
PCIe: Memory Mapped configuration base @ 0xf000
pcibios: BIOS version 2.10
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 0 vector 48
acpi0: [MPSAFE]
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: wakeup code va 0xc6982000 pa 0x1000
ACPI timer: 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 -> 10
Timecou

Re: CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-01 Thread Jeremy Chadwick
On Tue, Mar 01, 2011 at 08:50:17PM -0500, Mike Tancsa wrote:
> I had a machine deadlock just now and the only thing on the serial
> console was
> 
> CPU0: local APIC error 0x40
> CPU1: local APIC error 0x40
> 
> prior to it hanging.  Anyone know what that error is? Googling didn't
> really show anything definitive.  Someone suggested bad hardware? Is there
> a way to narrow that down?

The error in question I'm not familiar with, but the code in
src/sys/x86/x86/local_apic.c indicates that 0x40 is the contents of the
LAPIC ESR (error status register).

Please provide full output from a verbose boot.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



CPU0: local APIC error 0x40 CPU1: local APIC error 0x40

2011-03-01 Thread Mike Tancsa
I had a machine deadlock just now and the only thing on the serial
console was

CPU0: local APIC error 0x40
CPU1: local APIC error 0x40

prior to it hanging.  Anyone know what that error is? Googling didn't
really show anything definitive.  Someone suggested bad hardware? Is there
a way to narrow that down?



Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU3050  @ 2.13GHz (2128.01-MHz
686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f6  Family = 6  Model = f  Stepping = 6

Features=0xbfebfbff
  Features2=0xe3bd
  AMD Features=0x2010
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 3676545024 (3506 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-47 on motherboard
kbd1 at kbdmux0
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0:  on acpi0
cpu1:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 1.0 on pci0
pci1:  on pcib1
pcib2:  irq 17 at device 28.0 on pci0
pci9:  on pcib2
pcib3:  at device 0.0 on pci9
pci10:  on pcib3
em0:  port
0x4000-0x403f mem 0xe024-0xe025,0xe020-0xe023 irq 24 at
device 1.0 on pci10
em0: [FILTER]
em0: Ethernet address: 00:1b:21:08:32:a8
em1:  port
0x4040-0x407f mem 0xe026-0xe027,0xe028-0xe02b irq 25 at
device 1.1 on pci10
em1: [FILTER]
em1: Ethernet address: 00:1b:21:08:32:a9
pcib4:  irq 17 at device 28.4 on pci0
pci13:  on pcib4


---Mike
-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/