Good day.
Today we've got a very unfunny bug.
A server with multi-bit ECC memory caught a recoverable error. It was logged
to the IPMI SEL (system event log) in hardware and reported to the OS on that
host. Usually such an event is simply printed to dmesg, and the admins
evacuate the host at their leisure and replace the memory (or even ignore a
single error). Those errors are non-fatal.
But instead of a harmless dmesg message we got a panic with a trace at
arch/x86/kernel/cpu/mcheck/mce_dom0.c:39 (convert_log), followed by a
reboot.
The trace is below, but my question is: where should I send bug reports for
the Citrix kernel?
Aug 21 18:07:14 10.1.3.44 [ 7162.416812] ------------[ cut here ]------------
Aug 21 18:07:14 10.1.3.44 [ 7162.420590] WARNING: at arch/x86/kernel/cpu/mcheck/mce_dom0.c:39 convert_log+0x199/0x1b0()
Aug 21 18:07:14 10.1.3.44 [ 7162.421332] Hardware name: X9DRT-HF+
(modules skip)
Aug 21 18:07:14 10.1.3.44 [last unloaded: microcode]
Aug 21 18:07:14 10.1.3.44
Aug 21 18:07:14 10.1.3.44 [ 7162.520280] Pid: 0, comm: swapper Not tainted 2.6.32.43-0.4.1.xs1.6.10.741.170752xen #1
Aug 21 18:07:14 10.1.3.44 [ 7162.521387] Call Trace:
Aug 21 18:07:14 10.1.3.44 [ 7162.522144] [<c0110169>] ? convert_log+0x199/0x1b0
Aug 21 18:07:14 10.1.3.44 [ 7162.522882] [<c01343f1>] warn_slowpath_common+0x81/0xa0
Aug 21 18:07:14 10.1.3.44 [ 7162.524701] [<c0110169>] ? convert_log+0x199/0x1b0
Aug 21 18:07:14 10.1.3.44 [ 7162.525433] [<c013442a>] warn_slowpath_null+0x1a/0x20
Aug 21 18:07:14 10.1.3.44 [ 7162.525813] [<c0110169>] convert_log+0x199/0x1b0
Aug 21 18:07:14 10.1.3.44 [ 7162.526177] [<c0110223>] mce_dom0_interrupt+0xa3/0x120
Aug 21 18:07:14 10.1.3.44 [ 7162.526211] [<c016a7c5>] handle_IRQ_event+0x55/0x180
Aug 21 18:07:14 10.1.3.44 [ 7162.526592] [<c016a7c5>] ? handle_IRQ_event+0x55/0x180
Aug 21 18:07:14 10.1.3.44 [ 7162.528148] [<c016cc4a>] handle_level_irq+0x8a/0x130
Aug 21 18:07:14 10.1.3.44 [ 7162.528547] [<c0105ec9>] handle_irq+0x39/0x60
Aug 21 18:07:14 10.1.3.44 [ 7162.528939] [<c03d9645>] evtchn_do_upcall+0x135/0x326
Aug 21 18:07:14 10.1.3.44 [ 7162.529671] [<c03d2ed5>] ? schedule+0x375/0xae0
Aug 21 18:07:14 10.1.3.44 [ 7162.529703] [<c010477f>] hypervisor_callback+0x43/0x4b
Aug 21 18:07:14 10.1.3.44 [ 7162.532047] [<c0106b05>] ? xen_safe_halt+0xb5/0x150
Aug 21 18:07:14 10.1.3.44 [ 7162.532840] [<c010a6ce>] xen_idle+0x2e/0x80
Aug 21 18:07:14 10.1.3.44 [ 7162.533231] [<c0102acf>] cpu_idle+0x3f/0x70
Aug 21 18:07:14 10.1.3.44 [ 7162.533985] [<c03c29d2>] rest_init+0x62/0x70
Aug 21 18:07:14 10.1.3.44 [ 7162.535448] [<c056bd05>] start_kernel+0x2a5/0x340
Aug 21 18:07:14 10.1.3.44 [ 7162.536189] [<c056b5f0>] ? unknown_bootoption+0x0/0x1f0
Aug 21 18:07:14 10.1.3.44 [ 7162.536914] [<c056b07c>] i386_start_kernel+0x7c/0x90
Aug 21 18:07:14 10.1.3.44 [ 7162.537654] ---[ end trace 76553ff173258821 ]---
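
In case it helps anyone triaging a trace like the one above, here is a
minimal sketch (not part of any Xen or Citrix tooling) that pulls the symbol
names out of the call-trace lines. The names FRAME_RE and parse_frames are my
own, and the sample text is just three frames copied from the trace. Frames
marked with "?" are addresses the kernel found on the stack but could not
confirm as part of the call chain, so the sketch flags them separately.

```python
import re

# Matches one frame such as "[<c0110169>] ? convert_log+0x199/0x1b0".
# \s+ also spans newlines, so frames split by mail wrapping still match.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return a list of (symbol, offset, reliable) tuples found in text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        reliable = m.group("unreliable") is None
        frames.append((m.group("symbol"), int(m.group("offset"), 16), reliable))
    return frames


# Three frames copied from the trace, wrapped the way the mail client left them.
sample = """\
Aug 21 18:07:14 10.1.3.44 [ 7162.525813] [<c0110169>]
convert_log+0x199/0x1b0
Aug 21 18:07:14 10.1.3.44 [ 7162.526177] [<c0110223>]
mce_dom0_interrupt+0xa3/0x120
Aug 21 18:07:14 10.1.3.44 [ 7162.529671] [<c03d2ed5>] ?
schedule+0x375/0xae0
"""

for symbol, offset, reliable in parse_frames(sample):
    print(symbol, hex(offset), "reliable" if reliable else "unreliable (?)")
```

Filtering on the reliable flag leaves only the confirmed chain, which here
runs from the event-channel upcall through mce_dom0_interrupt into
convert_log.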