Dear @arch-general readers,

I have been experiencing machine check exceptions with every kernel after package linux-3.11.5-1 (Oct 14 2013). I hope some nice people will be able to assist me, or perhaps point me in a fruitful direction.

First, here is some kernel panic output I was able to snap on May 25th (I have also attached it):

> [19367.116180] Disabling lock debugging due to kernel taint
> [19367.116196] mce: [Hardware Error]: CPU 1: Machine Check Exception: 5 Bank 4: b200000000100402
> [19367.116202] mce: [Hardware Error]: RIP !INEXACT! 33:<00007f8b4934c8b7>
> [19367.116205] mce: [Hardware Error]: TSC 2824672b8e7
> [19367.116211] mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 14010118857 SOCKET 0 APIC 1 microcode 12
> [19367.116213] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [19367.116216] mce: [Hardware Error]: Some CPUs didn't answer in synchronization
> [19367.116218] mce: [Hardware Error]: Machine check: Invalid
> [19367.116220] Kernel panic - not syncing: Fatal machine check on current CPU
> [19368.211815] Shutting down cpus with NMI
> [19368.222834] Kernel Offset: 0x0 from 0xffffffff81000000 0000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> [19368.222942] drm_kms_helper: panic occurred, switching back to text console
> [19368.245774] Rebooting in 30 seconds
> [19398.323579] ACPI RECOVERY or RESET_REG.
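
As a rough aid for reading the Bank 4 value in the panic output, here is a minimal Python sketch that splits it into flag bits and error-code fields. It assumes the architectural IA32_MCi_STATUS layout described in the Intel SDM (VAL, OVER, UC, EN, MISCV, ADDRV and PCC in bits 63 down to 57, a model-specific code in bits 31:16, the MCA error code in bits 15:0). The function name decode_mci_status is made up for illustration, and this is in no way a substitute for running the output through 'mcelog --ascii'.

```python
# Sketch only: the bit positions below are assumed from the architectural
# IA32_MCi_STATUS layout in the Intel SDM. Model-specific bits are extracted
# but not interpreted.
STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit machine check status word into named fields."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    decoded = decode_mci_status(0xB200000000100402)
    print("flags:", ", ".join(decoded["flags"]))
    print("model-specific code: 0x%04x" % decoded["model_specific_code"])
    print("MCA error code:      0x%04x" % decoded["mca_error_code"])
```

For the value above this reports the flags VAL, UC, EN and PCC, with model-specific code 0x0010 and MCA error code 0x0402. What those two codes mean on this particular CPU is left to mcelog.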

Assuming I did a complete system update about 5.4 hours earlier (the panic came 19367 s after boot), this would be output from linux kernel version 3.14.4. The same has happened with every kernel after version 3.11.5; from what I can trace in the `/var/cache/pacman/pkg' directory, this also holds for _at least_ version 3.15.8. I've removed all other archived versions as they took up valuable space.

In dmesg output, my good version 3.11.5-1 identifies itself as:

> Linux version 3.11.5-1-ARCH (tobias@T-POWA-LX) (gcc version 4.8.1 20130725 (prerelease) (GCC)) #1 SMP PREEMPT Mon Oct 14 08:31:43 CEST 2013

The hardware of my system is perhaps relevant. I've got a Samsung NP900X3C-A01SE (Jun. 2012) laptop:

> Intel Core i5-3317U CPU @ 1.70GHz stepping 9 microcode 0x12
> 4GB RAM (3743M/3888M available, 2048M available to graphics)
> SanDisk SSD U100 128GB, 10.01.04, max UDMA/133
> Intel Centrino Advanced-N 6235 AGN, REV=0xB0

I hope this is a good starting point; if more specific hardware information is needed later in this thread, I can add the output of e.g. lspci -vvnn.

--Rasmus

-- 
Rasmus Liland, j...@jrl.dyndns.dk