Hello,

I am currently trying to set up a newly built system with a Skylake 6700K CPU, but I am hitting an extremely reproducible kernel panic every time I connect a monitor to the DisplayPort connector of the Intel integrated graphics chip.

This issue occurs either immediately upon connecting a DisplayPort monitor while the machine is up, or late in the boot process if the monitor is connected at boot time.

The monitor I am using is a Dell U3415W ultrawide and the motherboard is an MSI Z170A Gaming M7.

I am not entirely surprised by the link training errors shown below, as there appear to be various posts from users having problems with this monitor and DisplayPort link training. What surprises me most is that they lead to a kernel panic.

When the panic happens, the kernel prints the following dump to the second, non-DP monitor. Note that this is hand copied, as I have no way to dump the messages anywhere but the display, so pardon any small typos.

[   22.318630] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.365449] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.420272] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.475105] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.529931] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.584759] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.639588] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.649935] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.650532] [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   24.329955] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[   25.345911]  Shutting down cpus with NMI
[   25.356092]  Kernel offset: disabled
[   25.356101]  Rebooting in 30 seconds.

When running kernel 4.2, these errors are occasionally followed by what seems to be a machine check exception (MCE) mentioning a corrupt processor context, which is very hard to note down as it is on the screen only very briefly. However, when running the latest kernel from https://github.com/torvalds/linux, only the above error occurs, not the MCE. I am fairly confident the MCE is spurious, both because of this and because the system otherwise tests out fine.

I apologise if this report is a little sparse on details. It is very hard to debug the system post mortem because of the panic and because I have no serial terminal or hardware debugger available.

Otherwise the system passes memtest86+ flawlessly and is completely stable even under heavy load. This issue seems to occur on every kernel I have tested so far, including a stock Ubuntu 15.04 kernel, a vanilla 4.0.5 kernel, a vanilla 4.2.0 kernel, and the head of https://github.com/torvalds/linux as of a few hours ago.

The kernel config used for the kernel built from git is available here: http://paste2.org/MH9vV4Le
The 4.2 and 4.0.5 configs are extremely similar and differ only in the new entries added by oldconfig.

If there is anything I can do to produce more information, I am more than happy to do so. If this is not the right mailing list for this issue, please let me know where would be better.

Many thanks,
Matthew

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
