> Please see below for an updated patch.

Yes. That worked:

[ 78.946069] mce: mce_timed_out: MCE holdout CPUs (may include false positives): 24-47,120-143
[ 78.946151] mce: mce_timed_out: MCE holdout CPUs (may include false positives): 24-47,120-143
[ 78.946153] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler

I guess that more than one CPU hit the timeout and so your new message
was printed twice before the panic code took over?

Once again, the whole of socket 1 is MIA rather than just the pair of
threads on one of the cores there. But that's a useful improvement
(eliminating the other three sockets on this system).

Tested-by: Tony Luck <tony.l...@intel.com>

-Tony