> So only one socket gets the machine check. So is there still a problem, but
> the fix will be different?
> I think the error inject creates a real machine check, but since each CPU has
> its own memory controller, the machine check may only be sent to the CPU
> where the error happens.

If there is a real machine check, then it must go to all logical CPUs. If it
doesn't get there, then there is a h/w (or possibly f/w configuration)
problem.  Interesting that few others have seen this, perhaps because it only
shows up in a fatal path and the machine is crashing anyway.  A Google search
for the "Some CPUs didn't answer in synchronization" message does have a few
hits that look relevant, but following a few didn't give me enough details on
machine configuration to tell whether they match what you are seeing.

If there are many machines that do this - then we may need a workaround in
Linux code for them.
Who is the manufacturer of the motherboard and/or system you are using?

But the current code that expects to see the machine check on all logical
CPUs is correct (and works as is on other machines that are following the
specification).
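
To illustrate why a machine check that reaches only one socket trips the
synchronization step, here is a minimal user-space sketch of a rendezvous
that expects every logical CPU to check in. It is not the kernel's code: the
function name mce_rendezvous_ok, the got_mce array and the spin budget are
all hypothetical, and the "other CPUs" are modelled as a fixed array rather
than as concurrently running handlers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin budget for this sketch; not a value taken from the kernel. */
#define RENDEZVOUS_SPIN_BUDGET 1000

/*
 * Simplified model of the rendezvous. got_mce[i] is true when logical CPU i
 * actually received the machine check and so entered the handler.
 * Returns true when every CPU checked in, false when the wait timed out.
 */
bool mce_rendezvous_ok(const bool *got_mce, size_t ncpus)
{
    size_t callin = 0;

    assert(ncpus > 0);

    /* Each CPU that entered the handler bumps the call-in count once. */
    for (size_t i = 0; i < ncpus; i++) {
        if (got_mce[i])
            callin++;
    }

    /*
     * A waiting CPU spins until everyone has checked in or the budget is
     * used up. In this single-threaded model the count cannot change while
     * spinning; on real hardware the other CPUs would be incrementing it.
     */
    for (int spins = 0; spins < RENDEZVOUS_SPIN_BUDGET; spins++) {
        if (callin == ncpus)
            return true;
    }

    return false; /* some CPUs never answered */
}
```

With all entries of got_mce set, the rendezvous completes. With only the
CPUs of one socket set, the count never reaches ncpus and the wait times
out, which is the situation the "Some CPUs didn't answer in synchronization"
message reports.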

-Tony

