>> Maybe the other difference in approach between Andy and me is whether to
>> go for a solution that covers all the corner cases, or just make an
>> incremental improvement that allows for recovery in some useful subset of
>> the remaining fatal cases, but still dies in other cases.
>
> Does that mean more core code surgery?

Yes. I need to look at the other user accesses inside
pagefault_disable()/pagefault_enable() regions, since those are likely spots
where the code may continue after a machine check and retry the access. So
expect some more "if (ret == ENXIO) { do something to give up gracefully }"
checks.
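
Roughly the shape I have in mind, as a user-space sketch rather than real
kernel code. Everything named fake_* and fetch_user_data() is made up for
illustration, and I'm assuming the usual negative-errno convention (-ENXIO)
for "the access hit a machine check". Ordinary faults get retried; the
machine check case bails out on the first attempt.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_TRIES 3	/* arbitrary retry limit for this sketch */

/* Hypothetical model of a user buffer; not a kernel structure. */
struct fake_user_buf {
	const char *data;
	size_t len;
	int poisoned;		/* nonzero: access takes a machine check */
	int faults_left;	/* ordinary faults to report before succeeding */
	int accesses;		/* how many times the buffer was touched */
};

/*
 * Hypothetical stand-in for a user access done with page faults disabled.
 * Returns 0 on success, -EFAULT on an ordinary fault, and -ENXIO
 * (assumed convention) when the access hit poisoned memory.
 */
static int fake_copy_from_user(char *dst, struct fake_user_buf *src, size_t n)
{
	src->accesses++;
	if (src->poisoned)
		return -ENXIO;
	if (n > src->len)
		return -EFAULT;
	if (src->faults_left > 0) {
		src->faults_left--;
		return -EFAULT;
	}
	memcpy(dst, src->data, n);
	return 0;
}

/* Caller: retry ordinary faults, but never touch poisoned memory twice. */
static int fetch_user_data(char *dst, struct fake_user_buf *src, size_t n)
{
	int tries, ret = -EFAULT;

	for (tries = 0; tries < MAX_TRIES; tries++) {
		ret = fake_copy_from_user(dst, src, n);
		if (ret == -ENXIO)
			return ret;	/* give up gracefully, no retry */
		if (ret == 0)
			return 0;
		/* ordinary fault: real code would fault the page in here */
	}
	return ret;
}
```

The point is only that the -ENXIO check sits ahead of the retry path, so a
second access to the same poisoned location never happens.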
>> I'm happy to replace error messages with ones that are more descriptive and
>> helpful to humans.
>
> Yap, that: "Multiple copyin" with something more understandable to users...

I'll work on it. We tend not to have essay-length messages as panic() strings,
but I can add a comment in the code there so that anyone who greps for
whatever panic message we choose can get more details on what happened and
what to do.

-Tony