RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> So that "system hang or panic" which the validation folks triggered, > that cannot be reproduced anymore? Did they run the latest version of > the patch? I will get the validation folks to run the latest version (and play around with hyperthreading if they see problems). -Tony

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Borislav Petkov
On Tue, Feb 02, 2021 at 04:04:17PM +, Luck, Tony wrote: > > And the much more important question is, what is the code supposed to > > do when that overflow *actually* happens in real life? Because IINM, > > an overflow condition on the same page would mean killing the task to > > contain the

RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> And the much more important question is, what is the code supposed to > do when that overflow *actually* happens in real life? Because IINM, > an overflow condition on the same page would mean killing the task to > contain the error and not killing the machine... Correct. The cases I've

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Borislav Petkov
On Mon, Feb 01, 2021 at 10:58:12AM -0800, Luck, Tony wrote: > On Thu, Jan 28, 2021 at 06:57:35PM +0100, Borislav Petkov wrote: > > Crazy idea: if you still can reproduce on -rc3, you could bisect: i.e., > > if you apply the patch on -rc3 and it explodes and if you apply the same > > patch on -rc5

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-01 Thread Luck, Tony
On Thu, Jan 28, 2021 at 06:57:35PM +0100, Borislav Petkov wrote: > Crazy idea: if you still can reproduce on -rc3, you could bisect: i.e., > if you apply the patch on -rc3 and it explodes and if you apply the same > patch on -rc5 and it works, then that could be a start... Yeah, don't > have a

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-28 Thread Borislav Petkov
On Tue, Jan 26, 2021 at 02:36:05PM -0800, Luck, Tony wrote: > In some cases Linux might context switch to something else. Perhaps > this task even gets picked up by another CPU to run the task work > queued functions. But I imagine that the context switch should act > as a barrier ... shouldn't

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-27 Thread Luck, Tony
On Tue, Jan 26, 2021 at 12:03:14PM +0100, Borislav Petkov wrote: > On Mon, Jan 25, 2021 at 02:55:09PM -0800, Luck, Tony wrote: > > And now I've changed it back to non-atomic (but keeping the > > slightly cleaner looking code style that I used for the atomic > > version). This one also works for

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-26 Thread Borislav Petkov
On Mon, Jan 25, 2021 at 02:55:09PM -0800, Luck, Tony wrote: > And now I've changed it back to non-atomic (but keeping the > slightly cleaner looking code style that I used for the atomic > version). This one also works for thousands of injections and > recoveries. Maybe take it now before it

[PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-25 Thread Luck, Tony
On Thu, Jan 21, 2021 at 01:09:59PM -0800, Luck, Tony wrote: > On Wed, Jan 20, 2021 at 01:18:12PM +0100, Borislav Petkov wrote: > But, on a whim, I changed the type of mce_count from "int" to "atomic_t" and > fixed up the increment & clear to use atomic_inc_return() and atomic_set(). > See updated