Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-14 Thread Luck, Tony
On Thu, Jan 14, 2021 at 09:22:13PM +0100, Borislav Petkov wrote: > On Mon, Jan 11, 2021 at 01:44:50PM -0800, Tony Luck wrote: > > @@ -1431,8 +1433,11 @@ noinstr void do_machine_check(struct pt_regs *regs) > > mce_panic("Failed kernel mode recovery", &m, > > msg); > >

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-14 Thread Borislav Petkov
On Mon, Jan 11, 2021 at 01:44:50PM -0800, Tony Luck wrote: > @@ -1431,8 +1433,11 @@ noinstr void do_machine_check(struct pt_regs *regs) > mce_panic("Failed kernel mode recovery", &m, > msg); > } > > - if (m.kflags & MCE_IN_KERNEL_COPYIN) >

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Borislav Petkov
On Wed, Jan 13, 2021 at 04:32:54PM +, Luck, Tony wrote: > I'll work on it. We tend not to have essay length messages as panic() > strings. But I can > add a comment in the code there so that people who grep whatever panic message > we choose can get more details on what happened and what to do

RE: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Luck, Tony
>> Maybe the other difference in approach between Andy and me is whether to >> go for a solution that covers all the corner cases, or just make an >> incremental >> improvement that allows for recover in some useful subset of remaining fatal >> cases, but still dies in other cases. > > Does that m

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Borislav Petkov
On Wed, Jan 13, 2021 at 04:06:09PM +, Luck, Tony wrote: > Maybe the other difference in approach between Andy and me is whether to > go for a solution that covers all the corner cases, or just make an > incremental > improvement that allows for recover in some useful subset of remaining fatal

RE: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Luck, Tony
> I think you guys have veered off into the weeds with this. The current > solution - modulo error messages not destined for humans :) - is not soo > bad, considering the whole MCA trainwreck. > > Or am I missing something completely unacceptable? Maybe the other difference in approach between And

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-13 Thread Borislav Petkov
On Tue, Jan 12, 2021 at 08:15:46PM -0800, Andy Lutomirski wrote: > So I’m sort of at a loss as to what we can do. I think you guys have veered off into the weeds with this. The current solution - modulo error messages not destined for humans :) - is not soo bad, considering the whole MCA trainwrec

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Andy Lutomirski
> On Jan 12, 2021, at 5:50 PM, Luck, Tony wrote: > > On Tue, Jan 12, 2021 at 02:04:55PM -0800, Andy Lutomirski wrote: >>> But we know that the fault happend in a get_user() or copy_from_user() call >>> (i.e. an RIP with an extable recovery address). Does context switch >>> access user memory

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 02:04:55PM -0800, Andy Lutomirski wrote: > > But we know that the fault happend in a get_user() or copy_from_user() call > > (i.e. an RIP with an extable recovery address). Does context switch > > access user memory? > > No, but NMI can. > > The case that would be very ve

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Andy Lutomirski
> On Jan 12, 2021, at 12:52 PM, Luck, Tony wrote: > > On Tue, Jan 12, 2021 at 10:57:07AM -0800, Andy Lutomirski wrote: >>> On Tue, Jan 12, 2021 at 10:24 AM Luck, Tony wrote: >>> >>> On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: Well, we need to do *something* when the

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 10:57:07AM -0800, Andy Lutomirski wrote: > On Tue, Jan 12, 2021 at 10:24 AM Luck, Tony wrote: > > > > On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: > > > Well, we need to do *something* when the first __get_user() trips the > > > #MC. It would be nice if

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Andy Lutomirski
On Tue, Jan 12, 2021 at 10:24 AM Luck, Tony wrote: > > On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: > > Well, we need to do *something* when the first __get_user() trips the > > #MC. It would be nice if we could actually fix up the page tables > > inside the #MC handler, but,

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote: > Well, we need to do *something* when the first __get_user() trips the > #MC. It would be nice if we could actually fix up the page tables > inside the #MC handler, but, if we're in a pagefault_disable() context > we might have lock

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Andy Lutomirski
On Tue, Jan 12, 2021 at 9:16 AM Luck, Tony wrote: > > On Tue, Jan 12, 2021 at 09:00:14AM -0800, Andy Lutomirski wrote: > > > On Jan 11, 2021, at 2:21 PM, Luck, Tony wrote: > > > > > > On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: > > >> > > On Jan 11, 2021, at 1:45 PM, To

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Luck, Tony
On Tue, Jan 12, 2021 at 09:00:14AM -0800, Andy Lutomirski wrote: > > On Jan 11, 2021, at 2:21 PM, Luck, Tony wrote: > > > > On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: > >> > On Jan 11, 2021, at 1:45 PM, Tony Luck wrote: > >>> > >>> Recovery action when get_user() trig

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-12 Thread Andy Lutomirski
> On Jan 11, 2021, at 2:21 PM, Luck, Tony wrote: > > On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: >> On Jan 11, 2021, at 1:45 PM, Tony Luck wrote: >>> >>> Recovery action when get_user() triggers a machine check uses the fixup >>> path to make get_user() return -EFAULT.

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-11 Thread Luck, Tony
On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote: > > > On Jan 11, 2021, at 1:45 PM, Tony Luck wrote: > > > > Recovery action when get_user() triggers a machine check uses the fixup > > path to make get_user() return -EFAULT. Also queue_task_work() sets up > > so that kill_me_ma

Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-11 Thread Andy Lutomirski
> On Jan 11, 2021, at 1:45 PM, Tony Luck wrote: > > Recovery action when get_user() triggers a machine check uses the fixup > path to make get_user() return -EFAULT. Also queue_task_work() sets up > so that kill_me_maybe() will be called on return to user mode to send a > SIGBUS to the curren

[PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-11 Thread Tony Luck
Recovery action when get_user() triggers a machine check uses the fixup path to make get_user() return -EFAULT. Also queue_task_work() sets up so that kill_me_maybe() will be called on return to user mode to send a SIGBUS to the current process. But there are places in the kernel where the code a