Re: [PATCH 3/3] x86/pv: Don't use IST for NMI/#MC/#DB in !CONFIG_PV builds

2020-04-24 Thread Jan Beulich
On 23.04.2020 20:49, Andrew Cooper wrote:
> On 21/04/2020 08:48, Jan Beulich wrote:
>> On 20.04.2020 16:59, Andrew Cooper wrote:
>>> --- a/xen/include/asm-x86/processor.h
>>> +++ b/xen/include/asm-x86/processor.h
>>> @@ -441,12 +441,18 @@ struct tss_page {
>>>  };
>>>  DECLARE_PER_CPU(struct tss_page, tss_page);
>>>  
>>> +/*
>>> + * Interrupt Stack Tables.  Used to force a stack switch on a CPL0=>0
>>> + * interrupt/exception.  #DF uses IST all the time to detect stack 
>>> overflows
>>> + * cleanly.  NMI/#MC/#DB only need IST to cover the SYSCALL gap, and 
>>> therefore
>>> + * only necessary with PV guests.
>>> + */
>> Is it really only the SYSCALL gap that we mean to cover? In particular
>> for #MC I'd suspect it is a good idea to switch stacks as well, to get
>> onto a distinct memory page in case the #MC was stack related.
> 
> If #MC occurs on your stack, you have already lost.  The chances of only
> taking a single #MC are tiny because the next-line prefetcher will be
> doing its job (or it hits when the lines (plural) leave L3, which will
> be in guest context at some point in the future.)

It would seem fishy behavior to me for the hardware to continue
issuing prefetches against a page that has just been noticed to
cause issues when accessed. Surely hardware at least _could_
internally mark such a page #UC or some such, to prevent further
prefetches to actually hit the bus.

> The very best you can hope for is to cleanly print something and crash -
> even if you manage to run the handler, you've got no idea if the
> interrupted context had a spinlock held, and Xen has no support for
> changing to a different pcpu stack.

But getting something printed is magnitudes better that hanging
or rebooting entirely silently.

Nevertheless, as long as you extend the description somewhat to
express this reasoning in some way
Reviewed-by: Jan Beulich 
(a little hesitantly though).

Jan



Re: [PATCH 3/3] x86/pv: Don't use IST for NMI/#MC/#DB in !CONFIG_PV builds

2020-04-23 Thread Andrew Cooper
On 21/04/2020 08:48, Jan Beulich wrote:
> On 20.04.2020 16:59, Andrew Cooper wrote:
>> --- a/xen/include/asm-x86/processor.h
>> +++ b/xen/include/asm-x86/processor.h
>> @@ -441,12 +441,18 @@ struct tss_page {
>>  };
>>  DECLARE_PER_CPU(struct tss_page, tss_page);
>>  
>> +/*
>> + * Interrupt Stack Tables.  Used to force a stack switch on a CPL0=>0
>> + * interrupt/exception.  #DF uses IST all the time to detect stack overflows
>> + * cleanly.  NMI/#MC/#DB only need IST to cover the SYSCALL gap, and 
>> therefore
>> + * only necessary with PV guests.
>> + */
> Is it really only the SYSCALL gap that we mean to cover? In particular
> for #MC I'd suspect it is a good idea to switch stacks as well, to get
> onto a distinct memory page in case the #MC was stack related.

If #MC occurs on your stack, you have already lost.  The chances of only
taking a single #MC are tiny because the next-line prefetcher will be
doing its job (or it hits when the lines (plural) leave L3, which will
be in guest context at some point in the future.)

The very best you can hope for is to cleanly print something and crash -
even if you manage to run the handler, you've got no idea if the
interrupted context had a spinlock held, and Xen has no support for
changing to a different pcpu stack.

> With NMI it might as well be better to switch;

Why?  There is no benefit (with no SYSCALL in the picture), and a
downside which causes state loss.

>  I agree we don't need any
> switching for #DB.
>
> I also think that the comment at the top of current.h would want
> updating with these adjustments (which I notice lacks the #DB part
> anyway).

Oops - I totally forgot that for the XSA-260 fix.

~Andrew



Re: [PATCH 3/3] x86/pv: Don't use IST for NMI/#MC/#DB in !CONFIG_PV builds

2020-04-21 Thread Jan Beulich
On 20.04.2020 16:59, Andrew Cooper wrote:
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -441,12 +441,18 @@ struct tss_page {
>  };
>  DECLARE_PER_CPU(struct tss_page, tss_page);
>  
> +/*
> + * Interrupt Stack Tables.  Used to force a stack switch on a CPL0=>0
> + * interrupt/exception.  #DF uses IST all the time to detect stack overflows
> + * cleanly.  NMI/#MC/#DB only need IST to cover the SYSCALL gap, and 
> therefore
> + * only necessary with PV guests.
> + */

Is it really only the SYSCALL gap that we mean to cover? In particular
for #MC I'd suspect it is a good idea to switch stacks as well, to get
onto a distinct memory page in case the #MC was stack related. With
NMI it might as well be better to switch; I agree we don't need any
switching for #DB.

I also think that the comment at the top of current.h would want
updating with these adjustments (which I notice lacks the #DB part
anyway).

Jan