Re: [PATCH 1/2] x86/kexec: Preserve CR4.MCE during kexec

2023-02-15 Thread Edgecombe, Rick P
On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> TDX guests are not allowed to clear CR4.MCE. Attempt to clear it
> leads
> to #VE.
> 
> Preserve the flag during kexec.
> 
> Signed-off-by: Kirill A. Shutemov 

I wonder whats going on with the pre-existing switching between eax and
rax in this code for the cr0 and cr4 manipulations. Do you know what
the reason is?

Also, for a simple non-tdx kexec regression test:
Tested-by: Rick Edgecombe 
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v4] panic: Fixes the panic_print NMI backtrace setting

2023-02-15 Thread Guilherme G. Piccoli
On 14/02/2023 11:46, Petr Mladek wrote:
> [...]
>> My understanding is that it's a mechanism to prevent some concurrency,
>> in case some other CPU modify this variable while panic() is running.
>> I find it very unlikely, hence I removed it - but if people consider
>> this copy needed, I can respin this patch and keep it, even providing a
>> comment about that, in order to be explict about its need.
> 
> Yes, I think that it makes the behavior consistent even when the
> global variable manipulated in parallel.
> 
> I would personally prefer to keep the local copy. Better safe
> than sorry.
> 

Hi Petr, thanks for your review!
OK, we could keep this local copy, makes sense...even adding a comment,
to make its purpose really clear.


>> [...]
>> @@ -211,9 +211,6 @@ static void panic_print_sys_info(bool console_flush)
>>  return;
>>  }
>>  
>> -if (panic_print & PANIC_PRINT_ALL_CPU_BT)
>> -trigger_all_cpu_backtrace();
>> -
> 
> Sigh, this is yet another PANIC_PRINT_ action that need special
> timing. We should handle both the same way.
> 
> What about the following? The parameter @mask says what
> actions are allowed at the given time.
> < ..code..> 

I think your approach is interesting, it's very "organized".

But I think it's a bit conflicting with that purpose we had on notifiers
refactor, to deprecate "bogus" usages of panic_print, as in
https://lore.kernel.org/lkml/20220427224924.592546-26-gpicc...@igalia.com/ .

So, the idea of my approach is to allow:

(a) Easy removal of panic_print_sys_info() of panic(), once we move it
to a panic notifier;

(b) Better separate and identify the "bogus" cases. The CPU backtrace
one is less a bogus case in my opinion, more a "complicated" one, since
it's related with the CPUs stop routines. But the console flush, as we
discussed, it's clearly something that calls for a new parameter (and
such param was added in the refactor patch).


In the end, I think your approach is interesting but it's kinda like
we're adding the fix to later, on refactor, entirely remove/rework it.
With my approach we wouldn't be calling panic_print_sys_info() again
(3rd time!) on panic(), and also would be more natural to move it later
to a new panic notifier.

What you / others think? If your approach is in the end preferred, it's
fine by me - I'd just ask you to submit as a full patch so we can get it
merged as a fix in 6.3, if possible (and backport it to the 6.1/6.2
stable). Now, if my approach is fine, I can resubmit as a V5 keeping the
local variable - lemme know.

Cheers,


Guilherme

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec