On 11.12.2020 08:02, Jürgen Groß wrote:
> On 10.12.20 21:51, Julien Grall wrote:
>> Hi Jan,
>>
>> On 09/12/2020 14:29, Jan Beulich wrote:
>>> On 09.12.2020 13:11, Julien Grall wrote:
>>>> On 26/11/2020 11:20, Jan Beulich wrote:
>>>>> On 26.11.2020 09:03, Juergen Gross wrote:
>>>>>> When the host crashes it would sometimes be nice to have additional
>>>>>> debug data available which could be produced via debug keys, but
>>>>>> halting the server for manual intervention might be impossible due to
>>>>>> the need to reboot/kexec rather sooner than later.
>>>>>>
>>>>>> Add support for automatic debug key actions in case of crashes which
>>>>>> can be activated via boot- or runtime-parameter.
>>>>>>
>>>>>> Depending on the type of crash the desired data might be different, so
>>>>>> support different settings for the possible types of crashes.
>>>>>>
>>>>>> The parameter is "crash-debug" with the following syntax:
>>>>>>
>>>>>>     crash-debug-<type>=<string>
>>>>>>
>>>>>> with <type> being one of:
>>>>>>
>>>>>>     panic, hwdom, watchdog, kexeccmd, debugkey
>>>>>>
>>>>>> and <string> a sequence of debug key characters with '+' having the
>>>>>> special semantics of a 10 millisecond pause.
>>>>>>
>>>>>> So "crash-debug-watchdog=0+0qr" would result in special output in case
>>>>>> of watchdog triggered crash (dom0 state, 10 ms pause, dom0 state,
>>>>>> domain info, run queues).
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgr...@suse.com>
>>>>>> ---
>>>>>> V2:
>>>>>> - switched special character '.' to '+' (Jan Beulich)
>>>>>> - 10 ms instead of 1 s pause (Jan Beulich)
>>>>>> - added more text to the boot parameter description (Jan Beulich)
>>>>>>
>>>>>> V3:
>>>>>> - added const (Jan Beulich)
>>>>>> - thorough test of crash reason parameter (Jan Beulich)
>>>>>> - kexeccmd case should depend on CONFIG_KEXEC (Jan Beulich)
>>>>>> - added dummy get_irq_regs() helper on Arm
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgr...@suse.com>
>>>>>
>>>>> Except for the Arm aspect, where I'm not sure using
>>>>> guest_cpu_user_regs() is correct in all cases,
>>>>
>>>> I am not entirely sure to understand what get_irq_regs() is supposed to
>>>> returned on x86. Is it the registers saved from the most recent 
>>>> exception?
>>>
>>> An interrupt (not an exception) sets the underlying per-CPU
>>> variable, such that interested parties will know the real
>>> context is not guest or "normal" Xen code, but an IRQ.
>>
>> Thanks for the explanation. I am a bit confused to why we need to give a 
>> regs to handle_keypress() because no-one seems to use it. Do you have an 
>> explanation?
> 
> dump_registers() (key 'd') is using it.
> 
>>
>> To add to the confusion, it looks like that get_irqs_regs() may return 
>> NULL. So sometimes we may pass guest_cpu_regs() (which may contain 
>> garbagge or a set too far).
> 
> I guess this is a best effort approach.

Indeed. If there are ways to make it "more best", we should of
course follow them. (Except before Dom0 starts, I'm afraid I
don't see though where garbage would come from. And even then,
just like for the idle vCPU-s, it shouldn't really be garbage,
or else this suggests missing initialization somewhere.)

>> I guess providing the wrong information to handle_keypress() is not 
>> going to matter that much because no-one use it (?). Although, I'd like 
>> to make sure this is not going to bite us in the future.
> 
> TBH using the 'd' handler isn't making that much sense, as the
> information delivered would be of interest only in case of a panic(),
> which is already printing that information.

I disagree. I've had numerous cases where I found this key very
useful. Or do you mean what you say just for the new purpose
added here?

Jan

Reply via email to