RE: [PATCH V3 10/11] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers

2022-10-21 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Tuesday, October 4, 2022 10:20 AM
> 
> On 04/10/2022 13:24, Michael Kelley (LINUX) wrote:
> > [...]
> >
> > Tested this patch in combination with Patch 9 in this series.  Verified
> > that both the panic and die paths work correctly with notification to
> > Hyper-V via hyperv_report_panic() or via hv_kmsg_dump().  Hyper-V
> > framebuffer is updated as expected, though I did not reproduce
> > a case where the ring buffer lock is held.  vmbus_initiate_unload() runs
> > as expected.
> >
> > Tested-by: Michael Kelley 
> >
> 
> Thanks a lot for the tests/review Michael!
> 
> Do you think the Hyper-V folks could add both patches to the hv tree? If you
> prefer, I can re-send them individually.
> 

Wei Liu:  Could you pick up Patch 9 and Patch 10 from this series in the
hyperv-next tree?

Michael



RE: [PATCH V3 09/11] video/hyperv_fb: Avoid taking busy spinlock on panic path

2022-10-05 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Friday, August 19, 2022 3:17 PM
> 
> The Hyper-V framebuffer code registers a panic notifier in order
> to try updating its fbdev if the kernel crashed. The notifier
> callback is straightforward, but it eventually calls the
> vmbus_sendpacket() routine, and that function takes a spinlock for
> the ring buffer operations.
> 
> The panic path runs in atomic context, with local interrupts and
> preemption disabled, and all secondary CPUs shut down. Hence,
> taking a spinlock might cause a lockup if a secondary CPU was
> stopped while holding that lock. Fix it here by checking in the
> Hyper-V framebuffer panic notifier whether the ring buffer spinlock
> is busy; if so, bail out to avoid the potential lockup scenario.
> 
> Cc: Andrea Parri (Microsoft) 
> Cc: Dexuan Cui 
> Cc: Haiyang Zhang 
> Cc: "K. Y. Srinivasan" 
> Cc: Michael Kelley 
> Cc: Stephen Hemminger 
> Cc: Tianyu Lan 
> Cc: Wei Liu 
> Tested-by: Fabio A M Martins 
> Signed-off-by: Guilherme G. Piccoli 
> 
> ---
> 
> V3:
> - simplified the code based on Michael's suggestion - thanks!
> 
> V2:
> - new patch, based on the discussion in [0].
> [0] 
> https://lore.kernel.org/lkml/2787b476-6366-1c83-db80-0393da417...@igalia.com/
> 
> 
>  drivers/hv/ring_buffer.c| 13 +
>  drivers/video/fbdev/hyperv_fb.c |  8 +++-
>  include/linux/hyperv.h  |  2 ++
>  3 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 59a4aa86d1f3..c6692fd5ab15 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -280,6 +280,19 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info)
>   ring_info->pkt_buffer_size = 0;
>  }
> 
> +/*
> + * Check if the ring buffer spinlock is available to take or not; used on
> + * atomic contexts, like panic path (see the Hyper-V framebuffer driver).
> + */
> +
> +bool hv_ringbuffer_spinlock_busy(struct vmbus_channel *channel)
> +{
> + struct hv_ring_buffer_info *rinfo = &channel->outbound;
> +
> + return spin_is_locked(&rinfo->ring_lock);
> +}
> +EXPORT_SYMBOL_GPL(hv_ringbuffer_spinlock_busy);
> +
>  /* Write to the ring buffer. */
>  int hv_ringbuffer_write(struct vmbus_channel *channel,
>   const struct kvec *kv_list, u32 kv_count,
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index 886c564787f1..e1b65a01fb96 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -783,12 +783,18 @@ static void hvfb_ondemand_refresh_throttle(struct hvfb_par *par,
>  static int hvfb_on_panic(struct notifier_block *nb,
>unsigned long e, void *p)
>  {
> + struct hv_device *hdev;
>   struct hvfb_par *par;
>   struct fb_info *info;
> 
>   par = container_of(nb, struct hvfb_par, hvfb_panic_nb);
> - par->synchronous_fb = true;
>   info = par->info;
> + hdev = device_to_hv_device(info->device);
> +
> + if (hv_ringbuffer_spinlock_busy(hdev->channel))
> + return NOTIFY_DONE;
> +
> + par->synchronous_fb = true;
>   if (par->need_docopy)
>   hvfb_docopy(par, 0, dio_fb_size);
>   synthvid_update(info, 0, 0, INT_MAX, INT_MAX);
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 3b42264333ef..646f1da9f27e 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1341,6 +1341,8 @@ struct hv_ring_buffer_debug_info {
>  int hv_ringbuffer_get_debuginfo(struct hv_ring_buffer_info *ring_info,
>   struct hv_ring_buffer_debug_info *debug_info);
> 
> +bool hv_ringbuffer_spinlock_busy(struct vmbus_channel *channel);
> +
>  /* Vmbus interface */
>  #define vmbus_driver_register(driver)\
>   __vmbus_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
> --
> 2.37.2

Reviewed-by: Michael Kelley 
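
For reference, a rough usage sketch (not part of the patch; the function and buffer
names below are made up) of how another panic/atomic-path sender could guard a
vmbus_sendpacket() call with the new helper, just as hvfb_on_panic() does in the
hunk above:

	/*
	 * Hypothetical caller: bail out if the outbound ring buffer lock is
	 * busy, since the CPU holding it may already have been stopped by
	 * the panic path.
	 */
	static void example_send_on_panic(struct vmbus_channel *chan,
					  void *buf, u32 len)
	{
		if (hv_ringbuffer_spinlock_busy(chan))
			return;		/* avoid the potential lockup */

		vmbus_sendpacket(chan, buf, len, 0, VM_PKT_DATA_INBAND, 0);
	}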



RE: [PATCH V3 10/11] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers

2022-10-05 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Friday, August 19, 2022 3:18 PM
> 
> Currently Hyper-V guests are among the most relevant users of the panic
> infrastructure, like panic notifiers, kmsg dumpers, etc. The reasons range
> from clean-up procedures (closing the hypervisor <-> guest connection,
> disabling a paravirtualized timer) to data collection (sending panic
> information to the hypervisor) and framebuffer management.
> 
> The thing is: some notifiers are related to others, ordering matters, some
> functionalities are duplicated and there are lots of conditionals behind
> sending panic information to the hypervisor. As part of an effort to
> clean up the panic notifiers mechanism and better document things, we
> hereby address some of the issues/complexities of Hyper-V panic handling
> through the following changes:
> 
> (a) We have die and panic notifiers in vmbus_drv.c, and both aim at
> sending panic information to the hypervisor, though the panic notifier is
> also responsible for a clean-up procedure.
> 
> This commit cleans up the code by splitting the panic notifier in two: one
> closes the vmbus connection, whereas the other only sends panic info to the
> hypervisor. With that, it was possible to merge the die and panic notifiers
> into a single, well-documented function, and to remove some conditional
> complexity around sending such information to the hypervisor.
> 
> (b) There is a Hyper-V framebuffer panic notifier, which relies on doing
> a vmbus operation that demands a valid connection. So we must order this
> notifier against the panic notifier from vmbus_drv.c, to guarantee that the
> framebuffer code executes before the vmbus connection is unloaded.
> 
> Also, this commit removes a useless header.
> 
> Although there is code rework and re-ordering, we expect that this change
> introduces no functional regressions; instead, it should streamline the path
> and increase panic reliability on Hyper-V. This was tested successfully on Hyper-V.
> 
> Cc: Andrea Parri (Microsoft) 
> Cc: Dexuan Cui 
> Cc: Haiyang Zhang 
> Cc: "K. Y. Srinivasan" 
> Cc: Petr Mladek 
> Cc: Stephen Hemminger 
> Cc: Tianyu Lan 
> Cc: Wei Liu 
> Reviewed-by: Michael Kelley 
> Tested-by: Fabio A M Martins 
> Signed-off-by: Guilherme G. Piccoli 
> 
> ---
> 
> V3:
> - Added Michael's review tag - thanks!
> 
> V2:
> - Unfortunately we cannot rely on the crash shutdown (custom) handler
> to perform the vmbus unload - the arm64 architecture doesn't have this
> "feature" [0]. So, in V2 we kept the notifier behavior and always
> unload the vmbus connection, no matter what - thanks Michael for
> pointing that out;
> 
> - Removed the Fixes tags as per Michael's suggestion;
> 
> - As per Petr's suggestion, we abandoned the idea of distinguishing among
> notifiers using an id - so, in V2 we rely on the good old address
> comparison for that. Thanks Petr for the enriching discussion!
> 
> [0] 
> https://lore.kernel.org/lkml/427a8277-49f0-4317-d6c3-4a15d7070...@igalia.com/
> 
> 
>  drivers/hv/vmbus_drv.c  | 109 +++-
>  drivers/video/fbdev/hyperv_fb.c |   8 +++
>  2 files changed, 74 insertions(+), 43 deletions(-)
> 

Tested this patch in combination with Patch 9 in this series.  Verified
that both the panic and die paths work correctly with notification to
Hyper-V via hyperv_report_panic() or via hv_kmsg_dump().  Hyper-V
framebuffer is updated as expected, though I did not reproduce
a case where the ring buffer lock is held.  vmbus_initiate_unload() runs
as expected.

Tested-by: Michael Kelley 

> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 23c680d1a0f5..0a77e2bb0b70 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -25,7 +25,6 @@
>  #include 
> 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -69,53 +68,74 @@ static int hyperv_report_reg(void)
>   return !sysctl_record_panic_msg || !hv_panic_page;
>  }
> 
> -static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
> +/*
> + * The panic notifier below is responsible solely for unloading the
> + * vmbus connection, which is necessary in a panic event.
> + *
> + * Notice an intricate relation of this notifier with Hyper-V
> + * framebuffer panic notifier exists - we need vmbus connection alive
> + * there in order to succeed, so we need to order both with each other
> + * [see hvfb_on_panic()] - this is done using notifiers' priorities.
> + */
> +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val,
> void *args)
> +{
> + vmbus_initiate_unload(true);
> + return NOTIFY_DONE;
> +}
> +static struct notifier_block hyperv_panic_vmbus_unload_block = {
> + .notifier_call  = hv_panic_vmbus_unload,
> + .priority   = INT_MIN + 1, /* almost the latest one to execute */
> +};
> +
> +static int hv_die_panic_notify_crash(struct notifier_block *self,
> +  unsigned long 
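
The hunk above is cut off in the archive; to make the ordering described in the
commit message concrete, here is a rough sketch only (the framebuffer side of the
patch is not shown, so the priority below -- the default of 0 -- is an assumption;
any value greater than INT_MIN + 1 gives the same ordering):

	/*
	 * Atomic notifiers run from highest to lowest priority, so a notifier
	 * left at the default priority (0) executes before the vmbus-unload
	 * notifier above (INT_MIN + 1), i.e. while the VMbus connection is
	 * still alive.
	 */
	static struct notifier_block hvfb_panic_nb_sketch = {
		.notifier_call	= hvfb_on_panic,
		/* .priority left at 0: runs before the INT_MIN + 1 notifier */
	};

	static void hvfb_register_panic_notifier_sketch(void)
	{
		atomic_notifier_chain_register(&panic_notifier_list,
					       &hvfb_panic_nb_sketch);
	}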

RE: [PATCH] Drivers: hv: vmbus: Fix potential crash on module unload

2022-03-19 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Tuesday, March 15, 2022 1:36 PM
> 
> The vmbus driver relies on the panic notifier infrastructure to perform
> some operations when a panic event is detected. Since vmbus can be built
> as a module, the driver must handle both registering and unregistering
> such a panic notifier callback.
> 
> After commit 74347a99e73a ("x86/Hyper-V: Unload vmbus channel in hv panic
> callback")
> though, the panic notifier registration is done unconditionally in the module
> initialization routine whereas the unregistering procedure is conditionally
> guarded and executes only if HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE capability
> is set.
> 
> This patch fixes that by unconditionally unregistering the panic notifier
> in the module's exit routine as well.
> 
> Fixes: 74347a99e73a ("x86/Hyper-V: Unload vmbus channel in hv panic callback")
> Signed-off-by: Guilherme G. Piccoli 
> ---
> 
> 
> Hi folks, thanks in advance for any reviews! This was build-tested
> with Debian config, on 5.17-rc7.
> 
> This patch is a result of code analysis - I didn't experience this
> issue, but it seems a valid/feasible case.
> 
> Also, this is part of an ongoing effort of cleaning up/refactoring the panic
> notifiers; more will be done soon, but I prefer to send the simple bug
> fixes quickly, or else it might take a while, since I expect the next steps
> to be more complex and subject to many iterations.
> 
> Cheers,
> 
> Guilherme
> 
> 
>  drivers/hv/vmbus_drv.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 12a2b37e87f3..12585324cc4a 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2780,10 +2780,15 @@ static void __exit vmbus_exit(void)
>   if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
>   kmsg_dump_unregister(&hv_kmsg_dumper);
>   unregister_die_notifier(&hyperv_die_block);
> - atomic_notifier_chain_unregister(&panic_notifier_list,
> -  &hyperv_panic_block);
>   }
> 
> + /*
> +  * The panic notifier is always registered, hence we should
> +  * also unconditionally unregister it here as well.
> +  */
> + atomic_notifier_chain_unregister(&panic_notifier_list,
> +  &hyperv_panic_block);
> +
>   free_page((unsigned long)hv_panic_page);
>   unregister_sysctl_table(hv_ctl_table_hdr);
>   hv_ctl_table_hdr = NULL;
> --
> 2.35.1

Reviewed-by: Michael Kelley 
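
For context, a minimal sketch of the registration side that makes the unconditional
unregister necessary (based on the commit message above, not copied from vmbus_drv.c):

	/*
	 * The panic notifier is registered at init time regardless of
	 * HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE, so vmbus_exit() must
	 * unregister it unconditionally as well.
	 */
	static int __init vmbus_init_sketch(void)
	{
		atomic_notifier_chain_register(&panic_notifier_list,
					       &hyperv_panic_block);
		return 0;
	}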




RE: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

2022-02-10 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Thursday, February 10, 2022 8:40 AM
> 
> On 08/02/2022 21:31, b...@redhat.com wrote:
> > [...]
> >> So, what are the opinions from kdump maintainers about this idea?
> >> Baoquan / Vivek / Dave, does it make sense to you? Do you have any
> >> suggestions/concerns to add on top of Petr draft?
> >
> > Yeah, it's reasonable. As I replied to Michael in another thread, I
> > think splitting the current notifier list is a good idea. At least the
> > code to achieve Hyper-V's goal with the panic_notifier is a little odd and
> > should be taken out and executed w/o conditionals before kdump, and maybe
> > some others Petr has combed out.
> >
> > For those which will be switched on with the need of adding panic_notifier
> > or panic_print into cmdline, the heavy users like HATAYAMA and Masa can
> > help check.
> >
> > For Petr's draft code, does it mean Hyper-V needs another knob to trigger
> > the needed notifiers? Will you go with the draft directly? Hyper-V now
> > runs panic notifiers by default, just a reminder.
> >
> 
> Hi Baoquan, thanks for your comments.
> 
> I'll need to study the Hyper-V code and how it's done today -

Let me know if you need any assistance or explanation as you look
at the Hyper-V code.

Michael Kelley
Principal SW Engineer
Linux Systems Group
Microsoft Corporation

> I guess
> most of this implementation will be studying the notifiers we currently
> have, splitting them among the 3 new notifiers, and combing them into
> patches, so they can be reviewed by all relevant maintainers (who know
> the code we are changing).
> 
> I'm not sure if I'll go directly with the draft; likely it'll have some
> changes, but the draft should be the skeleton of the new implementation.
> Especially if you/other kdump maintainers agree it's a good idea =)
> 
> Cheers,
> 
> 
> Guilherme
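
For readers following the thread, a rough sketch of the "3 new notifiers" idea being
referenced (the list names below are assumptions for illustration, not Petr's actual
draft):

	/*
	 * Purpose-specific chains would let panic() run each group at a
	 * well-defined point instead of one catch-all panic_notifier_list.
	 */
	ATOMIC_NOTIFIER_HEAD(panic_hypervisor_list);	/* e.g. notify Hyper-V */
	ATOMIC_NOTIFIER_HEAD(panic_info_list);		/* dump extra debug info */
	ATOMIC_NOTIFIER_HEAD(panic_pre_reboot_list);	/* last-resort cleanups */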


RE: [PATCH V3] panic: Move panic_print before kmsg dumpers

2022-02-02 Thread Michael Kelley (LINUX)
From: Baoquan He  Sent: Saturday, January 29, 2022 12:00 AM
> 
> On 01/26/22 at 12:51pm, Petr Mladek wrote:
> > On Mon 2022-01-24 16:57:17, Michael Kelley (LINUX) wrote:
> > > From: Baoquan He  Sent: Friday, January 21, 2022 8:34 PM
> > > >
> > > > On 01/21/22 at 03:00pm, Michael Kelley (LINUX) wrote:
> > > > > From: Baoquan He  Sent: Thursday, January 20, 2022 
> > > > > 6:31 PM
> > > > > >
> > > > > > On 01/20/22 at 06:36pm, Guilherme G. Piccoli wrote:
> > > > > > > Hi Baoquan, some comments inline below:
> > > > > > >
> > > > > > > On 20/01/2022 05:51, Baoquan He wrote:
> > >
> > > [snip]
> > >
> > > > > > > Do you think it should be necessary?
> > > > > > > How about if we allow users to just "panic_print" with or without 
> > > > > > > the
> > > > > > > "crash_kexec_post_notifiers", then we pursue Petr suggestion of
> > > > > > > refactoring the panic notifiers? So, after this future refactor, 
> > > > > > > we
> > > > > > > might have much clearer code.
> > > > > >
> > > > > > I haven't read Petr's reply in another panic notifier filter 
> > > > > > thread. For
> > > > > > panic notifier, it's only enforced to use on HyperV platform, 
> > > > > > except for
> > > > > > that, users need to explicitly add "crash_kexec_post_notifiers=1" 
> > > > > > to enable
> > > > > > it. And we got bug report on the HyperV issue. In our internal 
> > > > > > discussion,
> > > > > > we strongly suggest HyperV dev to change the default enablement, 
> > > > > > instead
> > > > > > leave it to user to decide.
> > > > > >
> > > > >
> > > > > Regarding Hyper-V:   Invoking the Hyper-V notifier prior to running 
> > > > > the
> > > > > kdump kernel is necessary for correctness.  During initial boot of the
> > > > > main kernel, the Hyper-V and VMbus code in Linux sets up several guest
> > > > > physical memory pages that are shared with Hyper-V, and that Hyper-V
> > > > > may write to.   A VMbus connection is also established. Before 
> > > > > kexec'ing
> > > > > into the kdump kernel, the sharing of these pages must be rescinded
> > > > > and the VMbus connection must be terminated.   If this isn't done, the
> > > > > kdump kernel will see strange memory overwrites if these shared guest
> > > > > physical memory pages get used for something else.
> > >
> > > In the Azure cloud, collecting data before crash dumps is a motivation
> > > as well for setting crash_kexec_post_notifiers to true.   That way as
> > > cloud operator we can see broad failure trends, and in specific cases
> > > customers often expect the cloud operator to be able to provide info
> > > about a problem even if they have taken a kdump.  Where did you
> > > envision adding a comment in the code to help clarify these intentions?
> > >
> > > I looked at the code again, and should revise my previous comments
> > > somewhat.   The Hyper-V resets that I described indeed must be done
> > > prior to kexec'ing the kdump kernel.   Most such resets are actually
> > > done via __crash_kexec() -> machine_crash_shutdown(), not via the
> > > panic notifier. However, the Hyper-V panic notifier must terminate the
> > > VMbus connection, because that must be done even if kdump is not
> > > being invoked.  See commit 74347a99e73.
> > >
> > > Most of the hangs seen in getting into the kdump kernel on Hyper-V/Azure
> > > were probably due to the machine_crash_shutdown() path, and not due
> > > to running the panic notifiers prior to kexec'ing the kdump kernel.  The
> > > exception is terminating the VMbus connection, which had problems that
> > > are hopefully now fixed because of adding a timeout.
> >
> > My understanding is that we could split the actions into three groups:
> >
> >   1. Actions that have to be done before kexec'ing the kdump kernel, like
> >  resetting physical memory shared with Hyper-V.
> >
> >  These operation(s) are needed only for kexec and can be done
> >  in kexec.
> >
> >
> >2. Notify Hyper-V so that,

RE: [PATCH V3] panic: Move panic_print before kmsg dumpers

2022-01-28 Thread Michael Kelley (LINUX)
From: Baoquan He  Sent: Friday, January 28, 2022 1:03 AM
> 
> On 01/24/22 at 04:57pm, Michael Kelley (LINUX) wrote:
> > From: Baoquan He  Sent: Friday, January 21, 2022 8:34 PM
> > >
> > > On 01/21/22 at 03:00pm, Michael Kelley (LINUX) wrote:
> > > > From: Baoquan He  Sent: Thursday, January 20, 2022 
> > > > 6:31 PM
> > > > >
> > > > > On 01/20/22 at 06:36pm, Guilherme G. Piccoli wrote:
> > > > > > Hi Baoquan, some comments inline below:
> > > > > >
> > > > > > On 20/01/2022 05:51, Baoquan He wrote:
> >
> > [snip]
> >
> > > > > > Do you think it should be necessary?
> > > > > > How about if we allow users to just "panic_print" with or without 
> > > > > > the
> > > > > > "crash_kexec_post_notifiers", then we pursue Petr suggestion of
> > > > > > refactoring the panic notifiers? So, after this future refactor, we
> > > > > > might have much clearer code.
> > > > >
> > > > > I haven't read Petr's reply in another panic notifier filter thread. 
> > > > > For
> > > > > panic notifier, it's only enforced to use on HyperV platform, except
> > > > > for
> > > > > that, users need to explicitly add "crash_kexec_post_notifiers=1" to 
> > > > > enable
> > > > > it. And we got bug report on the HyperV issue. In our internal 
> > > > > discussion,
> > > > > we strongly suggest HyperV dev to change the default enablement, 
> > > > > instead
> > > > > leave it to user to decide.
> > > > >
> > > >
> > > > Regarding Hyper-V:   Invoking the Hyper-V notifier prior to running the
> > > > kdump kernel is necessary for correctness.  During initial boot of the
> > > > main kernel, the Hyper-V and VMbus code in Linux sets up several guest
> > > > physical memory pages that are shared with Hyper-V, and that Hyper-V
> > > > may write to.   A VMbus connection is also established. Before kexec'ing
> > > > into the kdump kernel, the sharing of these pages must be rescinded
> > > > and the VMbus connection must be terminated.   If this isn't done, the
> > > > kdump kernel will see strange memory overwrites if these shared guest
> > > > physical memory pages get used for something else.
> > > >
> > > > I hope we've found and fixed all the problems where the Hyper-V
> > > > notifier could get hung.  Unfortunately, the Hyper-V interfaces were
> > > > designed long ago without the Linux kexec scenario in mind, and they
> > > > don't provide a simple way to reset everything except by doing a
> > > > reboot that goes back through the virtual BIOS/UEFI.  So the Hyper-V
> > > > notifier code is more complicated than would be desirable, and in
> > > > particular, terminating the VMbus connection is tricky.
> > > >
> > > > This has been an evolving area of understanding.  It's only been the 
> > > > last
> > > > couple of years that we've fully understood the implications of these
> > > > shared memory pages on the kexec/kdump scenario and what it takes
> > > > to reset everything so the kexec'ed kernel will work.
> > >
> > > Glad to know these background details, thx, Michael. While from the
> > > commit which introduced it and the code comment above code, I thought
> > > Hyper-V wants to collect data before crash dump. If this is true,
> > > it might be helpful to add these in commit log or add as code comment,
> > > and also help to defend you when people question it.
> > >
> > > int __init hv_common_init(void)
> > > {
> > > int i;
> > >
> > > /*
> > >  * Hyper-V expects to get crash register data or kmsg when
> > >  * crash enlightment is available and system crashes. Set
> > >  * crash_kexec_post_notifiers to be true to make sure that
> > >  * calling crash enlightment interface before running kdump
> > >  * kernel.
> > >  */
> > > if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
> > > crash_kexec_post_notifiers = true;
> > >
> > >   ..
> > > }
> >
> > In the Azure cloud, collecting data before crash dumps is a motivation

RE: [PATCH V3] panic: Move panic_print before kmsg dumpers

2022-01-24 Thread Michael Kelley (LINUX)
From: Baoquan He  Sent: Friday, January 21, 2022 8:34 PM
> 
> On 01/21/22 at 03:00pm, Michael Kelley (LINUX) wrote:
> > From: Baoquan He  Sent: Thursday, January 20, 2022 6:31 PM
> > >
> > > On 01/20/22 at 06:36pm, Guilherme G. Piccoli wrote:
> > > > Hi Baoquan, some comments inline below:
> > > >
> > > > On 20/01/2022 05:51, Baoquan He wrote:

[snip]

> > > > Do you think it should be necessary?
> > > > How about if we allow users to just "panic_print" with or without the
> > > > "crash_kexec_post_notifiers", then we pursue Petr suggestion of
> > > > refactoring the panic notifiers? So, after this future refactor, we
> > > > might have much clearer code.
> > >
> > > I haven't read Petr's reply in another panic notifier filter thread. For
> > > panic notifier, it's only enforced to use on HyperV platform, except for
> > > that, users need to explicitly add "crash_kexec_post_notifiers=1" to 
> > > enable
> > > it. And we got bug report on the HyperV issue. In our internal discussion,
> > > we strongly suggest HyperV dev to change the default enablement, instead
> > > leave it to user to decide.
> > >
> >
> > Regarding Hyper-V:   Invoking the Hyper-V notifier prior to running the
> > kdump kernel is necessary for correctness.  During initial boot of the
> > main kernel, the Hyper-V and VMbus code in Linux sets up several guest
> > physical memory pages that are shared with Hyper-V, and that Hyper-V
> > may write to.   A VMbus connection is also established. Before kexec'ing
> > into the kdump kernel, the sharing of these pages must be rescinded
> > and the VMbus connection must be terminated.   If this isn't done, the
> > kdump kernel will see strange memory overwrites if these shared guest
> > physical memory pages get used for something else.
> >
> > I hope we've found and fixed all the problems where the Hyper-V
> > notifier could get hung.  Unfortunately, the Hyper-V interfaces were
> > designed long ago without the Linux kexec scenario in mind, and they
> > don't provide a simple way to reset everything except by doing a
> > reboot that goes back through the virtual BIOS/UEFI.  So the Hyper-V
> > notifier code is more complicated than would be desirable, and in
> > particular, terminating the VMbus connection is tricky.
> >
> > This has been an evolving area of understanding.  It's only been the last
> > couple of years that we've fully understood the implications of these
> > shared memory pages on the kexec/kdump scenario and what it takes
> > to reset everything so the kexec'ed kernel will work.
> 
> Glad to know these background details, thx, Michael. While from the
> commit which introduced it and the code comment above code, I thought
> Hyper-V wants to collect data before crash dump. If this is true,
> it might be helpful to add these in commit log or add as code comment,
> and also help to defend you when people question it.
> 
> int __init hv_common_init(void)
> {
> int i;
> 
> /*
>  * Hyper-V expects to get crash register data or kmsg when
>  * crash enlightment is available and system crashes. Set
>  * crash_kexec_post_notifiers to be true to make sure that
>  * calling crash enlightment interface before running kdump
>  * kernel.
>  */
> if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
> crash_kexec_post_notifiers = true;
> 
>   ..
> }

In the Azure cloud, collecting data before crash dumps is a motivation
as well for setting crash_kexec_post_notifiers to true.   That way as
cloud operator we can see broad failure trends, and in specific cases
customers often expect the cloud operator to be able to provide info
about a problem even if they have taken a kdump.  Where did you
envision adding a comment in the code to help clarify these intentions?

I looked at the code again, and should revise my previous comments
somewhat.   The Hyper-V resets that I described indeed must be done
prior to kexec'ing the kdump kernel.   Most such resets are actually
done via __crash_kexec() -> machine_crash_shutdown(), not via the
panic notifier. However, the Hyper-V panic notifier must terminate the
VMbus connection, because that must be done even if kdump is not
being invoked.  See commit 74347a99e73.

Most of the hangs seen in getting into the kdump kernel on Hyper-V/Azure 
were probably due to the machine_crash_shutdown() path, and not due
to running the panic notifiers prior to kexec'ing the kdump kernel.  The
exception is terminating the VMbus connection, which had problems that
are hopefully now fixed because of adding a timeout.

Michael
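
To make the ordering described above concrete, a simplified sketch of the relevant
part of panic() (kernel/panic.c is more involved than this):

	void panic_order_sketch(void)
	{
		bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;

		if (!_crash_kexec_post_notifiers)
			__crash_kexec(NULL);	/* kdump first; notifiers never run */

		/* the Hyper-V notifier that terminates the VMbus connection runs here */
		atomic_notifier_call_chain(&panic_notifier_list, 0, NULL);
		kmsg_dump(KMSG_DUMP_PANIC);

		if (_crash_kexec_post_notifiers)
			__crash_kexec(NULL);	/* kdump only after the notifiers */
	}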



RE: [PATCH V3] panic: Move panic_print before kmsg dumpers

2022-01-21 Thread Michael Kelley (LINUX)
From: Baoquan He  Sent: Thursday, January 20, 2022 6:31 PM
> 
> On 01/20/22 at 06:36pm, Guilherme G. Piccoli wrote:
> > Hi Baoquan, some comments inline below:
> >
> >
> > On 20/01/2022 05:51, Baoquan He wrote:
> > > [...]
> > >> From my POV, the function of panic notifiers is not well defined. They
> > >> do various things, for example:
> > >> [...]
> > > >> They do more than just provide information. Some are risky. It is not
> > >> easy to disable a particular one.
> > >
> > > Yes, agree. Not all of them just provide information.
> > >
> > > Now panic_notifier_filter Guilherme added can help to disable some of
> > > them.
> >
> > So, just for completeness, it's worth mentioning that Petr had some interesting
> > suggestions in the other thread (about the filter) and we may end up not
> > having this implemented - in other words, maybe a refactor of that
> > mechanism is going to be proposed.
> 
> OK, saw that. We can continue discussing that there.
> 
> >
> >
> > > [...]
> > >>
> > >>   + Guilherme uses crash dump only to dump the kernel log. It might
> > >> be more reliable than kmsg_dump. In this case, panic_print_sys_info()
> > >> is the only way to get the extra information.
> > >
> > > Hmm, I haven't fully understood what Guilherme really wants in his recent
> > > post. In this patch he wants to get the panic_print info into pstore. He
> > > also wants to dump the kernel log printed by panic_print in the kdump kernel.
> > > And it's very weird that people try to collect the kernel log via the crash
> > > dump mechanism; that is obviously using a sledgehammer to crack a nut.
> > > Sometimes, we should not add or change code for a too-specific corner
> > > case.
> >
> > OK, I'll try to be really clear, hopefully I can explain the use case in
> > better and simpler words. First of all, I wouldn't call it a corner case
> > - it's just a valid use case that, in my opinion, should be allowed. Why
> > not, right? Kernel shouldn't push policy on users, we should instead let
> > the users decide how to use the tools/options.
> 
> Agreed, sorry about my poor wording.
> 
> >
> > So imagine you cannot collect a vmcore, due to the lack of storage
> > space. Yet, you want as much information as possible to investigate the
> > cause of a panic. The kernel parameter "panic_print" is the perfect fit: we
> > can dump backtraces, the task list, memory info... right on a panic event.
> >
> > But then, how to save this panic log with lots of information after a
> > reboot? There are 2 ways in my understanding:
> >
> > (a) pstore/kmsg_dump()
> > (b) kdump
> >
> > Option (a) is easily the best - we don't need to reserve lots of
> > memory, then boot another kernel, etc. This patch (being hereby
> > discussed) aims to enable the "panic_print" output for this case!
> > But...there are cases in which option (a) cannot work. We need a backend
> > of persistent storage, either a block device or, more commonly, RAM
> > that is persistent across a reboot. What if it's not available?
> >
> > Then we fall back to option (b) - kind of a sledgehammer, in your words heh.
> > It's not ideal, but it might be a last resort for users wanting to collect
> > as much information as they can without saving a full vmcore. And for
> > that, we need to be able to invoke the "panic_print" function before the
> > __crash_kexec() call. Continue below...
> 
> OK, pstore via kmsg_dump is the first option, then fall back to kdump.
> This is what I suggest below, and it is also how the panic notifiers are
> handled below. I think both of them are similar, and thus should be
> handled the same way.
> 
>  void panic()
>  {
>  if (!_crash_kexec_post_notifiers && !panic_print) {
>  __crash_kexec(NULL);
>  smp_send_stop();
>  } else {
>  crash_smp_send_stop();
>  }
> 
>   atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>   panic_print_sys_info(false);
>   kmsg_dump(KMSG_DUMP_PANIC);
>   if (_crash_kexec_post_notifiers || panic_print)
>  __crash_kexec(NULL);
>   ...
>   debug_locks_off();
>  console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> 
>  panic_print_sys_info(true);
>   ..
>  }
> > >
> >
> > So, your idea is good and it mostly works, except it *requires* users to
> > make use of "crash_kexec_post_notifiers" in order to use "panic_print"
> > in the case (b) above discussed.
> 
> I don't get it. Why does it have to *require* users to make use of
> "crash_kexec_post_notifiers" in order to use "panic_print"?
> To enable panic notifiers and panic_print, we need to add the below
> parameters to the kernel cmdline separately.
> 
>   crash_kexec_post_notifiers=1
> panic_print=0x7f
> 
> With the above code, we have:
> 1) Nothing specified in the cmdline, only kdump enabled.
>    Crash dump will work to get a vmcore.
> 2) crash_kexec_post_notifiers=1, kdump enabled
>    panic_notifiers are executed, then crash dump
> 3) panic_print=0x7f, kdump enabled,
>    panic_print gets system info printed, then
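
For reference on the panic_print=0x7f value used above, these are the bit meanings
per the kernel-parameters documentation (the constant names below approximate
kernel/panic.c and are not guaranteed to match it exactly):

	#define PANIC_PRINT_TASK_INFO		0x00000001	/* all tasks info */
	#define PANIC_PRINT_MEM_INFO		0x00000002	/* system memory info */
	#define PANIC_PRINT_TIMER_INFO		0x00000004	/* timer info */
	#define PANIC_PRINT_LOCK_INFO		0x00000008	/* locks (CONFIG_LOCKDEP) */
	#define PANIC_PRINT_FTRACE_INFO		0x00000010	/* ftrace buffer */
	#define PANIC_PRINT_ALL_PRINTK_MSG	0x00000020	/* replay printk buffer */
	#define PANIC_PRINT_ALL_CPU_BT		0x00000040	/* backtrace on all CPUs */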