Re: Can't capture vmcore?

2018-01-16 Thread Dave Young
On 01/17/18 at 09:31am, Dave Young wrote:
> Don, thanks for ccing me.
> On 01/16/18 at 07:47am, Don Zickus wrote:
> > (cc'ing Dave Young)
> > 
> > On Tue, Jan 16, 2018 at 07:41:36AM -0500, Josh Boyer wrote:
> > > On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout  wrote:
> > > > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > > > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> > > > That is annoying, of course, so I try to be a good citizen and file a
> > > > bug.
> > > >
> > > > For some reason though, I cannot get the core dumped. I get a core fine
> > > > with sysrq, but not with this actual panic. I've followed [1] to set up
> > > > kdump and crash, but every time I trigger the crash and see my VM
> > > > reboot, I see an empty /var/crash afterwards.
> > > >
> > > > As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
> > > > I'm starting to suspect a bug, but I'm unsure.
> 
> One thing to check is whether the kdump service started successfully before
> the crash, i.e. check that /sys/kernel/kexec_crash_loaded reads 1.
> 
> If you use a self-built kernel, you can apply the patch below for testing:
> 
> ---
> It is useful to print the kdump kernel loaded status in dump_stack(),
> especially when a panic happens, so that we can differentiate between an
> early hang of the kdump kernel and a normal panic in a bug report.
> 
> Signed-off-by: Dave Young 
> ---
>  kernel/printk/printk.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> --- linux-x86.orig/kernel/printk/printk.c
> +++ linux-x86/kernel/printk/printk.c
> @@ -48,6 +48,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l
>  	if (dump_stack_arch_desc_str[0] != '\0')
>  		printk("%sHardware name: %s\n",
>  		       log_lvl, dump_stack_arch_desc_str);
> +	if (kexec_crash_loaded())
> +		printk("%skdump kernel loaded\n", log_lvl);
>  
>  	print_worker_info(log_lvl, current);
>  }
> 
> > > >
> > > > Any pointers on how to debug this?
> > > >
> > > > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
> > > 
> > > Adding the Fedora kernel list.
> > > 
> > > Kdump isn't automatically tested in Fedora and while it can work, it
> > > can often be broken as well.  There might be someone on the kernel
> > > list that is more familiar with the current state of kdump support in
> > > Fedora, or alternative methods for getting the kernel backtrace.
> 
> Yes, since the Fedora kernel is updated frequently, it is no surprise that
> kdump sometimes does not work.  But it is always good to report a bug against
> the "kexec-tools" component or the "kernel" -> "Kexec/kdump" subcomponent.

Hmm, I noticed that Bugzilla has no such subcomponent for Fedora. In that
case, kdump bugs can be filed against "kexec-tools" so that we are aware
of them.

> 
> > > 
> > > josh
> > > ___
> > > kernel mailing list -- ker...@lists.fedoraproject.org
> > > To unsubscribe send an email to kernel-le...@lists.fedoraproject.org
> 
> Thanks
> Dave
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Can't capture vmcore?

2018-01-16 Thread Dave Young
Don, thanks for ccing me.
On 01/16/18 at 07:47am, Don Zickus wrote:
> (cc'ing Dave Young)
> 
> On Tue, Jan 16, 2018 at 07:41:36AM -0500, Josh Boyer wrote:
> > On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout  wrote:
> > > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> > > That is annoying, of course, so I try to be a good citizen and file a
> > > bug.
> > >
> > > For some reason though, I cannot get the core dumped. I get a core fine
> > > with sysrq, but not with this actual panic. I've followed [1] to set up
> > > kdump and crash, but every time I trigger the crash and see my VM
> > > reboot, I see an empty /var/crash afterwards.
> > >
> > > As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
> > > I'm starting to suspect a bug, but I'm unsure.

One thing to check is whether the kdump service started successfully before
the crash, i.e. check that /sys/kernel/kexec_crash_loaded reads 1.
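
The check described here can be scripted. What follows is a minimal sketch, not part of the original message: the sysfs path is the one named above, while the helper names parse_crash_loaded and crash_kernel_loaded are hypothetical. It assumes the file holds "1" when a crash kernel is loaded and "0" otherwise.

```python
from pathlib import Path

# Path named in the message; the helper names below are hypothetical.
KEXEC_CRASH_LOADED = Path("/sys/kernel/kexec_crash_loaded")


def parse_crash_loaded(text: str) -> bool:
    """Interpret the sysfs value: "1" means a crash kernel is loaded."""
    return text.strip() == "1"


def crash_kernel_loaded(path: Path = KEXEC_CRASH_LOADED) -> bool:
    """Return True if the running kernel reports a loaded crash kernel."""
    try:
        return parse_crash_loaded(path.read_text())
    except OSError:
        # File missing or unreadable, e.g. a kernel built without kexec.
        return False


if __name__ == "__main__":
    if crash_kernel_loaded():
        print("crash kernel loaded")
    else:
        print("crash kernel NOT loaded: check the kdump service")
```

Running this before reproducing the panic shows whether an empty /var/crash could simply be explained by kdump never having been armed.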

If you use a self-built kernel, you can apply the patch below for testing:

---
It is useful to print the kdump kernel loaded status in dump_stack(),
especially when a panic happens, so that we can differentiate between an
early hang of the kdump kernel and a normal panic in a bug report.

Signed-off-by: Dave Young 
---
 kernel/printk/printk.c |    3 +++
 1 file changed, 3 insertions(+)

--- linux-x86.orig/kernel/printk/printk.c
+++ linux-x86/kernel/printk/printk.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l
 	if (dump_stack_arch_desc_str[0] != '\0')
 		printk("%sHardware name: %s\n",
 		       log_lvl, dump_stack_arch_desc_str);
+	if (kexec_crash_loaded())
+		printk("%skdump kernel loaded\n", log_lvl);
 
 	print_worker_info(log_lvl, current);
 }
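
On a kernel carrying this patch, the marker line can be used to triage a saved console log. A minimal sketch, assuming the panic output has been captured as text (for example from a serial console): the marker string is the one printed by the patch, while the function names and the verdict wording are hypothetical. On an unpatched kernel the absence of the marker says nothing.

```python
# Marker string printed by the patch; everything else here is hypothetical.
KDUMP_MARKER = "kdump kernel loaded"


def kdump_loaded_at_panic(console_log: str) -> bool:
    """Return True if any log line carries the marker added by the patch."""
    return any(KDUMP_MARKER in line for line in console_log.splitlines())


def triage(console_log: str, vmcore_found: bool) -> str:
    """Suggest where to look next, given the panic log and the dump result."""
    if vmcore_found:
        return "vmcore captured"
    if kdump_loaded_at_panic(console_log):
        # Crash kernel was armed but produced nothing.
        return "kdump kernel loaded: suspect an early hang in the kdump kernel"
    return "kdump kernel not loaded: check the kdump service first"
```

For the situation reported in this thread (panic, reboot, empty /var/crash), one would call triage(log_text, vmcore_found=False) on the captured log.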

> > >
> > > Any pointers on how to debug this?
> > >
> > > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
> > 
> > Adding the Fedora kernel list.
> > 
> > Kdump isn't automatically tested in Fedora and while it can work, it
> > can often be broken as well.  There might be someone on the kernel
> > list that is more familiar with the current state of kdump support in
> > Fedora, or alternative methods for getting the kernel backtrace.

Yes, since the Fedora kernel is updated frequently, it is no surprise that
kdump sometimes does not work.  But it is always good to report a bug against
the "kexec-tools" component or the "kernel" -> "Kexec/kdump" subcomponent.

> > 
> > josh
> > ___
> > kernel mailing list -- ker...@lists.fedoraproject.org
> > To unsubscribe send an email to kernel-le...@lists.fedoraproject.org

Thanks
Dave


Re: Can't capture vmcore?

2018-01-16 Thread Juan Orti Alcaine
2018-01-16 13:41 GMT+01:00 Josh Boyer :

> On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout  wrote:
> > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27). That
> > is annoying, of course, so I try to be a good citizen and file a bug.
> >
> > For some reason though, I cannot get the core dumped. I get a core fine
> > with sysrq, but not with this actual panic. I've followed [1] to set up
> > kdump and crash, but every time I trigger the crash and see my VM reboot,
> > I see an empty /var/crash afterwards.
> >
> > As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
> > I'm starting to suspect a bug, but I'm unsure.
> >
> > Any pointers on how to debug this?
> >
> > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
>
> Adding the Fedora kernel list.
>
> Kdump isn't automatically tested in Fedora and while it can work, it
> can often be broken as well.  There might be someone on the kernel
> list that is more familiar with the current state of kdump support in
> Fedora, or alternative methods for getting the kernel backtrace.
>

I'm also interested in this, because I have a reproducible system crash
but I can't capture the vmcore; the crash kernel only writes the dmesg.
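
To tell an empty /var/crash apart from a dmesg-only dump, the dump directory can be inspected with a short script. This is a rough sketch under stated assumptions: it assumes kdump writes each dump into its own subdirectory of /var/crash, with the memory image named "vmcore" and the log named "vmcore-dmesg.txt". Those names and the helper summarize_crash_dirs are assumptions for illustration and may differ with the local kdump configuration.

```python
from pathlib import Path


def summarize_crash_dirs(root: Path) -> dict:
    """Map each dump subdirectory to 'vmcore', 'dmesg only', or 'empty'."""
    summary = {}
    if not root.is_dir():
        return summary
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue
        names = {p.name for p in entry.iterdir()}
        if "vmcore" in names:               # assumed name of the memory image
            summary[entry.name] = "vmcore"
        elif "vmcore-dmesg.txt" in names:   # assumed name of the saved dmesg
            summary[entry.name] = "dmesg only"
        else:
            summary[entry.name] = "empty"
    return summary


if __name__ == "__main__":
    for name, state in summarize_crash_dirs(Path("/var/crash")).items():
        print(f"{name}: {state}")
```

A "dmesg only" entry would match the symptom described here, while no entries at all would match the original report of an empty /var/crash.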


Re: Can't capture vmcore?

2018-01-16 Thread Josh Boyer
On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout  wrote:
> I'm getting kernel panics in a VM that functions as a hypervisor, the moment
> I spin up the nested guest (on AMD ThreadRipper / Fedora 27). That is
> annoying, of course, so I try to be a good citizen and file a bug.
>
> For some reason though, I cannot get the core dumped. I get a core fine with
> sysrq, but not with this actual panic. I've followed [1] to set up kdump and
> crash, but every time I trigger the crash and see my VM reboot, I see an
> empty /var/crash afterwards.
>
> As I was able to get the vmcore written to /var/crash in a RHEL7 guest, I'm
> starting to suspect a bug, but I'm unsure.
>
> Any pointers on how to debug this?
>
> [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

Adding the Fedora kernel list.

Kdump isn't automatically tested in Fedora and while it can work, it
can often be broken as well.  There might be someone on the kernel
list that is more familiar with the current state of kdump support in
Fedora, or alternative methods for getting the kernel backtrace.

josh


Can't capture vmcore?

2018-01-09 Thread Maxim Burgerhout
I'm getting kernel panics in a VM that functions as a hypervisor, the
moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27). That
is annoying, of course, so I try to be a good citizen and file a bug.

For some reason though, I cannot get the core dumped. I get a core fine
with sysrq, but not with this actual panic. I've followed [1] to set up
kdump and crash, but every time I trigger the crash and see my VM reboot, I
see an empty /var/crash afterwards.

As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
I'm starting to suspect a bug, but I'm unsure.

Any pointers on how to debug this?

[1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes