Re: Can't capture vmcore?
On 01/17/18 at 09:31am, Dave Young wrote:
> Don, thanks for ccing me.
>
> On 01/16/18 at 07:47am, Don Zickus wrote:
> > (cc'ing Dave Young)
> >
> > On Tue, Jan 16, 2018 at 07:41:36AM -0500, Josh Boyer wrote:
> > > On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout wrote:
> > > > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > > > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> > > > That is annoying, of course, so I try to be a good citizen and file
> > > > a bug.
> > > >
> > > > For some reason though, I cannot get the core dumped. I get a core
> > > > fine with sysrq, but not with this actual panic. I've followed [1] to
> > > > set up kdump and crash, but every time I trigger the crash and see my
> > > > VM reboot, I see an empty /var/crash afterwards.
> > > >
> > > > As I was able to get the vmcore written to /var/crash in a RHEL7
> > > > guest, I'm starting to suspect a bug, but I'm unsure.
>
> One thing to check is whether the kdump service started successfully
> before the crash, i.e. check /sys/kernel/kexec_crash_loaded.
>
> If you use a self-built kernel, you can use the patch below for testing:
>
> ---
> It is useful to print kdump kernel loaded status in dump_stack()
> especially when panic happens so that we can differentiate
> kdump kernel early hang and a normal panic in a bug report.
>
> Signed-off-by: Dave Young
> ---
>  kernel/printk/printk.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> --- linux-x86.orig/kernel/printk/printk.c
> +++ linux-x86/kernel/printk/printk.c
> @@ -48,6 +48,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l
>          if (dump_stack_arch_desc_str[0] != '\0')
>                  printk("%sHardware name: %s\n",
>                         log_lvl, dump_stack_arch_desc_str);
> +        if (kexec_crash_loaded())
> +                printk("%skdump kernel loaded\n", log_lvl);
>
>          print_worker_info(log_lvl, current);
>  }
>
> > > >
> > > > Any pointers on how to debug this?
> > > >
> > > > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
> > >
> > > Adding the Fedora kernel list.
> > >
> > > Kdump isn't automatically tested in Fedora and while it can work, it
> > > can often be broken as well. There might be someone on the kernel
> > > list that is more familiar with the current state of kdump support in
> > > Fedora, or alternative methods for getting the kernel backtrace.
>
> Yes, since the Fedora kernel is updated frequently, it is no surprise that
> kdump does not work. But it is always good to report a bug against the
> "kexec-tools" component or the "kernel" -> "Kexec/kdump" subcomponent.

Hmm, I noticed that Bugzilla has no such subcomponent for Fedora. In that
case, kdump bugs can be filed against "kexec-tools" so that we are aware of
them.

> > >
> > > josh
> > > ___
> > > kernel mailing list -- ker...@lists.fedoraproject.org
> > > To unsubscribe send an email to kernel-le...@lists.fedoraproject.org
>
> Thanks
> Dave
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Re: Can't capture vmcore?
Don, thanks for ccing me.

On 01/16/18 at 07:47am, Don Zickus wrote:
> (cc'ing Dave Young)
>
> On Tue, Jan 16, 2018 at 07:41:36AM -0500, Josh Boyer wrote:
> > On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout wrote:
> > > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> > > That is annoying, of course, so I try to be a good citizen and file
> > > a bug.
> > >
> > > For some reason though, I cannot get the core dumped. I get a core
> > > fine with sysrq, but not with this actual panic. I've followed [1] to
> > > set up kdump and crash, but every time I trigger the crash and see my
> > > VM reboot, I see an empty /var/crash afterwards.
> > >
> > > As I was able to get the vmcore written to /var/crash in a RHEL7
> > > guest, I'm starting to suspect a bug, but I'm unsure.

One thing to check is whether the kdump service started successfully
before the crash, i.e. check /sys/kernel/kexec_crash_loaded.

If you use a self-built kernel, you can use the patch below for testing:

---
It is useful to print kdump kernel loaded status in dump_stack()
especially when panic happens so that we can differentiate
kdump kernel early hang and a normal panic in a bug report.

Signed-off-by: Dave Young
---
 kernel/printk/printk.c |    3 +++
 1 file changed, 3 insertions(+)

--- linux-x86.orig/kernel/printk/printk.c
+++ linux-x86/kernel/printk/printk.c
@@ -48,6 +48,7 @@
 #include
 #include
 #include
+#include

 #include
 #include
@@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l
         if (dump_stack_arch_desc_str[0] != '\0')
                 printk("%sHardware name: %s\n",
                        log_lvl, dump_stack_arch_desc_str);
+        if (kexec_crash_loaded())
+                printk("%skdump kernel loaded\n", log_lvl);

         print_worker_info(log_lvl, current);
 }

> > >
> > > Any pointers on how to debug this?
> > >
> > > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
> >
> > Adding the Fedora kernel list.
> >
> > Kdump isn't automatically tested in Fedora and while it can work, it
> > can often be broken as well. There might be someone on the kernel
> > list that is more familiar with the current state of kdump support in
> > Fedora, or alternative methods for getting the kernel backtrace.

Yes, since the Fedora kernel is updated frequently, it is no surprise that
kdump does not work. But it is always good to report a bug against the
"kexec-tools" component or the "kernel" -> "Kexec/kdump" subcomponent.

> >
> > josh

Thanks
Dave
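The check suggested in the message above (was a crash kernel actually loaded before the panic?) can be sketched as a small script. This is only a minimal sketch and not part of the thread: the function name crash_kernel_loaded and the decision to treat a missing or unreadable file as "not loaded" are assumptions for illustration. Only the sysfs path comes from the message itself.

```python
from pathlib import Path

# sysfs file named in the message; it holds "1" when a crash kernel is loaded.
KEXEC_CRASH_LOADED = Path("/sys/kernel/kexec_crash_loaded")


def crash_kernel_loaded(path: Path = KEXEC_CRASH_LOADED) -> bool:
    """Return True if the given sysfs file reports a loaded crash kernel."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # Assumption: a missing or unreadable file is reported as "not loaded".
        return False


if __name__ == "__main__":
    if crash_kernel_loaded():
        print("crash kernel loaded: a panic should boot the kdump kernel")
    else:
        print("crash kernel NOT loaded: check the kdump service first")
```

If this reports "NOT loaded" on the affected guest, the empty /var/crash would be explained before the panic even happens, and the kdump service status is the next thing to look at.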
Re: Can't capture vmcore?
2018-01-16 13:41 GMT+01:00 Josh Boyer:
> On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout wrote:
> > I'm getting kernel panics in a VM that functions as a hypervisor, the
> > moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> > That is annoying, of course, so I try to be a good citizen and file
> > a bug.
> >
> > For some reason though, I cannot get the core dumped. I get a core fine
> > with sysrq, but not with this actual panic. I've followed [1] to set up
> > kdump and crash, but every time I trigger the crash and see my VM
> > reboot, I see an empty /var/crash afterwards.
> >
> > As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
> > I'm starting to suspect a bug, but I'm unsure.
> >
> > Any pointers on how to debug this?
> >
> > [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
>
> Adding the Fedora kernel list.
>
> Kdump isn't automatically tested in Fedora and while it can work, it
> can often be broken as well. There might be someone on the kernel
> list that is more familiar with the current state of kdump support in
> Fedora, or alternative methods for getting the kernel backtrace.

I'm also interested in this: I have a reproducible system crash, but I
can't capture the vmcore; the crash kernel only writes the dmesg.
Re: Can't capture vmcore?
On Tue, Jan 9, 2018 at 1:51 PM, Maxim Burgerhout wrote:
> I'm getting kernel panics in a VM that functions as a hypervisor, the
> moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
> That is annoying, of course, so I try to be a good citizen and file
> a bug.
>
> For some reason though, I cannot get the core dumped. I get a core fine
> with sysrq, but not with this actual panic. I've followed [1] to set up
> kdump and crash, but every time I trigger the crash and see my VM
> reboot, I see an empty /var/crash afterwards.
>
> As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
> I'm starting to suspect a bug, but I'm unsure.
>
> Any pointers on how to debug this?
>
> [1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

Adding the Fedora kernel list.

Kdump isn't automatically tested in Fedora and while it can work, it
can often be broken as well. There might be someone on the kernel
list that is more familiar with the current state of kdump support in
Fedora, or alternative methods for getting the kernel backtrace.

josh
Can't capture vmcore?
I'm getting kernel panics in a VM that functions as a hypervisor, the
moment I spin up the nested guest (on AMD ThreadRipper / Fedora 27).
That is annoying, of course, so I try to be a good citizen and file a bug.

For some reason though, I cannot get the core dumped. I get a core fine
with sysrq, but not with this actual panic. I've followed [1] to set up
kdump and crash, but every time I trigger the crash and see my VM reboot,
I see an empty /var/crash afterwards.

As I was able to get the vmcore written to /var/crash in a RHEL7 guest,
I'm starting to suspect a bug, but I'm unsure.

Any pointers on how to debug this?

[1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes