kexec/kdump of a kvm guest?

2008-06-26 Thread Mike Snitzer
My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).

When I configure kdump in the guest (running 2.6.22.19) and force a
crash (with 'echo c > /proc/sysrq-trigger') kexec boots the kdump
kernel, but then the kernel hangs (before it gets to /sbin/init et al).
On the host, the associated qemu process is consuming 100% cpu.
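
For reference, my guest-side kdump setup is roughly the following (a
minimal sketch; the kernel/initrd paths, root device and crashkernel
size here are illustrative, not my exact config):

  # 1. reserve memory for the capture kernel on the guest's kernel
  #    command line (grub):  crashkernel=128M@16M
  # 2. load the capture kernel with kexec-tools
  kexec -p /boot/vmlinuz-kdump --initrd=/boot/initrd-kdump.img \
        --append="root=/dev/sda1 irqpoll maxcpus=1"
  # 3. force a crash to exercise the kdump path
  echo 1 > /proc/sys/kernel/sysrq
  echo c > /proc/sysrq-trigger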

I really need to be able to collect vmcores from my kvm guests.  So
far I can't (on raw hardware all works fine).

Any pointers would be appreciated.


Re: kexec/kdump of a kvm guest?

2008-07-05 Thread Avi Kivity

Mike Snitzer wrote:

> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>
> When I configure kdump in the guest (running 2.6.22.19) and force a
> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
> kernel but then the kernel hangs (before it gets to /sbin/init et al).
> On the host, the associated qemu is consuming 100% cpu.
>
> I really need to be able to collect vmcores from my kvm guests.  So
> far I can't (on raw hardware all works fine).


I've tested this a while ago and it worked (though I tested regular 
kexecs, not crashes); this may be a regression.


Please run kvm_stat to see what's happening at the time of the crash.
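
A rough sketch of what I mean (kvm_stat reads the counters kvm exports
under debugfs, so debugfs needs to be mounted on the host; the mount
point below is the usual one):

  mount -t debugfs none /sys/kernel/debug   # if not already mounted
  kvm_stat                                  # watch which exit counters climb while the guest spins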

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: kexec/kdump of a kvm guest?

2008-07-23 Thread Mike Snitzer
On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Mike Snitzer wrote:
>>
>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>>
>> When I configure kdump in the guest (running 2.6.22.19) and force a
>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>> kernel but then the kernel hangs (before it gets to /sbin/init et al).
>>  On the host, the associated qemu is consuming 100% cpu.
>>
>> I really need to be able to collect vmcores from my kvm guests.  So
>> far I can't (on raw hardware all works fine).
>>
>>
>
> I've tested this a while ago and it worked (though I tested regular kexecs,
> not crashes); this may be a regression.
>
> Please run kvm_stat to see what's happening at the time of the crash.

OK, I can look into kvm_stat but I just discovered that just having
kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
the host from being able to kexec/kdump too!?  I didn't have any
guests running (only the kvm modules were loaded).  As soon as I
unloaded the kvm modules kdump worked as expected.

Something about kvm is completely breaking kexec/kdump on both the
host and guest kernels.
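
For what it's worth, the only stopgap I have on the host right now is to
make sure the kvm modules stay unloaded whenever I need kexec/kdump to
work (module names as on my systems; adjust for your distro):

  rmmod kvm-intel kvm        # or: modprobe -r kvm_intel kvm
  # with the modules out, both 'kexec -e' and the kdump crash path behave as expected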

Mike


Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Alexander Graf


On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:


> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <[EMAIL PROTECTED]> wrote:
>> Mike Snitzer wrote:
>>> My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
>>>
>>> When I configure kdump in the guest (running 2.6.22.19) and force a
>>> crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>>> kernel but then the kernel hangs (before it gets to /sbin/init et al).
>>> On the host, the associated qemu is consuming 100% cpu.
>>>
>>> I really need to be able to collect vmcores from my kvm guests.  So
>>> far I can't (on raw hardware all works fine).
>>
>> I've tested this a while ago and it worked (though I tested regular
>> kexecs, not crashes); this may be a regression.
>>
>> Please run kvm_stat to see what's happening at the time of the crash.
>
> OK, I can look into kvm_stat but I just discovered that just having
> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents

Is 2.6.22.19 your host or your guest kernel? It's very unlikely that
you loaded kvm modules in the guest.

> the host from being able to kexec/kdump too!?  I didn't have any
> guests running (only the kvm modules were loaded).  As soon as I
> unloaded the kvm modules kdump worked as expected.
>
> Something about kvm is completely breaking kexec/kdump on both the
> host and guest kernels.

I guess the kexec people would be pretty interested in this as well,
so I'll just CC them for now.
As you're stating that the host kernel breaks with kvm modules loaded,
maybe someone there could give a hint.


Alex


Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Mike Snitzer
On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:
>
> On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:
>
>> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <[EMAIL PROTECTED]> wrote:
>>>
>>> Mike Snitzer wrote:

 My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).

 When I configure kdump in the guest (running 2.6.22.19) and force a
 crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
 kernel but then the kernel hangs (before it gets to /sbin/init et al).
 On the host, the associated qemu is consuming 100% cpu.

 I really need to be able to collect vmcores from my kvm guests.  So
 far I can't (on raw hardware all works fine).


>>>
>>> I've tested this a while ago and it worked (though I tested regular
>>> kexecs,
>>> not crashes); this may be a regression.
>>>
>>> Please run kvm_stat to see what's happening at the time of the crash.
>>
>> OK, I can look into kvm_stat but I just discovered that just having
>> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
>
> Is 2.6.22.19 your host or your guest kernel? It's very unlikely that you
> loaded kvm modules in the guest.

Correct, 2.6.22.19 is my host kernel.

>> the host from being able to kexec/kdump too!?  I didn't have any
>> guests running (only the kvm modules were loaded).  As soon as I
>> unloaded the kvm modules kdump worked as expected.
>>
>> Something about kvm is completely breaking kexec/kdump on both the
>> host and guest kernels.
>
> I guess the kexec people would be pretty interested in this as well, so I'll
> just CC them for now.
> As you're stating that the host kernel breaks with kvm modules loaded, maybe
> someone there could give a hint.

OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
see how kexec/kdump of the host fares when kvm modules are loaded.

On the guest side of things, as I mentioned in my original post,
kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
running 2.6.25.4 (with kvm-70).

Mike


Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Vivek Goyal
On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:
> >
> > On Jul 24, 2008, at 2:13 AM, Mike Snitzer wrote:
> >
> >> On Sat, Jul 5, 2008 at 7:20 AM, Avi Kivity <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Mike Snitzer wrote:
> 
>  My host is x86_64 RHEL5U1 running 2.6.25.4 with kvm-70 (kvm-intel).
> 
>  When I configure kdump in the guest (running 2.6.22.19) and force a
>  crash (with 'echo c > /proc/sysrq-trigger) kexec boots the kdump
>  kernel but then the kernel hangs (before it gets to /sbin/init et al).
>  On the host, the associated qemu is consuming 100% cpu.
> 
>  I really need to be able to collect vmcores from my kvm guests.  So
>  far I can't (on raw hardware all works fine).
> 
> 
> >>>
> >>> I've tested this a while ago and it worked (though I tested regular
> >>> kexecs,
> >>> not crashes); this may be a regression.
> >>>
> >>> Please run kvm_stat to see what's happening at the time of the crash.
> >>
> >> OK, I can look into kvm_stat but I just discovered that just having
> >> kvm-intel and kvm loaded into my 2.6.22.19 kernel actually prevents
> >
> > Is 2.6.22.19 your host or your guest kernel? It's very unlikely that you
> > loaded kvm modules in the guest.
> 
> Correct, 2.6.22.19 is my host kernel.
> 
> >> the host from being able to kexec/kdump too!?  I didn't have any
> >> guests running (only the kvm modules were loaded).  As soon as I
> >> unloaded the kvm modules kdump worked as expected.
> >>
> >> Something about kvm is completely breaking kexec/kdump on both the
> >> host and guest kernels.
> >
> > I guess the kexec people would be pretty interested in this as well, so I'll
> > just CC them for now.
> > As you're stating that the host kernel breaks with kvm modules loaded, maybe
> > someone there could give a hint.
> 
> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> see how kexec/kdump of the host fairs when kvm modules are loaded.
> 
> On the guest side of things, as I mentioned in my original post,
> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> running 2.6.25.4 (with kvm-70).
> 

Hi Mike,

I have never tried kexec/kdump inside a kvm guest. So I don't know if
historically they have been working or not.

Having said that, why do we need kdump to work inside the guest? In this
case qemu already knows about the guest kernel's memory and should be
able to capture a kernel crash dump itself. I am not sure if qemu already
does that; if not, then probably we should think about it.

To me, kdump is a good solution for bare metal but not for a virtualized
environment, where we already have another piece of software running that
can do the job for us. We would end up wasting memory in every guest
(memory reserved for the kdump kernel in each instance).

It will be interesting to look at your results with 2.6.25.x kernels with
the kvm module inserted. Currently I can't think of what could possibly
be wrong.

Thanks
Vivek


Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Mike Snitzer
On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:

>> > As you're stating that the host kernel breaks with kvm modules loaded, 
>> > maybe
>> > someone there could give a hint.
>>
>> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
>> see how kexec/kdump of the host fairs when kvm modules are loaded.
>>
>> On the guest side of things, as I mentioned in my original post,
>> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
>> running 2.6.25.4 (with kvm-70).
>>
>
> Hi Mike,
>
> I have never tried kexec/kdump inside a kvm guest. So I don't know if
> historically they have been working or not.

Avi indicated he seems to remember that at least kexec worked last he
tried (didn't provide when/what he tried though).

> Having said that, Why do we need kdump to work inside the guest? In this
> case qemu should be knowing about the memory of guest kernel and should
> be able to capture a kernel crash dump? I am not sure if qemu already does
> that. If not, then probably we should think about it?
>
> To me, kdump is a good solution for baremetal but not for virtualized
> environment where we already have another piece of software running which
> can do the job for us. We will end up wasting memory in every instance
> of guest (memory reserved for kdump kernel in every guest).

I haven't looked into what mechanics qemu provides for collecting the
entire guest memory image; I'll dig deeper at some point.  It seems
the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
file for analysis) doesn't support saving a kvm guest core:
# virsh dump guest10 guest10.dump
libvir: error : this function is not supported by the hypervisor:
virDomainCoreDump
error: Failed to core dump domain guest10 to guest10.dump

Seems that libvirt functionality isn't available yet with kvm (I'm
using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
libvirt-list to get their insight.

That aside, having the crash dump collection be multi-phased really
isn't workable (that is, if it requires a crashed guest to be manually
saved after the fact).  The host system _could_ be rebooted, thereby
losing the guest's core image.  So automating qemu and/or libvirtd to
trigger a dump would seem worthwhile (maybe it's already done?).
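
One host-side possibility I noticed while poking around: newer qemu
monitors have a pmemsave command that writes a range of guest-physical
memory to a file.  It produces a raw RAM image rather than an ELF vmcore,
so it would still need post-processing, but as a sketch (guest size and
output path are illustrative, here 512MB):

  (qemu) stop
  (qemu) pmemsave 0 0x20000000 /var/tmp/guest10-ram.img
  (qemu) cont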

So while I agree with you that it's ideal not to waste memory in each
guest for the purposes of kdump, if users want to model a guest image as
closely as possible to what will be deployed on bare metal, it really
would be ideal to support a 1:1 functional equivalent with kvm.  I work
with people who refuse to use kvm because of the lack of kexec/kdump
support.

I can do further research but welcome others' insight: do others have
advice on how best to collect a crashed kvm guest's core?

> It will be interesting to look at your results with 2.6.25.x kernels with
> kvm module inserted. Currently I can't think what can possibly be wrong.

If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
loaded kexec/kdump does _not_ work (simply hangs the system).  If I
only have the kvm module loaded kexec/kdump works as expected
(likewise if no kvm modules are loaded at all).  So it would appear
that kvm-intel and kexec are definitely mutually exclusive at the
moment (at least on both 2.6.22.x and 2.6.25.x).

Mike


Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Anthony Liguori

Mike Snitzer wrote:

> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
>> On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>>> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:
>
> I can do further research but welcome others' insight: do others have
> advice on how best to collect a crashed kvm guest's core?

I don't know what you do in libvirt, but you can start a gdbstub in
QEMU, connect with gdb, and then have gdb dump out a core.
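
Roughly (the port and paths are illustrative, and this assumes your gdb
can generate-core-file against a remote target):

  qemu-system-x86_64 -s ...                # -s is shorthand for -gdb tcp::1234
  gdb /path/to/guest/vmlinux
  (gdb) target remote localhost:1234
  (gdb) generate-core-file guest10.core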


Regards,

Anthony Liguori


>> It will be interesting to look at your results with 2.6.25.x kernels with
>> kvm module inserted. Currently I can't think what can possibly be wrong.
>
> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> only have the kvm module loaded kexec/kdump works as expected
> (likewise if no kvm modules are loaded at all).  So it would appear
> that kvm-intel and kexec are definitely mutually exclusive at the
> moment (at least on both 2.6.22.x and 2.6.25.x).
>
> Mike




Re: kexec/kdump of a kvm guest?

2008-07-24 Thread Vivek Goyal
On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:
> 
> >> > As you're stating that the host kernel breaks with kvm modules loaded, 
> >> > maybe
> >> > someone there could give a hint.
> >>
> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> >> see how kexec/kdump of the host fairs when kvm modules are loaded.
> >>
> >> On the guest side of things, as I mentioned in my original post,
> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> >> running 2.6.25.4 (with kvm-70).
> >>
> >
> > Hi Mike,
> >
> > I have never tried kexec/kdump inside a kvm guest. So I don't know if
> > historically they have been working or not.
> 
> Avi indicated he seems to remember that at least kexec worked last he
> tried (didn't provide when/what he tried though).
> 
> > Having said that, Why do we need kdump to work inside the guest? In this
> > case qemu should be knowing about the memory of guest kernel and should
> > be able to capture a kernel crash dump? I am not sure if qemu already does
> > that. If not, then probably we should think about it?
> >
> > To me, kdump is a good solution for baremetal but not for virtualized
> > environment where we already have another piece of software running which
> > can do the job for us. We will end up wasting memory in every instance
> > of guest (memory reserved for kdump kernel in every guest).
> 
> I haven't looked into what mechanics qemu provides for collecting the
> entire guest memory image; I'll dig deeper at some point.  It seems
> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
> file for analysis) doesn't support saving a kvm guest core:
> # virsh dump guest10 guest10.dump
> libvir: error : this function is not supported by the hypervisor:
> virDomainCoreDump
> error: Failed to core dump domain guest10 to guest10.dump
> 
> Seems that libvirt functionality isn't available yet with kvm (I'm
> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
> libvirt-list to get their insight.
> 
> That aside, having the crash dump collection be multi-phased really
> isn't workable (that is if it requires a crashed guest to be manually
> saved after the fact).  The host system _could_ be rebooted; whereby
> losing the guest's core image.  So automating qemu and/or libvirtd to
> trigger a dump would seem worthwhile (maybe its already done?).
> 

That's a good point. Ideally, one would like the dump to be captured
automatically if the kernel crashes, followed by a reboot back to the
production kernel. I am not sure what we can do to let qemu know about
the crash so that it can automatically save the dump.

What happens in the case of xen guests? Is the dump automatically
captured, or does one have to force the dump capture externally?

> So while I agree with you its ideal to not have to waste memory in
> each guest for the purposes of kdump; if users want to model a guest
> image as closely as possible to what will be deployed on bare metal it
> really would be ideal to support a 1:1 functional equivalent with kvm.

Agreed. Making kdump work inside a kvm guest does no harm.

>  I work with people who refuse to use kvm because of the lack of
> kexec/kdump support.
> 

Interesting.

> I can do further research but welcome others' insight: do others have
> advice on how best to collect a crashed kvm guest's core?
> 
> > It will be interesting to look at your results with 2.6.25.x kernels with
> > kvm module inserted. Currently I can't think what can possibly be wrong.
> 
> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> only have the kvm module loaded kexec/kdump works as expected
> (likewise if no kvm modules are loaded at all).  So it would appear
> that kvm-intel and kexec are definitely mutually exclusive at the
> moment (at least on both 2.6.22.x and 2.6.25.x).

OK. So the first task is to fix host kexec/kdump with the kvm-intel
module inserted.

Can you do a little debugging to find out where the system hangs? I
generally try a few things when debugging kexec-related issues.

1. Specify the earlyprintk= parameter for the second kernel and see if
   control is reaching the second kernel.

2. Otherwise specify the --console-serial parameter on the "kexec -l"
   command line; it should display the message "I am in purgatory" on the
   serial console.  That just means control has reached at least as far
   as purgatory.  (An example command line is sketched below.)

3. If that also does not work, then most likely the first kernel itself
   got stuck somewhere and we need to put some printks in the first
   kernel to find out what's wrong.
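
For example (kernel/initrd paths and the root device are illustrative;
use -l instead of -p if you are loading a kernel for a normal kexec
rather than the crash kernel):

   kexec -p /boot/vmlinuz-kdump --initrd=/boot/initrd-kdump.img \
         --console-serial \
         --append="root=/dev/sda1 console=ttyS0,115200 earlyprintk=serial,ttyS0,115200"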


Thanks
Vivek

Re: kexec/kdump of a kvm guest?

2008-07-27 Thread Avi Kivity

Vivek Goyal wrote:

>> Seems that libvirt functionality isn't available yet with kvm (I'm
>> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
>> libvirt-list to get their insight.
>>
>> That aside, having the crash dump collection be multi-phased really
>> isn't workable (that is if it requires a crashed guest to be manually
>> saved after the fact).  The host system _could_ be rebooted; whereby
>> losing the guest's core image.  So automating qemu and/or libvirtd to
>> trigger a dump would seem worthwhile (maybe its already done?).
>
> That's a good point. Ideally, one would like dump to be captured
> automatically if kernel crashes and then reboot back to production
> kernel. I am not sure what can we do to let qemu know after crash
> so that it can automatically save dump.

We can expose a virtual pci device that when accessed, causes qemu to
dump the guest's core.

> Ok. So first task is to fix host kexec/kdump with kvm-intel module
> inserted.
>
> Can you do little debugging to find out where system hangs. I generally
> try few things for kexec related issue debugging.
>
> 1. Specify earlyprintk= parameter for second kernel and see if control
>    is reaching to second kernel.
>
> 2. Otherwise specify --console-serial parameter on "kexec -l" commandline
>    and it should display a message "I am in purgatory" on serial console.
>    This will just mean that control has reached at least till purgatory.
>
> 3. If that also does not work, then most likely first kernel itself got
>    stuck somewhere and we need to put some printks in first kernel to find
>    out what's wrong.

kvm has a reboot notifier to turn off vmx when rebooting.  See
kvm_reboot_notifier and kvm_reboot().  Maybe something similar is needed
for kexec?


--
error compiling committee.c: too many arguments to function



Re: kexec/kdump of a kvm guest?

2008-07-27 Thread Avi Kivity

Mike Snitzer wrote:

> Avi indicated he seems to remember that at least kexec worked last he
> tried (didn't provide when/what he tried though).

kexec inside a guest.  Months ago.


--
error compiling committee.c: too many arguments to function



Re: kexec/kdump of a kvm guest?

2008-08-25 Thread Mike Snitzer
On Thu, Jul 24, 2008 at 9:12 PM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
>> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
>> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
>> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> wrote:
>>
>> >> > As you're stating that the host kernel breaks with kvm modules loaded, 
>> >> > maybe
>> >> > someone there could give a hint.
>> >>
>> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
>> >> see how kexec/kdump of the host fairs when kvm modules are loaded.
>> >>
>> >> On the guest side of things, as I mentioned in my original post,
>> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
>> >> running 2.6.25.4 (with kvm-70).
>> >>
>> >
>> > Hi Mike,
>> >
>> > I have never tried kexec/kdump inside a kvm guest. So I don't know if
>> > historically they have been working or not.
>>
>> Avi indicated he seems to remember that at least kexec worked last he
>> tried (didn't provide when/what he tried though).
>>
>> > Having said that, Why do we need kdump to work inside the guest? In this
>> > case qemu should be knowing about the memory of guest kernel and should
>> > be able to capture a kernel crash dump? I am not sure if qemu already does
>> > that. If not, then probably we should think about it?
>> >
>> > To me, kdump is a good solution for baremetal but not for virtualized
>> > environment where we already have another piece of software running which
>> > can do the job for us. We will end up wasting memory in every instance
>> > of guest (memory reserved for kdump kernel in every guest).
>>
>> I haven't looked into what mechanics qemu provides for collecting the
>> entire guest memory image; I'll dig deeper at some point.  It seems
>> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
>> file for analysis) doesn't support saving a kvm guest core:
>> # virsh dump guest10 guest10.dump
>> libvir: error : this function is not supported by the hypervisor:
>> virDomainCoreDump
>> error: Failed to core dump domain guest10 to guest10.dump
>>
>> Seems that libvirt functionality isn't available yet with kvm (I'm
>> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
>> libvirt-list to get their insight.
>>
>> That aside, having the crash dump collection be multi-phased really
>> isn't workable (that is if it requires a crashed guest to be manually
>> saved after the fact).  The host system _could_ be rebooted; whereby
>> losing the guest's core image.  So automating qemu and/or libvirtd to
>> trigger a dump would seem worthwhile (maybe its already done?).
>>
>
> That's a good point. Ideally, one would like dump to be captured
> automatically if kernel crashes and then reboot back to production
> kernel. I am not sure what can we do to let qemu know after crash
> so that it can automatically save dump.
>
> What happens in the case of xen guests. Is dump automatically captured
> or one has to force the dump capture externally.
>
>> So while I agree with you its ideal to not have to waste memory in
>> each guest for the purposes of kdump; if users want to model a guest
>> image as closely as possible to what will be deployed on bare metal it
>> really would be ideal to support a 1:1 functional equivalent with kvm.
>
> Agreed. Making kdump work inside kvm guest does not harm.
>
>>  I work with people who refuse to use kvm because of the lack of
>> kexec/kdump support.
>>
>
> Interesting.
>
>> I can do further research but welcome others' insight: do others have
>> advice on how best to collect a crashed kvm guest's core?
>>
>> > It will be interesting to look at your results with 2.6.25.x kernels with
>> > kvm module inserted. Currently I can't think what can possibly be wrong.
>>
>> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
>> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
>> only have the kvm module loaded kexec/kdump works as expected
>> (likewise if no kvm modules are loaded at all).  So it would appear
>> that kvm-intel and kexec are definitely mutually exclusive at the
>> moment (at least on both 2.6.22.x and 2.6.25.x).
>
> Ok. So first task is to fix host kexec/kdump with kvm-intel module
> inserted.
>
> Can you do little debugging to find out where system hangs. I generally
> try few things for kexec related issue debugging.
>
> 1. Specify earlyprintk= parameter for second kernel and see if control
>   is reaching to second kernel.
>
> 2. Otherwise specify --console-serial parameter on "kexec -l" commandline
>   and it should display a message "I am in purgatory" on serial console.
>   This will just mean that control has reached at least till purgatory.
>
> 3. If that also does not work, then most likely first kernel itself got
>   stuck somewhere and we need to put some printks in first kernel to find
>   out what's wrong.

Vivek,

Re: kexec/kdump of a kvm guest?

2008-08-25 Thread Vivek Goyal
On Mon, Aug 25, 2008 at 11:56:11AM -0400, Mike Snitzer wrote:
> On Thu, Jul 24, 2008 at 9:12 PM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> > On Thu, Jul 24, 2008 at 03:03:33PM -0400, Mike Snitzer wrote:
> >> On Thu, Jul 24, 2008 at 9:15 AM, Vivek Goyal <[EMAIL PROTECTED]> wrote:
> >> > On Thu, Jul 24, 2008 at 07:49:59AM -0400, Mike Snitzer wrote:
> >> >> On Thu, Jul 24, 2008 at 4:39 AM, Alexander Graf <[EMAIL PROTECTED]> 
> >> >> wrote:
> >>
> >> >> > As you're stating that the host kernel breaks with kvm modules 
> >> >> > loaded, maybe
> >> >> > someone there could give a hint.
> >> >>
> >> >> OK, I can try using a newer kernel on the host too (e.g. 2.6.25.x) to
> >> >> see how kexec/kdump of the host fairs when kvm modules are loaded.
> >> >>
> >> >> On the guest side of things, as I mentioned in my original post,
> >> >> kexec/kdump wouldn't work within a 2.6.22.19 guest with the host
> >> >> running 2.6.25.4 (with kvm-70).
> >> >>
> >> >
> >> > Hi Mike,
> >> >
> >> > I have never tried kexec/kdump inside a kvm guest. So I don't know if
> >> > historically they have been working or not.
> >>
> >> Avi indicated he seems to remember that at least kexec worked last he
> >> tried (didn't provide when/what he tried though).
> >>
> >> > Having said that, Why do we need kdump to work inside the guest? In this
> >> > case qemu should be knowing about the memory of guest kernel and should
> >> > be able to capture a kernel crash dump? I am not sure if qemu already 
> >> > does
> >> > that. If not, then probably we should think about it?
> >> >
> >> > To me, kdump is a good solution for baremetal but not for virtualized
> >> > environment where we already have another piece of software running which
> >> > can do the job for us. We will end up wasting memory in every instance
> >> > of guest (memory reserved for kdump kernel in every guest).
> >>
> >> I haven't looked into what mechanics qemu provides for collecting the
> >> entire guest memory image; I'll dig deeper at some point.  It seems
> >> the libvirt mid-layer ("virsh dump" - dump the core of a domain to a
> >> file for analysis) doesn't support saving a kvm guest core:
> >> # virsh dump guest10 guest10.dump
> >> libvir: error : this function is not supported by the hypervisor:
> >> virDomainCoreDump
> >> error: Failed to core dump domain guest10 to guest10.dump
> >>
> >> Seems that libvirt functionality isn't available yet with kvm (I'm
> >> using libvirt 0.4.2, I'll give libvirt 0.4.4 a try).  cc'ing the
> >> libvirt-list to get their insight.
> >>
> >> That aside, having the crash dump collection be multi-phased really
> >> isn't workable (that is if it requires a crashed guest to be manually
> >> saved after the fact).  The host system _could_ be rebooted; whereby
> >> losing the guest's core image.  So automating qemu and/or libvirtd to
> >> trigger a dump would seem worthwhile (maybe its already done?).
> >>
> >
> > That's a good point. Ideally, one would like dump to be captured
> > automatically if kernel crashes and then reboot back to production
> > kernel. I am not sure what can we do to let qemu know after crash
> > so that it can automatically save dump.
> >
> > What happens in the case of xen guests. Is dump automatically captured
> > or one has to force the dump capture externally.
> >
> >> So while I agree with you its ideal to not have to waste memory in
> >> each guest for the purposes of kdump; if users want to model a guest
> >> image as closely as possible to what will be deployed on bare metal it
> >> really would be ideal to support a 1:1 functional equivalent with kvm.
> >
> > Agreed. Making kdump work inside kvm guest does not harm.
> >
> >>  I work with people who refuse to use kvm because of the lack of
> >> kexec/kdump support.
> >>
> >
> > Interesting.
> >
> >> I can do further research but welcome others' insight: do others have
> >> advice on how best to collect a crashed kvm guest's core?
> >>
> >> > It will be interesting to look at your results with 2.6.25.x kernels with
> >> > kvm module inserted. Currently I can't think what can possibly be wrong.
> >>
> >> If the host's 2.6.25.4 kernel has both the kvm and kvm-intel modules
> >> loaded kexec/kdump does _not_ work (simply hangs the system).  If I
> >> only have the kvm module loaded kexec/kdump works as expected
> >> (likewise if no kvm modules are loaded at all).  So it would appear
> >> that kvm-intel and kexec are definitely mutually exclusive at the
> >> moment (at least on both 2.6.22.x and 2.6.25.x).
> >
> > Ok. So first task is to fix host kexec/kdump with kvm-intel module
> > inserted.
> >
> > Can you do little debugging to find out where system hangs. I generally
> > try few things for kexec related issue debugging.
> >
> > 1. Specify earlyprintk= parameter for second kernel and see if control
> >   is reaching to second kernel.
> >
> > 2. Otherwise specify --console-serial parameter on "kexec -l" commandline
> >   and it should display a message "I am in purga