Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
Hi, Avi, Yanfei

I'm YOSHIDA Masanori from Hitachi, a developer of livedump.

>> And here is a new case from LinuxCon Japan: developers from Hitachi are now developing a new livedump mechanism for the same reason as ours. They have *many times* come to the situation where guest machines crashed due to the host's failures, in particular during development.

> This has happened to me as well, possibly even more times :). I don't use crash dumps for debugging, but different people may use different techniques.

As Yanfei introduced, I'm developing livedump for cases where guests crash due to the host's failures. Especially in very important systems, there is strong demand to identify the root cause of any failure, even if it is very rare. For this purpose, crash dumps must be obtained. Therefore, we think the livedump technique must be applied to virtualization systems in that kind of area.

>>> After the buggy situation is reproduced, we panic the host *manually*. Then we could use userland tools to get the guest machine's crash dump from the host machine's, with the feature provided by this patch set. Finally, we could analyse them separately to find which side causes the problem.

>> Could you please tell me your attitude towards this patch?

> I still dislike it conceptually. But let me do a technical review of the latest version.

Actually, the current implementation of livedump is just a core part, which dumps only the image of kernel space. I'd like to expand it to obtain the guest image at the same time too. For this situation as well, it seems necessary to export VMCSINFO.

Thanks,
YOSHIDA Masanori
Yokohama Research Laboratory, Hitachi, Ltd.
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 06/11/2012 08:35 AM, Yanfei Zhang wrote:
> Hello Avi,
>
> Sorry about the delay...
>
> On 05/29/2012 15:06, Yanfei Zhang wrote:
>> On 05/28/2012 21:28, Avi Kivity wrote:
>>> On 05/28/2012 08:25 AM, Yanfei Zhang wrote:
>>>> Do you have any comments about this patch set?

>>> I still have a hard time understanding why it is needed. If the host crashes, there is no reason to look at guest state; the host should survive no matter what the guest does.

>> OK. Let me summarize it.
>>
>> 1. Why is this patch needed? (Our requirement)
>>
>> We once came to a buggy situation: a host scheduler bug caused the guest machine's vcpu to stop for a long time, which then led to a heartbeat stop (the host was still running). We want an efficient way to do the bug analysis when we come to a similar situation where the guest machine doesn't work well due to something on the host machine's side. Because we should debug both the host machine's and the guest machine's sides to look for the reasons, we want to get both the host machine's crash dump and the guest machine's crash dump at the same time, while the buggy situation remains.

I would argue that there are two separate bugs here: (1) a host bug which caused the scheduling delay, and (2) putting a heartbeat service on a virtualized guest with no real-time guarantees. But I understand your situation.

>> 2. What will we do?
>>
>> If this bug is found in a customer's environment, we have two ways to avoid affecting other guest machines running on the same host. First, we could do the bug analysis in another environment to reproduce the buggy situation; second, we could migrate the other guest machines to other hosts.

You could also use tracing (there's the latency tracer and the scheduler tracepoints) to debug this on a live system.

>> After the buggy situation is reproduced, we panic the host *manually*. Then we could use userland tools to get the guest machine's crash dump from the host machine's, with the feature provided by this patch set. Finally, we could analyse them separately to find which side causes the problem.

> Could you please tell me your attitude towards this patch?

I still dislike it conceptually. But let me do a technical review of the latest version.

> And here is a new case from LinuxCon Japan: developers from Hitachi are now developing a new livedump mechanism for the same reason as ours. They have *many times* come to the situation where guest machines crashed due to the host's failures, in particular during development.

This has happened to me as well, possibly even more times :). I don't use crash dumps for debugging, but different people may use different techniques.

> So they are developing this mechanism to get a crash dump while retaining the buggy situation between the host and guest machines. The difference between theirs and ours is whether or not the feature is used on a _customer's running machine_.

--
error compiling committee.c: too many arguments to function
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
Hello Avi,

On 05/29/2012 15:06, Yanfei Zhang wrote:
> On 05/28/2012 21:28, Avi Kivity wrote:
>> On 05/28/2012 08:25 AM, Yanfei Zhang wrote:
>>> Do you have any comments about this patch set?

>> I still have a hard time understanding why it is needed. If the host crashes, there is no reason to look at guest state; the host should survive no matter what the guest does.

> OK. Let me summarize it.
>
> 1. Why is this patch needed? (Our requirement)
>
> We once came to a buggy situation: a host scheduler bug caused the guest machine's vcpu to stop for a long time, which then led to a heartbeat stop (the host was still running). We want an efficient way to do the bug analysis when we come to a similar situation where the guest machine doesn't work well due to something on the host machine's side. Because we should debug both the host machine's and the guest machine's sides to look for the reasons, we want to get both the host machine's crash dump and the guest machine's crash dump at the same time, while the buggy situation remains.
>
> 2. What will we do?
>
> If this bug is found in a customer's environment, we have two ways to avoid affecting other guest machines running on the same host. First, we could do the bug analysis in another environment to reproduce the buggy situation; second, we could migrate the other guest machines to other hosts.
>
> After the buggy situation is reproduced, we panic the host *manually*. Then we could use userland tools to get the guest machine's crash dump from the host machine's, with the feature provided by this patch set. Finally, we could analyse them separately to find which side causes the problem.

Could you please tell me your attitude towards this patch?

And here is a new case from LinuxCon Japan: developers from Hitachi are now developing a new livedump mechanism for the same reason as ours. They have *many times* come to the situation where guest machines crashed due to the host's failures, in particular during development.

So they are developing this mechanism to get a crash dump while retaining the buggy situation between the host and guest machines. The difference between theirs and ours is whether or not the feature is used on a _customer's running machine_.

Thanks
Zhang Yanfei
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/28/2012 21:28, Avi Kivity wrote:
> On 05/28/2012 08:25 AM, Yanfei Zhang wrote:
>> Do you have any comments about this patch set?

> I still have a hard time understanding why it is needed. If the host crashes, there is no reason to look at guest state; the host should survive no matter what the guest does.

OK. Let me summarize it.

1. Why is this patch needed? (Our requirement)

We once came to a buggy situation: a host scheduler bug caused the guest machine's vcpu to stop for a long time, which then led to a heartbeat stop (the host was still running). We want an efficient way to do the bug analysis when we come to a similar situation where the guest machine doesn't work well due to something on the host machine's side. Because we should debug both the host machine's and the guest machine's sides to look for the reasons, we want to get both the host machine's crash dump and the guest machine's crash dump at the same time, while the buggy situation remains.

2. What will we do?

If this bug is found in a customer's environment, we have two ways to avoid affecting other guest machines running on the same host. First, we could do the bug analysis in another environment to reproduce the buggy situation; second, we could migrate the other guest machines to other hosts.

After the buggy situation is reproduced, we panic the host *manually*. Then we could use userland tools to get the guest machine's crash dump from the host machine's, with the feature provided by this patch set. Finally, we could analyse them separately to find which side causes the problem.

Thanks
Zhang Yanfei
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/28/2012 08:25 AM, Yanfei Zhang wrote:
> Do you have any comments about this patch set?

I still have a hard time understanding why it is needed. If the host crashes, there is no reason to look at guest state; the host should survive no matter what the guest does.
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
Hello Avi,

On 05/22/2012 11:40, Yanfei Zhang wrote:
> On 05/21/2012 17:36, Avi Kivity wrote:
>> On 05/21/2012 12:08 PM, Yanfei Zhang wrote:
>>> On 05/21/2012 16:34, Avi Kivity wrote:
>>>> On 05/21/2012 05:32 AM, Yanfei Zhang wrote:
>>>>> On 05/21/2012 01:43, Avi Kivity wrote:
>>>>>> On 05/16/2012 10:50 AM, zhangyanfei wrote:
>>>>>>> This patch set exports the offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of a guest machine image, such as registers, from the host machine's crash dump in VMCS format. The problem is that the VMCS internal layout is hidden by Intel in its specification. So we solve this problem by reverse engineering, implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
>>>>>>>
>>>>>>> Here are two use cases for the two features that we want.
>>>>>>>
>>>>>>> 1) Create the guest machine's crash dump file from the host machine's crash dump file
>>>>>>>
>>>>>>> In general, we want to use this feature for failure analysis of systems where the processing depends on communication between the host and guest machines, to look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of a heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler, so the guest machine's vcpu stopped for a long time, which then led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.

>>>>>> Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.

>>>>> The point is that before our investigation, we didn't know which side led to this buggy situation. Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure.

>>>> How can a guest bug cause a host panic?

>>>>> So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*. Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.

>>>> If the guest caused the problem, there would be no panic; therefore there was a host bug.

>>> Yes, a guest bug cannot cause a host panic. When the heartbeat stops in the guest machine, we could trigger the host dump mechanism. This is because we want to get the status of both the host and guest machines at the same time, when the heartbeat stops in the guest machine. Then we can look for the cause of the bug from both the host machine's and the guest machine's views.

>> That sounds like a bad idea. Can you explain in what situation it makes sense for a guest to stop the host (and all other guests running on it) rather than just restarting the failed services (on the host or other guests)?

> We never do this in a customer's environment, which may be a host with many guests running on it. We do this in another environment to reproduce the buggy situation; or we do this in the testing phase, in a development environment, before moving to the production one at the customer's site.

>>>>>>> Without this feature, we first create the guest machine's dump and then the host machine's, but there is a short interval between the two operations, and the buggy situation is unlikely to persist across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well due to something on the host machine's side.
>>>>>>>
>>>>>>> 2) Get offsets of VMCS information on the CPU running on the host machine
>>>>>>>
>>>>>>> If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.

>>>>>> Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?

>>>>> A firmware-based crash dump is not concerned with the OS running on the machine. So it will not do any OS handling when the machine crashes.

>>>> Seems to me the VMCS offsets are OS independent.

>>> Hmm, you mean we could get the VMCS offsets in sadump itself? But I think if we just export the VMCS offsets in the kernel, we could use the existing dump tools with no or only very small changes. I think this could be a more general mechanism than making changes in all kinds of dump tools.

>> The sadump tool generates a core file with the OS image, right? Can it not attach the offsets to a note, just like you propose for kdump?

> Both
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On Mon, May 21, 2012 at 8:53 PM, Yanfei Zhang zhangyan...@cn.fujitsu.com wrote:
> On 05/22/2012 02:58, Eric Northup wrote:
>> [...]
>> So you can have the VMCS offset dumping be a manually-loaded module. Build a database mapping from (CPUID, microcode revision) -> (VMCSINFO). There's no need for anything beyond the (CPUID, microcode revision) to be put in the kdump, since your offline processing of a kdump can then look up the rest.
>> [...]

> We have considered this approach, but there are two issues:
>
> 1) The vmx resource is unique to a single cpu, and it's risky to grab it forcibly in an environment where the kvm module is in use, in particular in a customer's environment. To do this safely, kvm support is needed.

It's not risky: you just have to make sure that no one else is going to use the VMCS on your CPU while you're running. You can disable preemption and then save the old VMCS pointer from the CPU (see the VMPTRST instruction). Load your temporary VMCS pointer, discover the fields, then restore the original VMCS pointer. Then re-enable preemption and you're done.
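The ordering Eric describes (disable preemption, save the current VMCS pointer, load a temporary one, probe, restore, re-enable preemption) can be illustrated with a small simulation. This is only a sketch of the ordering, not kernel code: `MockCpu`, its methods, and `probe_with_temporary_vmcs` are hypothetical stand-ins, and the real sequence would run in the kernel using the VMPTRST and VMPTRLD instructions.

```python
from typing import Callable, Dict, List


class MockCpu:
    """Stand-in for one logical CPU. Not a real hardware or kernel interface."""

    def __init__(self, current_vmcs: int) -> None:
        self.current_vmcs = current_vmcs
        self.preempt_disabled = False
        self.log: List[str] = []

    def preempt_disable(self) -> None:
        self.preempt_disabled = True
        self.log.append("preempt_disable")

    def preempt_enable(self) -> None:
        self.preempt_disabled = False
        self.log.append("preempt_enable")

    def vmptrst(self) -> int:
        """Models VMPTRST: report the current VMCS pointer."""
        self.log.append("vmptrst")
        return self.current_vmcs

    def vmptrld(self, vmcs_ptr: int) -> None:
        """Models VMPTRLD: make vmcs_ptr the current VMCS."""
        self.log.append(f"vmptrld {vmcs_ptr:#x}")
        self.current_vmcs = vmcs_ptr


def probe_with_temporary_vmcs(
    cpu: MockCpu,
    temp_vmcs: int,
    discover: Callable[[MockCpu], Dict[str, int]],
) -> Dict[str, int]:
    """Run discover() against a temporary VMCS, then restore the old one."""
    cpu.preempt_disable()              # nobody else may touch the VMCS now
    try:
        saved = cpu.vmptrst()          # remember whatever was loaded before
        try:
            cpu.vmptrld(temp_vmcs)     # switch to the scratch VMCS
            return discover(cpu)       # hypothetical field-discovery step
        finally:
            cpu.vmptrld(saved)         # always put the original pointer back
    finally:
        cpu.preempt_enable()
```

The nested `try`/`finally` blocks only model the requirement that the original pointer is restored and preemption re-enabled even if the discovery step fails.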
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/21/2012 05:32 AM, Yanfei Zhang wrote:
> On 05/21/2012 01:43, Avi Kivity wrote:
>> On 05/16/2012 10:50 AM, zhangyanfei wrote:
>>> This patch set exports the offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of a guest machine image, such as registers, from the host machine's crash dump in VMCS format. The problem is that the VMCS internal layout is hidden by Intel in its specification. So we solve this problem by reverse engineering, implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
>>>
>>> Here are two use cases for the two features that we want.
>>>
>>> 1) Create the guest machine's crash dump file from the host machine's crash dump file
>>>
>>> In general, we want to use this feature for failure analysis of systems where the processing depends on communication between the host and guest machines, to look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of a heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler, so the guest machine's vcpu stopped for a long time, which then led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.

>> Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.

> The point is that before our investigation, we didn't know which side led to this buggy situation. Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure.

How can a guest bug cause a host panic?

> So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*. Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.

If the guest caused the problem, there would be no panic; therefore there was a host bug.

>>> Without this feature, we first create the guest machine's dump and then the host machine's, but there is a short interval between the two operations, and the buggy situation is unlikely to persist across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well due to something on the host machine's side.
>>>
>>> 2) Get offsets of VMCS information on the CPU running on the host machine
>>>
>>> If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.

>> Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?

> A firmware-based crash dump is not concerned with the OS running on the machine. So it will not do any OS handling when the machine crashes.

Seems to me the VMCS offsets are OS independent.
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/21/2012 16:34, Avi Kivity wrote:
> On 05/21/2012 05:32 AM, Yanfei Zhang wrote:
>> On 05/21/2012 01:43, Avi Kivity wrote:
>>> On 05/16/2012 10:50 AM, zhangyanfei wrote:
>>>> This patch set exports the offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of a guest machine image, such as registers, from the host machine's crash dump in VMCS format. The problem is that the VMCS internal layout is hidden by Intel in its specification. So we solve this problem by reverse engineering, implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
>>>>
>>>> Here are two use cases for the two features that we want.
>>>>
>>>> 1) Create the guest machine's crash dump file from the host machine's crash dump file
>>>>
>>>> In general, we want to use this feature for failure analysis of systems where the processing depends on communication between the host and guest machines, to look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of a heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler, so the guest machine's vcpu stopped for a long time, which then led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.

>>> Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.

>> The point is that before our investigation, we didn't know which side led to this buggy situation. Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure.

> How can a guest bug cause a host panic?

>> So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*. Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.

> If the guest caused the problem, there would be no panic; therefore there was a host bug.

Yes, a guest bug cannot cause a host panic. When the heartbeat stops in the guest machine, we could trigger the host dump mechanism. This is because we want to get the status of both the host and guest machines at the same time, when the heartbeat stops in the guest machine. Then we can look for the cause of the bug from both the host machine's and the guest machine's views.

>>>> Without this feature, we first create the guest machine's dump and then the host machine's, but there is a short interval between the two operations, and the buggy situation is unlikely to persist across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well due to something on the host machine's side.
>>>>
>>>> 2) Get offsets of VMCS information on the CPU running on the host machine
>>>>
>>>> If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.

>>> Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?

>> A firmware-based crash dump is not concerned with the OS running on the machine. So it will not do any OS handling when the machine crashes.

> Seems to me the VMCS offsets are OS independent.

Hmm, you mean we could get the VMCS offsets in sadump itself? But I think if we just export the VMCS offsets in the kernel, we could use the existing dump tools with no or only very small changes. I think this could be a more general mechanism than making changes in all kinds of dump tools.

Thanks
Zhang Yanfei
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/21/2012 12:08 PM, Yanfei Zhang wrote:
> On 05/21/2012 16:34, Avi Kivity wrote:
>> On 05/21/2012 05:32 AM, Yanfei Zhang wrote:
>>> On 05/21/2012 01:43, Avi Kivity wrote:
>>>> On 05/16/2012 10:50 AM, zhangyanfei wrote:
>>>>> This patch set exports the offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of a guest machine image, such as registers, from the host machine's crash dump in VMCS format. The problem is that the VMCS internal layout is hidden by Intel in its specification. So we solve this problem by reverse engineering, implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
>>>>>
>>>>> Here are two use cases for the two features that we want.
>>>>>
>>>>> 1) Create the guest machine's crash dump file from the host machine's crash dump file
>>>>>
>>>>> In general, we want to use this feature for failure analysis of systems where the processing depends on communication between the host and guest machines, to look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of a heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler, so the guest machine's vcpu stopped for a long time, which then led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.

>>>> Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.

>>> The point is that before our investigation, we didn't know which side led to this buggy situation. Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure.

>> How can a guest bug cause a host panic?

>>> So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*. Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.

>> If the guest caused the problem, there would be no panic; therefore there was a host bug.

> Yes, a guest bug cannot cause a host panic. When the heartbeat stops in the guest machine, we could trigger the host dump mechanism. This is because we want to get the status of both the host and guest machines at the same time, when the heartbeat stops in the guest machine. Then we can look for the cause of the bug from both the host machine's and the guest machine's views.

That sounds like a bad idea. Can you explain in what situation it makes sense for a guest to stop the host (and all other guests running on it) rather than just restarting the failed services (on the host or other guests)?

>>>>> Without this feature, we first create the guest machine's dump and then the host machine's, but there is a short interval between the two operations, and the buggy situation is unlikely to persist across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well due to something on the host machine's side.
>>>>>
>>>>> 2) Get offsets of VMCS information on the CPU running on the host machine
>>>>>
>>>>> If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.

>>>> Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?

>>> A firmware-based crash dump is not concerned with the OS running on the machine. So it will not do any OS handling when the machine crashes.

>> Seems to me the VMCS offsets are OS independent.

> Hmm, you mean we could get the VMCS offsets in sadump itself? But I think if we just export the VMCS offsets in the kernel, we could use the existing dump tools with no or only very small changes. I think this could be a more general mechanism than making changes in all kinds of dump tools.

The sadump tool generates a core file with the OS image, right? Can it not attach the offsets to a note, just like you propose for kdump?
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On Wed, May 16, 2012 at 12:50 AM, zhangyanfei zhangyan...@cn.fujitsu.com wrote:
> This patch set exports the offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of a guest machine image, such as registers, from the host machine's crash dump in VMCS format. The problem is that the VMCS internal layout is hidden by Intel in its specification. So we solve this problem by reverse engineering, implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.

Perhaps I'm wrong, but this solution seems much, much more dynamic than it needs to be.

The VMCS offsets aren't going to change between different boots on the same CPU, unless perhaps the microcode has been updated.

So you can have the VMCS offset dumping be a manually-loaded module. Build a database mapping from (CPUID, microcode revision) -> (VMCSINFO). There's no need for anything beyond the (CPUID, microcode revision) to be put in the kdump, since your offline processing of a kdump can then look up the rest.

It means you don't have to interact with the vmx module at all, and no extra modules or code have to be loaded on the millions of Linux machines that won't need the functionality.
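The offline lookup Eric proposes could be sketched roughly as follows in a dump-analysis tool. This is an illustration only: the `CpuKey` type, the `lookup_vmcsinfo` helper, the field names, and every numeric value in the table are hypothetical placeholders, not real VMCS offsets and not anything defined by the patch set.

```python
from typing import Dict, NamedTuple, Optional


class CpuKey(NamedTuple):
    """Identifies a CPU for lookup purposes (hypothetical type)."""
    cpuid_signature: int  # e.g. the family/model/stepping value from CPUID
    microcode_rev: int    # microcode revision recorded in the dump


# Maps a VMCS field name to its byte offset in the VMCS region.
VmcsInfo = Dict[str, int]

# Placeholder database, assumed to be built ahead of time by running the
# probing module once per CPU type. All keys and offsets are made up.
VMCSINFO_DB: Dict[CpuKey, VmcsInfo] = {
    CpuKey(0x000206A7, 0x28): {"GUEST_RIP": 0x1A0, "GUEST_RSP": 0x198},
    CpuKey(0x000306A9, 0x12): {"GUEST_RIP": 0x1B0, "GUEST_RSP": 0x1A8},
}


def lookup_vmcsinfo(
    key: CpuKey, db: Dict[CpuKey, VmcsInfo] = VMCSINFO_DB
) -> Optional[VmcsInfo]:
    """Return the offsets for this CPU, or None if it was never probed."""
    return db.get(key)


if __name__ == "__main__":
    # Only the (CPUID, microcode revision) pair needs to come from the kdump.
    print(lookup_vmcsinfo(CpuKey(0x000206A7, 0x28)))
```

A `None` result means that CPU and microcode pair was never probed, so the database would have to be extended before such a dump could be decoded.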
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
于 2012年05月22日 02:58, Eric Northup 写道: On Wed, May 16, 2012 at 12:50 AM, zhangyanfei zhangyan...@cn.fujitsu.com wrote: This patch set exports offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve runtime state of guest machine image, such as registers, in host machine's crash dump as VMCS format. The problem is that VMCS internal is hidden by Intel in its specification. So, we slove this problem by reverse engineering implemented in this patch set. The VMCSINFO is exported via sysfs to kexec-tools just like VMCOREINFO. Perhaps I'm wrong, but this solution seems much, much more dynamic than it needs to be. The VMCS offsets aren't going to change between different boots on the same CPU, unless perhaps the microcode has been updated. So you can have the VMCS offset dumping be a manually-loaded module. Build a database mapping from (CPUID, microcode revision) - (VMCSINFO). There's no need for anything beyond the (CPUID, microcode revision) to be put in the kdump, since your offline processing of a kdump can then look up the rest. It means you don't have to interact with the vmx module at all, and no extra modules or code have to be loaded on the millions of Linux machines that won't need the functionality. We have considered this way, but there are two issues: 1) vmx resource is unique for a single cpu, and it's risky to grab it forcibly on the environment where kvm module is used, in particular on customer's environment. To do this safely, kvm support is needed. 2) It highly costs to prepare each cpu to each customer environment to collect vmcsinfo. After all, there are various environments on our customer's. Our patch provides a module, so those who doesn't want this feature can just stop it being auto-loaded when system starts up. 
Thanks
Zhang Yanfei
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/21/2012 17:36, Avi Kivity wrote:
On 05/21/2012 12:08 PM, Yanfei Zhang wrote:
On 05/21/2012 16:34, Avi Kivity wrote:
On 05/21/2012 05:32 AM, Yanfei Zhang wrote:
On 05/21/2012 01:43, Avi Kivity wrote:
On 05/16/2012 10:50 AM, zhangyanfei wrote:
This patch set exports offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of the guest machine image, such as registers, from the host machine's crash dump, where it is stored in VMCS format. The problem is that the internal layout of the VMCS is hidden by Intel in its specification. So we solve this problem by reverse engineering, which is implemented in this patch set. VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
Here are two use cases for the two features that we want.
1) Create the guest machine's crash dumpfile from the host machine's crash dumpfile
In general, we want to use this feature for failure analysis on systems where processing depends on communication between the host and guest machines, so that we can look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of the heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler: the guest machine's vcpu stopped for a long time, which led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.
Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.
The point is that before our investigation, we didn't know which side led to this buggy situation.
Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure.
How can a guest bug cause a host panic?
So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*. Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.
If the guest caused the problem, there would be no panic; therefore there was a host bug.
Yes, a guest bug cannot cause a host panic. When the heartbeat stops in the guest machine, we could trigger the host dump mechanism. This is because we want to get the status of both the host and guest machines at the same time when the heartbeat stops in the guest machine. Then we can look for the cause of the bug from both the host machine's and the guest machine's views.
That sounds like a bad idea. Can you explain in what situation it makes sense for a guest to stop the host (and all other guests running on it) rather than just restarting the failed services (on the host or other guests)?
We never do this in a customer's environment, which may be a host with many guests running on it. We do this in another environment to reproduce the buggy situation; or we do this during the testing phase, in a development environment that precedes the production one at the customer's site.
Without this feature, we first create the guest machine's dump and then create the host machine's, but there is a short interval between the two operations, and it's unlikely that the buggy situation remains across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis more efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well because of something on the host machine.
2) Get offsets of VMCS information on the CPU running on the host machine
If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.
Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?
A firmware-based crash dump doesn't concern itself with the OS running on the machine, so it will not do any OS handling when the machine crashes.
Seems to me the VMCS offsets are OS independent.
Hmm, you mean we could get the VMCS offsets in sadump itself? But I think if we just export the VMCS offsets in the kernel, we could use the existing dump tools with no or only very tiny changes. I think this could be a more general mechanism than making changes in all kinds of dump tools.
The sadump tool generates a core file with the OS image, right? Can it not attach the offsets to a note, just like you propose for kdump?
Both are right.
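One way to picture the "attach the offsets to a note" idea: VMCOREINFO is carried as plain KEY=VALUE text lines, so a note in a similar style could be produced by either the kernel or a dump tool and read back offline. A minimal sketch, assuming a hypothetical KEY=VALUE layout for VMCSINFO (the thread does not spell out the actual format) and made-up field names and offsets:

```python
from typing import Dict


def format_note(offsets: Dict[str, int]) -> str:
    """Serialize field offsets as KEY=VALUE lines (hypothetical note layout)."""
    return "".join(
        f"{name}={offset:#x}\n" for name, offset in sorted(offsets.items())
    )


def parse_note(text: str) -> Dict[str, int]:
    """Parse KEY=VALUE lines into a field-offset map, skipping malformed lines."""
    offsets: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            continue  # not a KEY=VALUE line
        # Base 0 accepts both 0x-prefixed hex and plain decimal values.
        offsets[key.strip()] = int(value.strip(), 0)
    return offsets
```

Because the note is plain text, the same parser would work whether the note came from kdump or from a firmware-based tool, which is the point both sides agree on above.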
Re: [PATCH v2 0/5] Export offsets of VMCS fields as note information for kdump
On 05/21/2012 01:43, Avi Kivity wrote:
On 05/16/2012 10:50 AM, zhangyanfei wrote:
This patch set exports offsets of VMCS fields as note information for kdump. We call it VMCSINFO. The purpose of VMCSINFO is to retrieve the runtime state of the guest machine image, such as registers, from the host machine's crash dump, where it is stored in VMCS format. The problem is that the internal layout of the VMCS is hidden by Intel in its specification. So we solve this problem by reverse engineering, which is implemented in this patch set. VMCSINFO is exported via sysfs to kexec-tools, just like VMCOREINFO.
Here are two use cases for the two features that we want.
1) Create the guest machine's crash dumpfile from the host machine's crash dumpfile
In general, we want to use this feature for failure analysis on systems where processing depends on communication between the host and guest machines, so that we can look into the system from both machines' viewpoints. As a concrete situation, consider a heartbeat monitoring feature on the guest machine's side, where we need to determine on which machine's side the cause of the heartbeat stop lies. In our actual experiments, we encountered such a situation and found that the cause of the bug was in the host's process scheduler: the guest machine's vcpu stopped for a long time, which led to the heartbeat stop. The module that judges the heartbeat stop is on the guest machine, so we need to debug the guest machine's data. But if the cause lies on the host machine's side, we need to look into the host machine's crash dump.
Do you mean that a heartbeat failure in the guest led to a host panic? My expectation is that a problem in the guest will cause the guest to panic and perhaps produce a dump; the host will remain up.
The point is that before our investigation, we didn't know which side led to this buggy situation. Maybe a bug in the host machine or in the guest machine itself causes a heartbeat failure. So we want to get both the host machine's crash dump and the guest machine's crash dump *at the same time*.
Then we could use userspace tools to get the guest machine's crash dump from the host machine's and analyse them separately to find which side causes the problem.
Without this feature, we first create the guest machine's dump and then create the host machine's, but there is a short interval between the two operations, and it's unlikely that the buggy situation remains across it. So we think the feature is useful for debugging both the guest machine's and the host machine's sides at the same time, and we expect it to make failure analysis more efficient. Of course, we believe this feature is generally useful in situations where the guest machine doesn't work well because of something on the host machine.
2) Get offsets of VMCS information on the CPU running on the host machine
If kdump doesn't work well, we cannot use the kvm API to get the guest machine's register values, and they are still left in its VMCS region. In that case, we use a crash dump mechanism running outside of the Linux kernel, such as sadump, a firmware-based crash dump. VMCS information is then necessary.
Shouldn't sadump then expose the VMCS offsets? Perhaps bundling them into its dump file?
A firmware-based crash dump doesn't concern itself with the OS running on the machine, so it will not do any OS handling when the machine crashes.
Thanks
Zhang Yanfei
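The thread does not describe how the patch set reverse engineers the VMCS layout. As a rough illustration of what discovering hidden offsets can look like, here is a pure simulation of one generic approach: fill the region with markers that encode their own position, then read each field back through the only interface available. `FakeCpu`, its `vmread` method, the 4096-byte region size and the field names and offsets are all assumptions for the sketch; real VMREAD is a privileged instruction and is not modelled here.

```python
import struct
from typing import Dict, Iterable

REGION_SIZE = 4096  # assumed size of the simulated region


class FakeCpu:
    """Stand-in for hardware: it knows the layout, callers do not."""

    def __init__(self, hidden_layout: Dict[str, int]) -> None:
        self._layout = hidden_layout
        self._region = bytearray(REGION_SIZE)

    def load_region(self, data: bytes) -> None:
        """Replace the simulated region contents."""
        self._region[:] = data

    def vmread(self, field: str) -> int:
        """Return the 16-bit little-endian word stored at the field's hidden offset."""
        return struct.unpack_from("<H", self._region, self._layout[field])[0]


def make_marker_region() -> bytes:
    """Fill the region so every aligned 16-bit word holds its own byte offset."""
    region = bytearray(REGION_SIZE)
    for offset in range(0, REGION_SIZE, 2):
        struct.pack_into("<H", region, offset, offset)
    return bytes(region)


def discover_offsets(cpu: FakeCpu, fields: Iterable[str]) -> Dict[str, int]:
    """Recover each field's offset from the marker value read back for it."""
    cpu.load_region(make_marker_region())
    return {name: cpu.vmread(name) for name in fields}
```

The simulation only works for 2-byte-aligned offsets that fit in 16 bits. It is meant to show why such a probe has to run on the target CPU itself, which is the constraint behind use case 2.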