On 12/23/2010 04:28 PM, Marcelo Tosatti wrote:
On Wed, Dec 22, 2010 at 10:52:51AM +0800, Huang Ying wrote:
> In the Linux kernel's HWPoison implementation, the virtual address
> that maps the faulty physical memory page in each process is marked
> as HWPoison, so that any further access to that virtual address
> kills the corresponding process with SIGBUS.
>
> If the faulty physical memory page is used by a KVM guest, the SIGBUS
> is sent to QEMU, and QEMU simulates an MCE to report the memory error
> to the guest OS. If the guest OS cannot recover from the error (for
> example, because the page is accessed by kernel code), the guest OS
> reboots the system. But because the underlying host virtual address
> backing the guest physical memory is still poisoned, if the guest
> accesses the corresponding guest physical memory after rebooting,
> the SIGBUS is sent to QEMU again and another MCE is simulated. That
> is, the guest cannot recover by rebooting.
>
> In fact, the contents of the guest physical memory page need not be
> preserved across a reboot. We can allocate a new host physical page
> to back the corresponding guest physical address.
>
> This patch fixes the issue in QEMU-KVM by invoking the unpoison
> mechanism implemented in the Linux kernel to clear the corresponding
> page table entry, which makes it possible to allocate a new page and
> recover from the error.
>
> Signed-off-by: Huang Ying <ying.hu...@intel.com>
> +struct HWPoisonPage;
> +typedef struct HWPoisonPage HWPoisonPage;
> +struct HWPoisonPage
> +{
> +    void *vaddr;
> +    QLIST_ENTRY(HWPoisonPage) list;
> +};
> +
> +static QLIST_HEAD(hwpoison_page_list, HWPoisonPage) hwpoison_page_list =
> +    QLIST_HEAD_INITIALIZER(hwpoison_page_list);
> +
> +static void kvm_unpoison_all(void *param)
> +{
> +    HWPoisonPage *page, *next_page;
> +    unsigned long address;
> +    KVMState *s = param;
> +
> +    QLIST_FOREACH_SAFE(page, &hwpoison_page_list, list, next_page) {
> +        address = (unsigned long)page->vaddr;
> +        QLIST_REMOVE(page, list);
> +        kvm_vm_ioctl(s, KVM_UNPOISON_ADDRESS, address);
> +        qemu_free(page);
> +    }
> +}
Can't you free and reallocate all guest memory instead, on reboot, if
there's a hwpoisoned page? Then you don't need this interface.
Alternatively, MADV_DONTNEED? We already use it for ballooning.
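
For reference, a minimal sketch of what MADV_DONTNEED does to an ordinary
private anonymous mapping on Linux: the backing page is dropped and the
next access gets a fresh zero-filled page. The function name
dontneed_gives_fresh_page() is made up for illustration, and the sketch
uses a healthy page only. Whether the same call also clears a
hwpoisoned page table entry is exactly the open question here and is
not demonstrated by it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of QEMU or the patch above).
 * Returns 1 if a private anonymous page reads back as all zeroes after
 * madvise(MADV_DONTNEED), 0 if old contents survive, -1 on error.
 */
int dontneed_gives_fresh_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t len = (size_t)page_size;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    /* Touch the page so a physical page is actually allocated. */
    memset(p, 0xAB, len);

    /* Drop the backing page; the virtual mapping itself stays valid. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The next access faults in a new zero-filled page. */
    int fresh = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            fresh = 0;
            break;
        }
    }

    munmap(p, len);
    return fresh;
}
```

Calling dontneed_gives_fresh_page() from a small main() is enough to try
it. The virtual address does not change across the madvise() call, which
is why the same approach works for ballooning without remapping guest
memory.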
--
error compiling committee.c: too many arguments to function