Re: [PATCH] kvm: sync cpu state on internal error before dump
On Fri, Aug 23, 2013 at 02:41:13PM +0100, James Hogan wrote: On 23/08/13 13:58, Gleb Natapov wrote: On Fri, Aug 23, 2013 at 01:26:00PM +0100, James Hogan wrote: When a KVM internal error occurs QEMU dumps the CPU state, however it doesn't synchronise the state from KVM first so the dumped state is out of date. Add the synchronisation calls before the dump in both locations (which is used depends on whether the arch says to stop or not). x86_cpu_dump_state() calls cpu_synchronize_state() already. Ah yes, thanks. I hadn't noticed that. Out of the arches that support KVM only x86 and ppc call it. arm, mips (qemu support not upstream yet), and s390 don't. s390 never seems to emit that exit code, and arm only does so for unsupported exceptions (which should never happen). I'll fix in mips_cpu_dump_state() instead. Moving cpu_synchronize_state() up to cpu_dump_state() would be better. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH] kvm: sync cpu state on internal error before dump
Am 24.08.2013 12:37, schrieb Gleb Natapov: On Fri, Aug 23, 2013 at 02:41:13PM +0100, James Hogan wrote: On 23/08/13 13:58, Gleb Natapov wrote: On Fri, Aug 23, 2013 at 01:26:00PM +0100, James Hogan wrote: When a KVM internal error occurs QEMU dumps the CPU state, however it doesn't synchronise the state from KVM first so the dumped state is out of date. Add the synchronisation calls before the dump in both locations (which is used depends on whether the arch says to stop or not). x86_cpu_dump_state() calls cpu_synchronize_state() already. Ah yes, thanks. I hadn't noticed that. Out of the arches that support KVM only x86 and ppc call it. arm, mips (qemu support not upstream yet), and s390 don't. s390 never seems to emit that exit code, and arm only does so for unsupported exceptions (which should never happen). I'll fix in mips_cpu_dump_state() instead. Moving cpu_synchronize_state() up to cpu_dump_state() would be better. Yes, please. I did not review the hooks themselves much, just avoided global functions. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: nVMX: Fully support of nested VMX preemption timer
This patch contains the following two changes: 1. Fix the bug in nested preemption timer support. If vmexit L2-L0 with some reasons not emulated by L1, preemption timer value should be save in such exits. 2. Add support of Save VMX-preemption timer value VM-Exit controls to nVMX. With this patch, nested VMX preemption timer features are fully supported. Signed-off-by: Arthur Chunqi Li yzt...@gmail.com --- arch/x86/kvm/vmx.c | 30 +- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 57b4e12..9579409 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2204,7 +2204,8 @@ static __init void nested_vmx_setup_ctls_msrs(void) #ifdef CONFIG_X86_64 VM_EXIT_HOST_ADDR_SPACE_SIZE | #endif - VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT; + VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT | + VM_EXIT_SAVE_VMX_PREEMPTION_TIMER; nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR | VM_EXIT_LOAD_IA32_EFER); @@ -7578,9 +7579,14 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) (vmcs_config.pin_based_exec_ctrl | vmcs12-pin_based_vm_exec_control)); - if (vmcs12-pin_based_vm_exec_control PIN_BASED_VMX_PREEMPTION_TIMER) - vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, -vmcs12-vmx_preemption_timer_value); + if (vmcs12-pin_based_vm_exec_control PIN_BASED_VMX_PREEMPTION_TIMER) { + if (vmcs12-vm_exit_controls VM_EXIT_SAVE_VMX_PREEMPTION_TIMER) + vmcs12-vmx_preemption_timer_value = + vmcs_read32(VMX_PREEMPTION_TIMER_VALUE); + else + vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, + vmcs12-vmx_preemption_timer_value); + } /* * Whether page-faults are trapped is determined by a combination of @@ -7690,7 +7696,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER * bits are further modified by vmx_set_efer() below. */ - vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl); + if (vmcs12-pin_based_vm_exec_control PIN_BASED_VMX_PREEMPTION_TIMER) + vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl | + VM_EXIT_SAVE_VMX_PREEMPTION_TIMER); + else + vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl); /* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are * emulated by vmx_set_efer(), below. @@ -7912,6 +7922,16 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) } /* +* If L2 support PIN_BASED_VMX_PREEMPTION_TIMER, L0 must support +* VM_EXIT_SAVE_VMX_PREEMPTION_TIMER. +*/ + if ((vmcs12-pin_based_vm_exec_control PIN_BASED_VMX_PREEMPTION_TIMER) + !(nested_vmx_exit_ctls_high VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)) { + nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD); + return 1; + } + + /* * We're finally done with prerequisite checking, and can start with * the nested entry. */ -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- virt/kvm/kvm_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 89f74d1..d65cc0c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = { */ static int create_vcpu_fd(struct kvm_vcpu *vcpu) { - return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR); + return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC); } /* @@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm, return ret; } - ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR); + ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR | O_CLOEXEC); if (ret 0) { ops-destroy(dev); return ret; @@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type) return r; } #endif - r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR); + r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC); if (r 0) kvm_put_kvm(kvm); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 710d313..f7c9e8a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf) ctx-first_pass = 1; rwflag = (ghf-flags KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY; - ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag); + ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | O_CLOEXEC); if (ret 0) { kvm_put_kvm(kvm); return ret; diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index b2d3f3b..54cf9bc 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, mutex_unlock(kvm-lock); return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops, - stt, O_RDWR); + stt, O_RDWR | O_CLOEXEC); fail: if (stt) { diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index e8d51cb..3503829 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret) if (!ri) return -ENOMEM; - fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR); + fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC); if (fd 0) kvm_release_rma(ri); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
Hi, Following a patchset asking to change calls to get_unused_flag() [1] to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO to use the flag. Since it's a related subsystem to KVM, using O_CLOEXEC for file descriptors created by KVM might be applicable too. I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC as default flag. This patchset should be reviewed to not break existing userspace program. BTW, if it's not applicable, I would suggest that new ioctls be added to KVM subsystem, those ioctls would have a flag field added to their arguments. Such flag would let userspace choose the open flag to use. See for example other APIs using anon_inode_getfd() such as fanotify, inotify, signalfd and timerfd. You might be interested to read: - Secure File Descriptor Handling (Ulrich Drepper, 2008) http://udrepper.livejournal.com/20407.html - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) http://danwalsh.livejournal.com/53603.html Regards. [1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com [2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home [3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home Yann Droneaud (2): kvm: use anon_inode_getfd() with O_CLOEXEC flag ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- virt/kvm/kvm_main.c | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag
Hi, Following a patchset asking to change calls to get_unused_flag() [1] to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO to use the flag. Since it's a related subsystem to KVM, using O_CLOEXEC for file descriptors created by KVM might be applicable too. I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC as default flag. This patchset should be reviewed to not break existing userspace program. BTW, if it's not applicable, I would suggest that new ioctls be added to KVM subsystem, those ioctls would have a flag field added to their arguments. Such flag would let userspace choose the open flag to use. See for example other APIs using anon_inode_getfd() such as fanotify, inotify, signalfd and timerfd. You might be interested to read: - Secure File Descriptor Handling (Ulrich Drepper, 2008) http://udrepper.livejournal.com/20407.html - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) http://danwalsh.livejournal.com/53603.html Regards. [1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com [2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home [3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home Yann Droneaud (2): kvm: use anon_inode_getfd() with O_CLOEXEC flag ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- virt/kvm/kvm_main.c | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part of some of its ioctls. But those ioctls are lacking a flag argument allowing userspace to choose options for the newly opened file descriptor. In such case it's advised to use O_CLOEXEC by default so that userspace is allowed to choose, without race, if the file descriptor is going to be inherited across exec(). This patch set O_CLOEXEC flag on all file descriptors created with anon_inode_getfd() to not leak file descriptors across exec(). Signed-off-by: Yann Droneaud ydrone...@opteya.com Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_vio.c| 2 +- arch/powerpc/kvm/book3s_hv.c| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 710d313..f7c9e8a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf) ctx-first_pass = 1; rwflag = (ghf-flags KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY; - ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag); + ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | O_CLOEXEC); if (ret 0) { kvm_put_kvm(kvm); return ret; diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index b2d3f3b..54cf9bc 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, mutex_unlock(kvm-lock); return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops, - stt, O_RDWR); + stt, O_RDWR | O_CLOEXEC); fail: if (stt) { diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index e8d51cb..3503829 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret) if (!ri) return -ENOMEM; - fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR); + fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC); if (fd 0) kvm_release_rma(ri); -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html