Re: [PATCH] kvm: sync cpu state on internal error before dump

2013-08-24 Thread Gleb Natapov
On Fri, Aug 23, 2013 at 02:41:13PM +0100, James Hogan wrote:
 On 23/08/13 13:58, Gleb Natapov wrote:
  On Fri, Aug 23, 2013 at 01:26:00PM +0100, James Hogan wrote:
  When a KVM internal error occurs QEMU dumps the CPU state, however it
  doesn't synchronise the state from KVM first so the dumped state is out
  of date. Add the synchronisation calls before the dump in both locations
  (which is used depends on whether the arch says to stop or not).
 
  x86_cpu_dump_state() calls cpu_synchronize_state() already.
 
 Ah yes, thanks. I hadn't noticed that.
 
 Out of the arches that support KVM only x86 and ppc call it. arm, mips
 (qemu support not upstream yet), and s390 don't. s390 never seems to
 emit that exit code, and arm only does so for unsupported exceptions
 (which should never happen).
 
 I'll fix in mips_cpu_dump_state() instead.
 
Moving cpu_synchronize_state() up to cpu_dump_state() would be better.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] kvm: sync cpu state on internal error before dump

2013-08-24 Thread Andreas Färber
Am 24.08.2013 12:37, schrieb Gleb Natapov:
 On Fri, Aug 23, 2013 at 02:41:13PM +0100, James Hogan wrote:
 On 23/08/13 13:58, Gleb Natapov wrote:
 On Fri, Aug 23, 2013 at 01:26:00PM +0100, James Hogan wrote:
 When a KVM internal error occurs QEMU dumps the CPU state, however it
 doesn't synchronise the state from KVM first so the dumped state is out
 of date. Add the synchronisation calls before the dump in both locations
 (which is used depends on whether the arch says to stop or not).

 x86_cpu_dump_state() calls cpu_synchronize_state() already.

 Ah yes, thanks. I hadn't noticed that.

 Out of the arches that support KVM only x86 and ppc call it. arm, mips
 (qemu support not upstream yet), and s390 don't. s390 never seems to
 emit that exit code, and arm only does so for unsupported exceptions
 (which should never happen).

 I'll fix in mips_cpu_dump_state() instead.

 Moving cpu_synchronize_state() up to cpu_dump_state() would be better.

Yes, please. I did not review the hooks themselves much, just avoided
global functions.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KVM: nVMX: Fully support of nested VMX preemption timer

2013-08-24 Thread root
This patch contains the following two changes:
1. Fix the bug in nested preemption timer support. If vmexit L2-L0
with some reasons not emulated by L1, preemption timer value should
be save in such exits.
2. Add support of Save VMX-preemption timer value VM-Exit controls
to nVMX.

With this patch, nested VMX preemption timer features are fully
supported.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 arch/x86/kvm/vmx.c |   30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 57b4e12..9579409 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2204,7 +2204,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 #ifdef CONFIG_X86_64
VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
-   VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT;
+   VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
+   VM_EXIT_SAVE_VMX_PREEMPTION_TIMER;
nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
  VM_EXIT_LOAD_IA32_EFER);
 
@@ -7578,9 +7579,14 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12)
(vmcs_config.pin_based_exec_ctrl |
 vmcs12-pin_based_vm_exec_control));
 
-   if (vmcs12-pin_based_vm_exec_control  PIN_BASED_VMX_PREEMPTION_TIMER)
-   vmcs_write32(VMX_PREEMPTION_TIMER_VALUE,
-vmcs12-vmx_preemption_timer_value);
+   if (vmcs12-pin_based_vm_exec_control  PIN_BASED_VMX_PREEMPTION_TIMER) 
{
+   if (vmcs12-vm_exit_controls  
VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
+   vmcs12-vmx_preemption_timer_value =
+   vmcs_read32(VMX_PREEMPTION_TIMER_VALUE);
+   else
+   vmcs_write32(VMX_PREEMPTION_TIMER_VALUE,
+   vmcs12-vmx_preemption_timer_value);
+   }
 
/*
 * Whether page-faults are trapped is determined by a combination of
@@ -7690,7 +7696,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12)
 * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
 * bits are further modified by vmx_set_efer() below.
 */
-   vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
+   if (vmcs12-pin_based_vm_exec_control  PIN_BASED_VMX_PREEMPTION_TIMER)
+   vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl |
+   VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+   else
+   vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
 
/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
 * emulated by vmx_set_efer(), below.
@@ -7912,6 +7922,16 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool 
launch)
}
 
/*
+* If L2 support PIN_BASED_VMX_PREEMPTION_TIMER, L0 must support
+* VM_EXIT_SAVE_VMX_PREEMPTION_TIMER.
+*/
+   if ((vmcs12-pin_based_vm_exec_control  
PIN_BASED_VMX_PREEMPTION_TIMER) 
+   !(nested_vmx_exit_ctls_high  
VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)) {
+   nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+   return 1;
+   }
+
+   /*
 * We're finally done with prerequisite checking, and can start with
 * the nested entry.
 */
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com
---
 virt/kvm/kvm_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 89f74d1..d65cc0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = {
  */
 static int create_vcpu_fd(struct kvm_vcpu *vcpu)
 {
-   return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR);
+   return anon_inode_getfd(kvm-vcpu, kvm_vcpu_fops, vcpu, O_RDWR | 
O_CLOEXEC);
 }
 
 /*
@@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
return ret;
}
 
-   ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR);
+   ret = anon_inode_getfd(ops-name, kvm_device_fops, dev, O_RDWR | 
O_CLOEXEC);
if (ret  0) {
ops-destroy(dev);
return ret;
@@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
return r;
}
 #endif
-   r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR);
+   r = anon_inode_getfd(kvm-vm, kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC);
if (r  0)
kvm_put_kvm(kvm);
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com

---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 710d313..f7c9e8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct 
kvm_get_htab_fd *ghf)
ctx-first_pass = 1;
 
rwflag = (ghf-flags  KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
-   ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag);
+   ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | 
O_CLOEXEC);
if (ret  0) {
kvm_put_kvm(kvm);
return ret;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b2d3f3b..54cf9bc 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
mutex_unlock(kvm-lock);
 
return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops,
-   stt, O_RDWR);
+   stt, O_RDWR | O_CLOEXEC);
 
 fail:
if (stt) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e8d51cb..3503829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct 
kvm_allocate_rma *ret)
if (!ri)
return -ENOMEM;
 
-   fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR);
+   fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC);
if (fd  0)
kvm_release_rma(ri);
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
Hi,

Following a patchset asking to change calls to get_unused_flag() [1]
to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO
to use the flag.

Since it's a related subsystem to KVM, using O_CLOEXEC for
file descriptors created by KVM might be applicable too.

I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC
as default flag.

This patchset should be reviewed to not break existing userspace program.

BTW, if it's not applicable, I would suggest that new ioctls be added to
KVM subsystem, those ioctls would have a flag field added to their arguments.
Such flag would let userspace choose the open flag to use.
See for example other APIs using anon_inode_getfd() such as fanotify,
inotify, signalfd and timerfd.

You might be interested to read:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
  http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) 
  http://danwalsh.livejournal.com/53603.html

Regards.

[1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com
[2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home
[3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home

Yann Droneaud (2):
  kvm: use anon_inode_getfd() with O_CLOEXEC flag
  ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 virt/kvm/kvm_main.c | 6 +++---
 4 files changed, 6 insertions(+), 6 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
Hi,

Following a patchset asking to change calls to get_unused_flag() [1]
to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO
to use the flag.

Since it's a related subsystem to KVM, using O_CLOEXEC for
file descriptors created by KVM might be applicable too.

I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC
as default flag.

This patchset should be reviewed to not break existing userspace program.

BTW, if it's not applicable, I would suggest that new ioctls be added to
KVM subsystem, those ioctls would have a flag field added to their arguments.
Such flag would let userspace choose the open flag to use.
See for example other APIs using anon_inode_getfd() such as fanotify,
inotify, signalfd and timerfd.

You might be interested to read:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
  http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) 
  http://danwalsh.livejournal.com/53603.html

Regards.

[1] http://lkml.kernel.org/r/cover.1376327678.git.ydrone...@opteya.com
[2] http://lkml.kernel.org/r/1377186804.25163.17.ca...@ul30vt.home
[3] http://lkml.kernel.org/r/20130822171744.1297.13711.st...@bling.home

Yann Droneaud (2):
  kvm: use anon_inode_getfd() with O_CLOEXEC flag
  ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 virt/kvm/kvm_main.c | 6 +++---
 4 files changed, 6 insertions(+), 6 deletions(-)

-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

2013-08-24 Thread Yann Droneaud
KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud ydrone...@opteya.com
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydrone...@opteya.com

---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 2 +-
 arch/powerpc/kvm/book3s_hv.c| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 710d313..f7c9e8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct 
kvm_get_htab_fd *ghf)
ctx-first_pass = 1;
 
rwflag = (ghf-flags  KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
-   ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag);
+   ret = anon_inode_getfd(kvm-htab, kvm_htab_fops, ctx, rwflag | 
O_CLOEXEC);
if (ret  0) {
kvm_put_kvm(kvm);
return ret;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b2d3f3b..54cf9bc 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
mutex_unlock(kvm-lock);
 
return anon_inode_getfd(kvm-spapr-tce, kvm_spapr_tce_fops,
-   stt, O_RDWR);
+   stt, O_RDWR | O_CLOEXEC);
 
 fail:
if (stt) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e8d51cb..3503829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct 
kvm_allocate_rma *ret)
if (!ri)
return -ENOMEM;
 
-   fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR);
+   fd = anon_inode_getfd(kvm-rma, kvm_rma_fops, ri, O_RDWR | O_CLOEXEC);
if (fd  0)
kvm_release_rma(ri);
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html