Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Jan Kiszka
On 2015-04-07 08:10, Valentine Sinitsyn wrote:
 Hi Jan,
 
 On 07.04.2015 10:43, Jan Kiszka wrote:
 On 2015-04-05 19:12, Valentine Sinitsyn wrote:
 Hi Jan,

 On 05.04.2015 13:31, Jan Kiszka wrote:
 studying the VM exit logic of Jailhouse, I was wondering when AMD's
 vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
 these instructions unconditionally. However, I think both only need
 GS.base, i.e. the per-cpu base address, to be saved and restored if no
 user space exit or CPU migration is involved (both are always true for
 Jailhouse). Xen avoids vmload/vmsave on lightweight exits but it also
 still uses rsp-based per-cpu variables.

 So the question boils down to what is generally faster:

 A) vmload
  vmrun
  vmsave

 B) wrmsrl(MSR_GS_BASE, guest_gs_base)
  vmrun
  rdmsrl(MSR_GS_BASE, guest_gs_base)

 Of course, KVM also has to take into account that heavyweight exits
 still require vmload/vmsave, thus become more expensive with B) due to
 the additional MSR accesses.

 Any thoughts or results of previous experiments?
 That's a good question, I also thought about it when I was finalizing
 the Jailhouse AMD port. I tried lightweight exits with apic-demo but it
 didn't seem to affect the latency in any noticeable way. That's why I
 decided not to push the patch (in fact, I was even unable to find it
 now).

 Note however that how AMD chips store host state during VM switches is
 implementation-specific. I did my quick experiments on one CPU only, so
 your mileage may vary.

 Regarding your question, I feel B will be faster anyway, but again I'm
 afraid that the gain could be within the statistical error of the
 experiment.

 It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
 towards 600 if they are colder (added some usleep to each loop in the
 test).
 Great, thanks. Could you post absolute numbers, i.e. how long A and B
 take on your CPU?

A is around 1910 cycles, B about 1750.

Jan
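
For anyone wanting to reproduce such measurements, a minimal
cycle-counting harness could look like the sketch below. This is an
illustrative user-space skeleton only, not the benchmark from this
thread (which was not posted here); the actual body under test, the
vmload/vmrun/vmsave vs. wrmsr/vmrun/rdmsr sequence, can of course only
run in ring 0.

  #include <stdint.h>
  #include <stdio.h>

  static inline uint64_t rdtscp_cycles(void)
  {
      uint32_t lo, hi, aux;

      /* rdtscp waits for all earlier instructions to complete */
      asm volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(aux));
      return ((uint64_t)hi << 32) | lo;
  }

  int main(void)
  {
      enum { RUNS = 100000 };
      uint64_t best = UINT64_MAX;

      for (int i = 0; i < RUNS; i++) {
          uint64_t t0 = rdtscp_cycles();
          /* body under test goes here */
          uint64_t t1 = rdtscp_cycles();
          if (t1 - t0 < best)
              best = t1 - t0;
      }
      printf("best: %llu cycles\n", (unsigned long long)best);
      return 0;
  }

Taking the minimum over many runs approximates the hot-cache numbers
quoted above; the usleep() variant mentioned in the thread would
instead expose the colder-cache cost.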



Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Jan Kiszka
On 2015-04-07 08:19, Valentine Sinitsyn wrote:
 On 07.04.2015 11:13, Jan Kiszka wrote:
 It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
 towards 600 if they are colder (added some usleep to each loop in the
 test).
 Great, thanks. Could you post absolute numbers, i.e. how long A and B
 take on your CPU?

 A is around 1910 cycles, B about 1750.
 It's with hot caches I guess? Not bad anyway; it's a pity I didn't
 observe this and didn't include this optimization from day one.

Yes, that is with the unmodified benchmark I sent. When I add, say,
usleep(1000) to that loop body, the cycles jump to 4k (IIRC).

BTW, this is the Jailhouse patch:
https://github.com/siemens/jailhouse/commit/dbf2fe479ac07a677462dfa87e008e37a4e72858

Jan



Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Jan Kiszka
On 2015-04-07 08:29, Valentine Sinitsyn wrote:
 On 07.04.2015 11:23, Jan Kiszka wrote:
 On 2015-04-07 08:19, Valentine Sinitsyn wrote:
 On 07.04.2015 11:13, Jan Kiszka wrote:
 It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
 towards 600 if they are colder (added some usleep to each loop in the
 test).
 Great, thanks. Could you post absolute numbers, i.e. how long A and B
 take on your CPU?

 A is around 1910 cycles, B about 1750.
 It's with hot caches I guess? Not bad anyway; it's a pity I didn't
 observe this and didn't include this optimization from day one.

 Yes, that is with the unmodified benchmark I sent. When I add, say,
 usleep(1000) to that loop body, the cycles jump to 4k (IIRC).

 BTW, this is the Jailhouse patch:
 https://github.com/siemens/jailhouse/commit/dbf2fe479ac07a677462dfa87e008e37a4e72858

 I guess it's getting off-topic here, but wouldn't it be cleaner to
 simply use wrmsr and rdmsr instead of vmload and vmsave in svm-vmexit.S?
 This would require fewer changes and would keep all entry/exit setup code
 in one place.

It's a tradeoff between assembly lines and C statements. My feeling is
that it's easier done in C, but you can prove me wrong.

Jan


Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Valentine Sinitsyn

Hi Jan,

On 07.04.2015 10:43, Jan Kiszka wrote:

On 2015-04-05 19:12, Valentine Sinitsyn wrote:

Hi Jan,

On 05.04.2015 13:31, Jan Kiszka wrote:

studying the VM exit logic of Jailhouse, I was wondering when AMD's
vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
these instructions unconditionally. However, I think both only need
GS.base, i.e. the per-cpu base address, to be saved and restored if no
user space exit or CPU migration is involved (both are always true for
Jailhouse). Xen avoids vmload/vmsave on lightweight exits but it also
still uses rsp-based per-cpu variables.

So the question boils down to what is generally faster:

A) vmload
 vmrun
 vmsave

B) wrmsrl(MSR_GS_BASE, guest_gs_base)
 vmrun
 rdmsrl(MSR_GS_BASE, guest_gs_base)

Of course, KVM also has to take into account that heavyweight exits
still require vmload/vmsave, thus become more expensive with B) due to
the additional MSR accesses.

Any thoughts or results of previous experiments?

That's a good question, I also thought about it when I was finalizing
the Jailhouse AMD port. I tried lightweight exits with apic-demo but it
didn't seem to affect the latency in any noticeable way. That's why I
decided not to push the patch (in fact, I was even unable to find it now).

Note however that how AMD chips store host state during VM switches is
implementation-specific. I did my quick experiments on one CPU only, so
your mileage may vary.

Regarding your question, I feel B will be faster anyway, but again I'm
afraid that the gain could be within the statistical error of the experiment.


It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
towards 600 if they are colder (added some usleep to each loop in the test).
Great, thanks. Could you post absolute numbers, i.e. how long A and B
take on your CPU?


Valentine


Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Valentine Sinitsyn

On 07.04.2015 11:13, Jan Kiszka wrote:

It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
towards 600 if they are colder (added some usleep to each loop in the
test).

Great, thanks. Could you post absolute numbers, i.e. how long A and B
take on your CPU?


A is around 1910 cycles, B about 1750.
It's with hot caches I guess? Not bad anyway; it's a pity I didn't
observe this and didn't include this optimization from day one.


Valentine


Re: SVM: vmload/vmsave-free VM exits?

2015-04-07 Thread Valentine Sinitsyn

On 07.04.2015 11:23, Jan Kiszka wrote:

On 2015-04-07 08:19, Valentine Sinitsyn wrote:

On 07.04.2015 11:13, Jan Kiszka wrote:

It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
towards 600 if they are colder (added some usleep to each loop in the
test).

 Great, thanks. Could you post absolute numbers, i.e. how long A and B
take on your CPU?


A is around 1910 cycles, B about 1750.

 It's with hot caches I guess? Not bad anyway; it's a pity I didn't
 observe this and didn't include this optimization from day one.


 Yes, that is with the unmodified benchmark I sent. When I add, say,
 usleep(1000) to that loop body, the cycles jump to 4k (IIRC).

BTW, this is the Jailhouse patch:
https://github.com/siemens/jailhouse/commit/dbf2fe479ac07a677462dfa87e008e37a4e72858
I guess it's getting off-topic here, but wouldn't it be cleaner to
simply use wrmsr and rdmsr instead of vmload and vmsave in svm-vmexit.S?
This would require fewer changes and would keep all entry/exit setup code
in one place.


Valentine


Re: [PATCH] KVM: vmx: pass error code with internal error #2

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 21:11, Radim Krčmář wrote:
 Exposing the on-stack error code with internal error is cheap and
 potentially useful.
 
 Signed-off-by: Radim Krčmář rkrc...@redhat.com
 ---
  arch/x86/kvm/vmx.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 0caaf56eb459..cfbd737afcd1 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -5089,9 +5089,10 @@ static int handle_exception(struct kvm_vcpu *vcpu)
   !(is_page_fault(intr_info) && !(error_code & PFERR_RSVD_MASK))) {
   vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
   vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_SIMUL_EX;
 - vcpu->run->internal.ndata = 2;
 + vcpu->run->internal.ndata = 3;
   vcpu->run->internal.data[0] = vect_info;
   vcpu->run->internal.data[1] = intr_info;
 + vcpu->run->internal.data[2] = error_code;
   return 0;
   }
  
 

Applied, thanks.

Paolo


Re: [PATCH 1/3] KVM: x86: optimization: cache physical address width to avoid excessive enumerations of CPUID entries

2015-04-07 Thread Paolo Bonzini


On 29/03/2015 22:56, Eugene Korenevsky wrote:
 cpuid_maxphyaddr(), which performs a lot of memory accesses, is called
 extensively across KVM, especially in nVMX code.
 This patch adds a cached value of maxphyaddr to vcpu.arch to reduce the
 pressure on the CPU cache and simplify the code of cpuid_maxphyaddr()
 callers. The cached value is initialized in kvm_arch_vcpu_init() and
 reloaded every time CPUID is updated by usermode. It is obvious that these
 reloads occur infrequently.
 
 Signed-off-by: Eugene Korenevsky ekorenev...@gmail.com
 ---
  arch/x86/include/asm/kvm_host.h |  4 +++-
  arch/x86/kvm/cpuid.c| 33 ++---
  arch/x86/kvm/cpuid.h|  6 ++
  arch/x86/kvm/x86.c  |  2 ++
  4 files changed, 29 insertions(+), 16 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index a236e39..2362a60 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -431,6 +431,9 @@ struct kvm_vcpu_arch {
  
   int cpuid_nent;
   struct kvm_cpuid_entry2 cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 +
 + int maxphyaddr;
 +
   /* emulate context */
  
   struct x86_emulate_ctxt emulate_ctxt;
 @@ -1128,7 +1131,6 @@ int kvm_unmap_hva_range(struct kvm *kvm, unsigned long 
 start, unsigned long end)
  int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
  int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
  void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 -int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
  int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
  int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
  int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
 index 8a80737..59b69f6 100644
 --- a/arch/x86/kvm/cpuid.c
 +++ b/arch/x86/kvm/cpuid.c
 @@ -104,6 +104,9 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
  ((best->eax & 0xff00) >> 8) != 0)
   return -EINVAL;
  
 + /* Update physical-address width */
 + vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
 +
   kvm_pmu_cpuid_update(vcpu);
   return 0;
  }
 @@ -135,6 +138,21 @@ static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
   }
  }
  
 +int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu)
 +{
 +	struct kvm_cpuid_entry2 *best;
 +
 +	best = kvm_find_cpuid_entry(vcpu, 0x80000000, 0);
 +	if (!best || best->eax < 0x80000008)
 +		goto not_found;
 +	best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
 +	if (best)
 +		return best->eax & 0xff;
 +not_found:
 +	return 36;
 +}
 +EXPORT_SYMBOL_GPL(cpuid_query_maxphyaddr);
 +
  /* when an old userspace process fills a new kernel module */
  int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
struct kvm_cpuid *cpuid,
 @@ -757,21 +775,6 @@ struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
  }
  EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry);
  
 -int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
 -{
 -	struct kvm_cpuid_entry2 *best;
 -
 -	best = kvm_find_cpuid_entry(vcpu, 0x80000000, 0);
 -	if (!best || best->eax < 0x80000008)
 -		goto not_found;
 -	best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
 -	if (best)
 -		return best->eax & 0xff;
 -not_found:
 -	return 36;
 -}
 -EXPORT_SYMBOL_GPL(cpuid_maxphyaddr);
 -
  /*
   * If no match is found, check whether we exceed the vCPU's limit
   * and return the content of the highest valid _standard_ leaf instead.
 diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
 index 4452eed..78b61b4 100644
 --- a/arch/x86/kvm/cpuid.h
 +++ b/arch/x86/kvm/cpuid.h
 @@ -20,6 +20,12 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu,
 struct kvm_cpuid_entry2 __user *entries);
  void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx);
  
 +int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu);
 +
 +static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
 +{
 + return vcpu->arch.maxphyaddr;
 +}
  
  static inline bool guest_cpuid_has_xsave(struct kvm_vcpu *vcpu)
  {
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index bd7a70b..084e1d5 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -7289,6 +7289,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
  vcpu->arch.guest_supported_xcr0 = 0;
  vcpu->arch.guest_xstate_size = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
 
 + vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
 +
   kvm_async_pf_hash_reset(vcpu);
   kvm_pmu_init(vcpu);
  
 

Applied series, thanks.

Paolo


Re: [PATCH] x86: vdso: fix pvclock races with task migration

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 20:44, Radim Krčmář wrote:
 If we were migrated right after __getcpu, but before reading the
 migration_count, we wouldn't notice that we read TSC of a different
 VCPU, nor that KVM's bug made pvti invalid, as only migration_count
 on source VCPU is increased.
 
 Change vdso instead of updating migration_count on destination.
 
 Fixes: 0a4e6be9ca17 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
 Cc: sta...@vger.kernel.org
 Signed-off-by: Radim Krčmář rkrc...@redhat.com
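
For context, the race has the shape sketched below. This is a
self-contained user-space analogue of the fixed vDSO retry loop, not
the actual patch (the real code lives in the x86 vDSO and uses the
pvclock structures):

  #include <stdatomic.h>
  #include <stdint.h>

  struct pvti {
      atomic_uint version;   /* odd while the host updates the record */
      uint64_t    tsc_value; /* stand-in for the real pvclock fields */
  };

  static uint64_t read_clock(struct pvti *per_cpu_pvti,
                             unsigned (*getcpu)(void))
  {
      unsigned cpu, cpu1, version;
      uint64_t val;

      do {
          cpu = getcpu();
          version = atomic_load(&per_cpu_pvti[cpu].version);
          val = per_cpu_pvti[cpu].tsc_value;
          /* Re-sample the CPU: a migration between getcpu() and the
           * data read would otherwise go unnoticed, which is exactly
           * the race described above. */
          cpu1 = getcpu();
      } while ((version & 1) ||
               version != atomic_load(&per_cpu_pvti[cpu].version) ||
               cpu != cpu1);

      return val;
  }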

Applying this, but removing the Fixes tag because a guest patch cannot
fix a host patch (it can work around it or complement it).

Paolo


[PATCH v3 0/7] vhost: support for cross endian guests

2015-04-07 Thread Greg Kurz
Hi,

This patchset allows vhost to be used with legacy virtio when guest and host
have a different endianness.

Patches 1-6 remain the same as the previous post. Patch 7 was heavily changed
according to MST's comments.

---

Greg Kurz (7):
  virtio: introduce virtio_is_little_endian() helper
  tun: add tun_is_little_endian() helper
  macvtap: introduce macvtap_is_little_endian() helper
  vringh: introduce vringh_is_little_endian() helper
  vhost: introduce vhost_is_little_endian() helper
  virtio: add explicit big-endian support to memory accessors
  vhost: feature to set the vring endianness


 drivers/net/macvtap.c|   11 ++--
 drivers/net/tun.c|   11 ++--
 drivers/vhost/Kconfig|   10 +++
 drivers/vhost/vhost.c|   55 ++
 drivers/vhost/vhost.h|   34 +++
 include/linux/virtio_byteorder.h |   24 ++---
 include/linux/virtio_config.h|   19 +
 include/linux/vringh.h   |   19 +
 include/uapi/linux/vhost.h   |5 +++
 9 files changed, 156 insertions(+), 32 deletions(-)

--
Greg



[PATCH v3 3/7] macvtap: introduce macvtap_is_little_endian() helper

2015-04-07 Thread Greg Kurz
Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 drivers/net/macvtap.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 27ecc5c..a2f2958 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -49,14 +49,19 @@ struct macvtap_queue {
 
 #define MACVTAP_VNET_LE 0x80000000
 
+static inline bool macvtap_is_little_endian(struct macvtap_queue *q)
+{
+   return q->flags & MACVTAP_VNET_LE;
+}
+
 static inline u16 macvtap16_to_cpu(struct macvtap_queue *q, __virtio16 val)
 {
-   return __virtio16_to_cpu(q->flags & MACVTAP_VNET_LE, val);
+   return __virtio16_to_cpu(macvtap_is_little_endian(q), val);
 }
 
 static inline __virtio16 cpu_to_macvtap16(struct macvtap_queue *q, u16 val)
 {
-   return __cpu_to_virtio16(q->flags & MACVTAP_VNET_LE, val);
+   return __cpu_to_virtio16(macvtap_is_little_endian(q), val);
 }
 
 static struct proto macvtap_proto = {



[PATCH v3 2/7] tun: add tun_is_little_endian() helper

2015-04-07 Thread Greg Kurz
Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 drivers/net/tun.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 857dca4..3c3d6c0 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -206,14 +206,19 @@ struct tun_struct {
u32 flow_count;
 };
 
+static inline bool tun_is_little_endian(struct tun_struct *tun)
+{
+   return tun->flags & TUN_VNET_LE;
+}
+
 static inline u16 tun16_to_cpu(struct tun_struct *tun, __virtio16 val)
 {
-   return __virtio16_to_cpu(tun->flags & TUN_VNET_LE, val);
+   return __virtio16_to_cpu(tun_is_little_endian(tun), val);
 }
 
 static inline __virtio16 cpu_to_tun16(struct tun_struct *tun, u16 val)
 {
-   return __cpu_to_virtio16(tun->flags & TUN_VNET_LE, val);
+   return __cpu_to_virtio16(tun_is_little_endian(tun), val);
 }
 
 static inline u32 tun_hashfn(u32 rxhash)



[PATCH v3 1/7] virtio: introduce virtio_is_little_endian() helper

2015-04-07 Thread Greg Kurz
Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 include/linux/virtio_config.h |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index ca3ed78..bd1a582 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -205,35 +205,40 @@ int virtqueue_set_affinity(struct virtqueue *vq, int cpu)
return 0;
 }
 
+static inline bool virtio_is_little_endian(struct virtio_device *vdev)
+{
+   return virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
+}
+
  /* Memory accessors */
  static inline u16 virtio16_to_cpu(struct virtio_device *vdev, __virtio16 val)
  {
 -   return __virtio16_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __virtio16_to_cpu(virtio_is_little_endian(vdev), val);
  }
  
  static inline __virtio16 cpu_to_virtio16(struct virtio_device *vdev, u16 val)
  {
 -   return __cpu_to_virtio16(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __cpu_to_virtio16(virtio_is_little_endian(vdev), val);
  }
  
  static inline u32 virtio32_to_cpu(struct virtio_device *vdev, __virtio32 val)
  {
 -   return __virtio32_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __virtio32_to_cpu(virtio_is_little_endian(vdev), val);
  }
  
  static inline __virtio32 cpu_to_virtio32(struct virtio_device *vdev, u32 val)
  {
 -   return __cpu_to_virtio32(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __cpu_to_virtio32(virtio_is_little_endian(vdev), val);
  }
  
  static inline u64 virtio64_to_cpu(struct virtio_device *vdev, __virtio64 val)
  {
 -   return __virtio64_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __virtio64_to_cpu(virtio_is_little_endian(vdev), val);
  }
  
  static inline __virtio64 cpu_to_virtio64(struct virtio_device *vdev, u64 val)
  {
 -   return __cpu_to_virtio64(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
 +   return __cpu_to_virtio64(virtio_is_little_endian(vdev), val);
  }
 
 /* Config space accessors. */



[PATCH v3 6/7] virtio: add explicit big-endian support to memory accessors

2015-04-07 Thread Greg Kurz
The current memory accessors logic is:
- little endian if little_endian
- native endian (i.e. no byteswap) if !little_endian

If we want to fully support cross-endian vhost, we also need to be
able to convert to big endian.

Instead of changing the little_endian argument to some 3-value enum, this
patch changes the logic to:
- little endian if little_endian
- big endian if !little_endian

The native endian case is handled by all users with a trivial helper. This
patch doesn't change any functionality, nor does it add overhead.
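
To make the two-way logic concrete, here is a self-contained
illustration for one width (a simplified re-implementation, not the
kernel headers; __BYTE_ORDER__ is the GCC/Clang predefined macro):

  #include <stdbool.h>
  #include <stdint.h>

  /* the trivial helper: legacy (native-endian) virtio is little endian
   * exactly when the host is */
  static bool legacy_is_little_endian(void)
  {
      return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
  }

  static uint16_t bswap16(uint16_t v)
  {
      return (uint16_t)((v >> 8) | (v << 8));
  }

  /* two-way flag: little endian if set, big endian otherwise; no
   * byteswap is needed when the wire format matches the host */
  static uint16_t virtio16_to_host(bool little_endian, uint16_t wire)
  {
      return little_endian == legacy_is_little_endian()
             ? wire : bswap16(wire);
  }

A native-endian (legacy) user simply passes legacy_is_little_endian()
as the flag, which is the trivial helper mentioned above.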

Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 drivers/net/macvtap.c|4 +++-
 drivers/net/tun.c|4 +++-
 drivers/vhost/vhost.h|4 +++-
 include/linux/virtio_byteorder.h |   24 ++--
 include/linux/virtio_config.h|4 +++-
 include/linux/vringh.h   |4 +++-
 6 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index a2f2958..0a03a66 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -51,7 +51,9 @@ struct macvtap_queue {
 
 static inline bool macvtap_is_little_endian(struct macvtap_queue *q)
 {
-   return q->flags & MACVTAP_VNET_LE;
+   if (q->flags & MACVTAP_VNET_LE)
+   return true;
+   return virtio_legacy_is_little_endian();
 }
 
 static inline u16 macvtap16_to_cpu(struct macvtap_queue *q, __virtio16 val)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3c3d6c0..053f9b6 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -208,7 +208,9 @@ struct tun_struct {
 
 static inline bool tun_is_little_endian(struct tun_struct *tun)
 {
-   return tun->flags & TUN_VNET_LE;
+   if (tun->flags & TUN_VNET_LE)
+   return true;
+   return virtio_legacy_is_little_endian();
 }
 
 static inline u16 tun16_to_cpu(struct tun_struct *tun, __virtio16 val)
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 6a49960..4e9a186 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -175,7 +175,9 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
 
 static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq)
 {
-   return vhost_has_feature(vq, VIRTIO_F_VERSION_1);
+   if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
+   return true;
+   return virtio_legacy_is_little_endian();
 }
 
 /* Memory accessors */
diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
index 51865d0..ce63a2c 100644
--- a/include/linux/virtio_byteorder.h
+++ b/include/linux/virtio_byteorder.h
@@ -3,17 +3,21 @@
 #include <linux/types.h>
 #include <uapi/linux/virtio_types.h>
 
-/*
- * Low-level memory accessors for handling virtio in modern little endian and in
- * compatibility native endian format.
- */
+static inline bool virtio_legacy_is_little_endian(void)
+{
+#ifdef __LITTLE_ENDIAN
+   return true;
+#else
+   return false;
+#endif
+}
 
 static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
 {
if (little_endian)
return le16_to_cpu((__force __le16)val);
else
-   return (__force u16)val;
+   return be16_to_cpu((__force __be16)val);
 }
 
 static inline __virtio16 __cpu_to_virtio16(bool little_endian, u16 val)
@@ -21,7 +25,7 @@ static inline __virtio16 __cpu_to_virtio16(bool little_endian, u16 val)
if (little_endian)
return (__force __virtio16)cpu_to_le16(val);
else
-   return (__force __virtio16)val;
+   return (__force __virtio16)cpu_to_be16(val);
 }
 
 static inline u32 __virtio32_to_cpu(bool little_endian, __virtio32 val)
@@ -29,7 +33,7 @@ static inline u32 __virtio32_to_cpu(bool little_endian, __virtio32 val)
if (little_endian)
return le32_to_cpu((__force __le32)val);
else
-   return (__force u32)val;
+   return be32_to_cpu((__force __be32)val);
 }
 
 static inline __virtio32 __cpu_to_virtio32(bool little_endian, u32 val)
@@ -37,7 +41,7 @@ static inline __virtio32 __cpu_to_virtio32(bool little_endian, u32 val)
if (little_endian)
return (__force __virtio32)cpu_to_le32(val);
else
-   return (__force __virtio32)val;
+   return (__force __virtio32)cpu_to_be32(val);
 }
 
 static inline u64 __virtio64_to_cpu(bool little_endian, __virtio64 val)
@@ -45,7 +49,7 @@ static inline u64 __virtio64_to_cpu(bool little_endian, __virtio64 val)
if (little_endian)
return le64_to_cpu((__force __le64)val);
else
-   return (__force u64)val;
+   return be64_to_cpu((__force __be64)val);
 }
 
 static inline __virtio64 __cpu_to_virtio64(bool little_endian, u64 val)
@@ -53,7 +57,7 @@ static inline __virtio64 __cpu_to_virtio64(bool little_endian, u64 val)
	if (little_endian)
		return (__force __virtio64)cpu_to_le64(val);
	else
-		return (__force __virtio64)val;
+		return (__force __virtio64)cpu_to_be64(val);

[PATCH v3 5/7] vhost: introduce vhost_is_little_endian() helper

2015-04-07 Thread Greg Kurz
Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 drivers/vhost/vhost.h |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8c1c792..6a49960 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -173,34 +173,39 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
	return vq->acked_features & (1ULL << bit);
 }
 
+static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq)
+{
+   return vhost_has_feature(vq, VIRTIO_F_VERSION_1);
+}
+
 /* Memory accessors */
 static inline u16 vhost16_to_cpu(struct vhost_virtqueue *vq, __virtio16 val)
 {
-   return __virtio16_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __virtio16_to_cpu(vhost_is_little_endian(vq), val);
 }
 
 static inline __virtio16 cpu_to_vhost16(struct vhost_virtqueue *vq, u16 val)
 {
-   return __cpu_to_virtio16(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __cpu_to_virtio16(vhost_is_little_endian(vq), val);
 }
 
 static inline u32 vhost32_to_cpu(struct vhost_virtqueue *vq, __virtio32 val)
 {
-   return __virtio32_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __virtio32_to_cpu(vhost_is_little_endian(vq), val);
 }
 
 static inline __virtio32 cpu_to_vhost32(struct vhost_virtqueue *vq, u32 val)
 {
-   return __cpu_to_virtio32(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __cpu_to_virtio32(vhost_is_little_endian(vq), val);
 }
 
 static inline u64 vhost64_to_cpu(struct vhost_virtqueue *vq, __virtio64 val)
 {
-   return __virtio64_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __virtio64_to_cpu(vhost_is_little_endian(vq), val);
 }
 
 static inline __virtio64 cpu_to_vhost64(struct vhost_virtqueue *vq, u64 val)
 {
-   return __cpu_to_virtio64(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+   return __cpu_to_virtio64(vhost_is_little_endian(vq), val);
 }
 #endif



[PATCH v3 7/7] vhost: feature to set the vring endianness

2015-04-07 Thread Greg Kurz
This patch brings cross-endian support to vhost when used to implement
legacy virtio devices. Since it is a relatively rare situation, the
feature availability is controlled by a kernel config option (not set
by default).

The ioctls introduced by this patch are for legacy only: they return
EPERM for virtio 1.0 devices.
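
From user space, driving these ioctls could look like the sketch below.
This is hypothetical usage of the RFC interface: the
VHOST_SET_VRING_ENDIAN_LEGACY / VHOST_GET_VRING_ENDIAN_LEGACY names are
the ones proposed in this very patch, so a uapi header with the patch
applied is assumed and error handling is minimal.

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <unistd.h>
  #include <linux/vhost.h> /* assumes this patch applied */

  int main(void)
  {
      int fd = open("/dev/vhost-net", O_RDWR);
      if (fd < 0) {
          perror("open");
          return 1;
      }

      /* declare vring 0 of a legacy device little-endian */
      struct vhost_vring_state s = { .index = 0, .num = 1 };
      if (ioctl(fd, VHOST_SET_VRING_ENDIAN_LEGACY, &s) < 0)
          perror("set endian"); /* EPERM for virtio 1.0 devices */

      /* read the current setting back */
      if (ioctl(fd, VHOST_GET_VRING_ENDIAN_LEGACY, &s) == 0)
          printf("vring %u is %s-endian\n", s.index,
                 s.num ? "little" : "big");

      close(fd);
      return 0;
  }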

Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 drivers/vhost/Kconfig  |   10 
 drivers/vhost/vhost.c  |   55 
 drivers/vhost/vhost.h  |   17 +-
 include/uapi/linux/vhost.h |5 
 4 files changed, 86 insertions(+), 1 deletion(-)

Changes since v2:
- fixed typos in Kconfig description
- renamed vq-legacy_big_endian to vq-legacy_is_little_endian
- vq-legacy_is_little_endian reset to default in vhost_vq_reset()
- dropped VHOST_F_SET_ENDIAN_LEGACY feature
- dropped struct vhost_vring_endian from the user API (re-use
  struct vhost_vring_state instead)
- added VHOST_GET_VRING_ENDIAN_LEGACY ioctl
- introduced more helpers and stubs to avoid polluting the code with ifdefs


diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 017a1e8..0aec88c 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -32,3 +32,13 @@ config VHOST
---help---
  This option is selected by any driver which needs to access
  the core of vhost.
+
+config VHOST_SET_ENDIAN_LEGACY
+   bool "Cross-endian support for host kernel accelerator"
+   default n
+   ---help---
+ This option allows vhost to support guests with a different byte
+ ordering from host. It is disabled by default since it adds overhead
+ and it is only needed by a few platforms (powerpc and arm).
+
+ If unsure, say N.
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ee2826..3529a3c 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -199,6 +199,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
	vq->call = NULL;
	vq->log_ctx = NULL;
	vq->memory = NULL;
+   vq->legacy_is_little_endian = virtio_legacy_is_little_endian();
 }
 
 static int vhost_worker(void *data)
@@ -630,6 +631,54 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
return 0;
 }
 
+#ifdef CONFIG_VHOST_SET_ENDIAN_LEGACY
+static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
+ void __user *argp)
+{
+   struct vhost_vring_state s;
+
+   if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
+   return -EPERM;
+
+   if (copy_from_user(&s, argp, sizeof(s)))
+   return -EFAULT;
+
+   vq->legacy_is_little_endian = !!s.num;
+   return 0;
+}
+
+static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
+ u32 idx,
+ void __user *argp)
+{
+   struct vhost_vring_state s = {
+   .index = idx,
+   .num = vq->legacy_is_little_endian
+   };
+
+   if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
+   return -EPERM;
+
+   if (copy_to_user(argp, &s, sizeof(s)))
+   return -EFAULT;
+
+   return 0;
+}
+#else
+static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
+ void __user *argp)
+{
+   return 0;
+}
+
+static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
+ u32 idx,
+ void __user *argp)
+{
+   return 0;
+}
+#endif /* CONFIG_VHOST_SET_ENDIAN_LEGACY */
+
 long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
 {
struct file *eventfp, *filep = NULL;
@@ -806,6 +855,12 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
} else
filep = eventfp;
break;
+   case VHOST_SET_VRING_ENDIAN_LEGACY:
+   r = vhost_set_vring_endian_legacy(vq, argp);
+   break;
+   case VHOST_GET_VRING_ENDIAN_LEGACY:
+   r = vhost_get_vring_endian_legacy(vq, idx, argp);
+   break;
default:
r = -ENOIOCTLCMD;
}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 4e9a186..981ba06 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -106,6 +106,9 @@ struct vhost_virtqueue {
/* Log write descriptors */
void __user *log_base;
struct vhost_log *log;
+
+   /* We need to know the device endianness with legacy virtio. */
+   bool legacy_is_little_endian;
 };
 
 struct vhost_dev {
@@ -173,11 +176,23 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
	return vq->acked_features & (1ULL << bit);
 }
 
+#ifdef CONFIG_VHOST_SET_ENDIAN_LEGACY
+static inline bool vhost_legacy_is_little_endian(struct vhost_virtqueue *vq)
+{

[PATCH v3 4/7] vringh: introduce vringh_is_little_endian() helper

2015-04-07 Thread Greg Kurz
Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
---
 include/linux/vringh.h |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index a3fa537..3ed62ef 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -226,33 +226,38 @@ static inline void vringh_notify(struct vringh *vrh)
vrh-notify(vrh);
 }
 
+static inline bool vringh_is_little_endian(const struct vringh *vrh)
+{
+   return vrh->little_endian;
+}
+
 static inline u16 vringh16_to_cpu(const struct vringh *vrh, __virtio16 val)
 {
-   return __virtio16_to_cpu(vrh->little_endian, val);
+   return __virtio16_to_cpu(vringh_is_little_endian(vrh), val);
 }
 
 static inline __virtio16 cpu_to_vringh16(const struct vringh *vrh, u16 val)
 {
-   return __cpu_to_virtio16(vrh->little_endian, val);
+   return __cpu_to_virtio16(vringh_is_little_endian(vrh), val);
 }
 
 static inline u32 vringh32_to_cpu(const struct vringh *vrh, __virtio32 val)
 {
-   return __virtio32_to_cpu(vrh->little_endian, val);
+   return __virtio32_to_cpu(vringh_is_little_endian(vrh), val);
 }
 
 static inline __virtio32 cpu_to_vringh32(const struct vringh *vrh, u32 val)
 {
-   return __cpu_to_virtio32(vrh->little_endian, val);
+   return __cpu_to_virtio32(vringh_is_little_endian(vrh), val);
 }
 
 static inline u64 vringh64_to_cpu(const struct vringh *vrh, __virtio64 val)
 {
-   return __virtio64_to_cpu(vrh->little_endian, val);
+   return __virtio64_to_cpu(vringh_is_little_endian(vrh), val);
 }
 
 static inline __virtio64 cpu_to_vringh64(const struct vringh *vrh, u64 val)
 {
-   return __cpu_to_virtio64(vrh->little_endian, val);
+   return __cpu_to_virtio64(vringh_is_little_endian(vrh), val);
 }
 #endif /* _LINUX_VRINGH_H */



Re: [PATCH v2 4/4] KVM: x86: simplify kvm_apic_map

2015-04-07 Thread Paolo Bonzini


On 12/02/2015 19:41, Radim Krčmář wrote:
 +static inline void
 +apic_logical_id(struct kvm_apic_map *map, u32 dest_id, u16 *cid, u16 *lid)
 +{
 + BUILD_BUG_ON(KVM_APIC_MODE_XAPIC_CLUSTER !=  4);
 + BUILD_BUG_ON(KVM_APIC_MODE_XAPIC_FLAT!=  8);
 + BUILD_BUG_ON(KVM_APIC_MODE_X2APIC!= 16);
 +

I added

unsigned lid_bits = map->mode;

here (used in the rest of apic_logical_id) and applied the series.

Thanks,

Paolo

 + *cid = dest_id >> map->mode;
 + *lid = dest_id & ((1 << map->mode) - 1);
 +}
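
A worked example of this decomposition (self-contained user-space
sketch; the mode values mirror the BUILD_BUG_ON constants above):

  #include <stdint.h>
  #include <stdio.h>

  enum { XAPIC_CLUSTER = 4, XAPIC_FLAT = 8, X2APIC = 16 };

  static void logical_id(unsigned mode, uint32_t dest_id,
                         uint16_t *cid, uint16_t *lid)
  {
      *cid = dest_id >> mode;              /* cluster id: the high bits */
      *lid = dest_id & ((1u << mode) - 1); /* logical id: low 'mode' bits */
  }

  int main(void)
  {
      uint16_t cid, lid;

      logical_id(XAPIC_CLUSTER, 0x12, &cid, &lid);
      printf("cluster %#x lid %#x\n", cid, lid); /* cluster 0x1, lid 0x2 */
      return 0;
  }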


Re: [PATCH] x86: vdso: fix pvclock races with task migration

2015-04-07 Thread Radim Krčmář
2015-04-07 13:11+0200, Paolo Bonzini:
 On 02/04/2015 20:44, Radim Krčmář wrote:
  If we were migrated right after __getcpu, but before reading the
  migration_count, we wouldn't notice that we read TSC of a different
  VCPU, nor that KVM's bug made pvti invalid, as only migration_count
  on source VCPU is increased.
  
  Change vdso instead of updating migration_count on destination.
  
  Fixes: 0a4e6be9ca17 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
  Cc: sta...@vger.kernel.org
  Signed-off-by: Radim Krčmář rkrc...@redhat.com
 
 Applying this, but removing the Fixes tag because a guest patch cannot
 fix a host patch (it can work around it or complement it).

I think it was correct.  Both are guest only, the revert just missed
some races.  (0a4e6be9ca17 has a misleading commit message ...)


Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()

2015-04-07 Thread Paolo Bonzini


On 03/03/2015 21:42, Radim Krčmář wrote:
 2015-03-03 13:48-0600, Joel Schopp:
 +  unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
 Shouldn't we handle writes to EAX differently than to AX and AL, because
 of the implicit zero extension?
 I don't think the implicit zero extension hurts us here, but maybe there
 is something I'm missing that I need to understand. Could you explain this
 further?
 
 According to APM vol.2, 2.5.3 Operands and Results, when using EAX,
 we should zero upper 32 bits of RAX:
 
   Zero Extension of Results. In 64-bit mode, when performing 32-bit
   operations with a GPR destination, the processor zero-extends the 32-bit
   result into the full 64-bit destination. Both 8-bit and 16-bit
   operations on GPRs preserve all unwritten upper bits of the destination
   GPR. This is consistent with legacy 16-bit and 32-bit semantics for
   partial-width results.
 
 Is IN not covered?

It is.  You need to zero the upper 32 bits.
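
In code, the size-dependent RAX update being discussed could look like
this sketch (illustrative only; it assumes 'size' is the I/O width in
bytes and is not the final patch):

  #include <stdint.h>

  static uint64_t update_rax(uint64_t old_rax, uint64_t in_val, int size)
  {
      switch (size) {
      case 1: /* AL: preserve the upper 56 bits */
          return (old_rax & ~0xffULL) | (in_val & 0xff);
      case 2: /* AX: preserve the upper 48 bits */
          return (old_rax & ~0xffffULL) | (in_val & 0xffff);
      case 4: /* EAX: zero-extend into the full RAX */
          return in_val & 0xffffffffULL;
      default: /* full RAX */
          return in_val;
      }
  }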

 +  BUG_ON(!vcpu->arch.pio.count);
 +  BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
 (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
  sufficient.)
 I prefer the checks that are there now after your last review,
 especially since surrounded by BUG_ON they only run on debug kernels.
 
 BUG_ON is checked on essentially all kernels that run KVM.
 (All distribution-based configs should have it.)

Correct.

 If we wanted to validate the size, then this is strictly better:
   BUG_ON(vcpu->arch.pio.count != 1 || vcpu->arch.pio.size > sizeof(new_rax))

That would be a very weird assertion considering that
vcpu->arch.pio.size will architecturally be at most 4.

The first arm of the || is sufficient.

 +  memcpy(&new_rax, vcpu->arch.pio_data, sizeof(new_rax));
 +  trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
 +                vcpu->arch.pio.count, vcpu->arch.pio_data);
 +  kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
 +  vcpu->arch.pio.count = 0;
 I think it is better to call emulator_pio_in_emulated directly, like

 emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
                     vcpu->arch.pio.port, &new_rax, 1);
 kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);

 because we know that vcpu-arch.pio.count != 0.
 
 Pasting the same code creates bug opportunities when we forget to modify
 all places.  This class of problems can be harder to deal with than (c)
 and (d), because we can't simply print all callers.

I agree with this and prefer calling emulator_pio_in_emulated in
complete_fast_pio_in, indeed.

 Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
 (A better name is always welcome.)

No need for that.

 The pointer chasing is making me dizzy.  I'm not sure why
 emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does it
 immediately translate that to a vcpu and never use the x86_emulate_ctxt,
 why not pass the vcpu in the first place?

Because the emulator is written to be usable outside the Linux kernel as
well.

Also, the fast path (used if kernel_pio returns 0) doesn't read
VCPU_REGS_RAX, thus using an uninitialized variable here:

 +   unsigned long val;
 +   int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
 +                                      port, &val, 1);
 +
 +   if (ret)
 +   kvm_register_write(vcpu, VCPU_REGS_RAX, val);
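
One way to address that observation, as a sketch (assuming the intent
is to seed val with the current RAX so the unwritten upper bytes stay
defined; the 32-bit zero-extension case discussed earlier would still
need separate handling):

  /* seed val from RAX so a 1- or 2-byte fast-path IN does not write
   * back uninitialized upper bytes */
  unsigned long val = kvm_register_read(vcpu, VCPU_REGS_RAX);
  int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
                                     port, &val, 1);

  if (ret)
      kvm_register_write(vcpu, VCPU_REGS_RAX, val);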

Thanks,

Paolo


Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 02:10, Nadav Amit wrote:
 diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
 index 155534c..1ef4c0d 100644
 --- a/arch/x86/kvm/svm.c
 +++ b/arch/x86/kvm/svm.c
 @@ -1195,7 +1195,7 @@ static void init_vmcb(struct vcpu_svm *svm)
   enable_gif(svm);
  }
  
 -static void svm_vcpu_reset(struct kvm_vcpu *vcpu)
 +static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
  {
   struct vcpu_svm *svm = to_svm(vcpu);
   u32 dummy;

Please move this code:

svm->vcpu.arch.apic_base = APIC_DEFAULT_PHYS_BASE |
                           MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_reset_bsp(&svm->vcpu))
	svm->vcpu.arch.apic_base |= MSR_IA32_APICBASE_BSP;


from svm_create_vcpu to svm_vcpu_reset, so that it can be wrapped with
if (!init_event) as in the VMX case.

Paolo


Re: [RFC PATCH 2/2] KVM: task switch: generate #DB trap if TSS.T is set

2015-04-07 Thread Paolo Bonzini


On 28/03/2015 23:27, Eugene Korenevsky wrote:
 Emulate #DB generation on task switch if TSS.T is set, according to the Intel
 SDM. The processor generates a debug exception after a task switch if the T
 flag of the new task's TSS is set. This exception is generated after program
 control has passed to the new task, and prior to the execution of the first
 instruction of that task. The DR6.BT bit should be set to indicate this
 condition.
 
 Signed-off-by: Eugene Korenevsky ekorenev...@gmail.com
 ---
  arch/x86/include/asm/kvm_host.h |  1 +
  arch/x86/kvm/emulate.c  | 13 +
  arch/x86/kvm/vmx.c  |  5 -
  arch/x86/kvm/x86.c  |  4 
  4 files changed, 18 insertions(+), 5 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index a236e39..981e9ea 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -147,6 +147,7 @@ enum {
  
  #define DR6_BD   (1 << 13)
  #define DR6_BS   (1 << 14)
 +#define DR6_BT   (1 << 15)
  #define DR6_RTM  (1 << 16)
  #define DR6_FIXED_1  0xfffe0ff0
  #define DR6_INIT 0xffff0ff0
 diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
 index 3a494f3..4ef1c27 100644
 --- a/arch/x86/kvm/emulate.c
 +++ b/arch/x86/kvm/emulate.c
 @@ -2783,6 +2783,19 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt,
  ret = __load_segment_descriptor(ctxt, tss->gs, VCPU_SREG_GS, cpl,
                                  X86_TRANSFER_TASK_SWITCH, NULL);
  
 + /*
 +  * The last thing to do is injecting #DB trap if TSS.T bit is set
 +  */
 + if (ret == X86EMUL_CONTINUE && tss->t) {
 + ulong dr6;
 +
 + ctxt->ops->get_dr(ctxt, 6, &dr6);
 + dr6 |= DR6_BT | DR6_FIXED_1 | DR6_RTM;
 + ctxt->ops->set_dr(ctxt, 6, dr6);
 + ctxt->have_exception = true;
 + emulate_db(ctxt);
 + }
 +
   return ret;
  }
  
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index f7b20b4..d922fd8 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -5696,11 +5696,6 @@ static int handle_task_switch(struct kvm_vcpu *vcpu)
   /* clear all local breakpoint enable flags */
  vmcs_writel(GUEST_DR7, vmcs_readl(GUEST_DR7) & ~0x155);
  
 - /*
 -  * TODO: What about debug traps on tss switch?
 -  *   Are we supposed to inject them and update dr6?
 -  */
 -
   return 1;
  }
  
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index bd7a70b..66ac520 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -6736,6 +6736,7 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
  int ret;
 
  init_emulate_ctxt(vcpu);
 + ctxt->have_exception = false;

Please instead add the statement to init_emulate_ctxt, and remove it
from x86_emulate_instruction.

  
   ret = emulator_task_switch(ctxt, tss_selector, idt_index, reason,
  has_error_code, error_code);
 @@ -6745,6 +6746,9 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
 
  kvm_rip_write(vcpu, ctxt->eip);
  kvm_set_rflags(vcpu, ctxt->eflags);
 + /* Generate #DB trap if T bit is set in new TSS */
 + if (ctxt->have_exception && ctxt->exception.vector == DB_VECTOR)
 + kvm_queue_exception(vcpu, DB_VECTOR);

I think you should just call kvm_multiple_exception directly, because
it's also possible that you'd have to inject a #TS exception here.

Paolo

   kvm_make_request(KVM_REQ_EVENT, vcpu);
   return EMULATE_DONE;
  }
 


Re: [PATCH] x86: vdso: fix pvclock races with task migration

2015-04-07 Thread Paolo Bonzini


On 07/04/2015 14:47, Radim Krčmář wrote:
 I think it was correct.  Both are guest only, the revert just missed
 some races.  (0a4e6be9ca17 has a misleading commit message ...)

Oops.  You're right.

Paolo


Re: [PATCH v2 0/4] KVM: x86: Reset fixes

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 02:10, Nadav Amit wrote:
 This set includes 2 previous patches that deal with the INIT flow not being
 distinguished from regular boot, and with allowing the VM to change the BSP
 (which is used in certain testing environments). The next 2 patches are new,
 dealing with a regression that causes DR0-DR3 not to be reset (even when QEMU
 initiates the RESET) and with CR2 not being cleared after INIT.
 
 The second patch regarding BSP requires an additional fix for QEMU, as
 otherwise reset fails. A separate patch was submitted to QEMU mailing-list.
 
 Thanks for reviewing the patches.
 
 Nadav Amit (4):
   KVM: x86: INIT and reset sequences are different
   KVM: x86: BSP in MSR_IA32_APICBASE is writable
   KVM: x86: DR0-DR3 are not clear on reset
   KVM: x86: Clear CR2 on VCPU reset
 
  arch/x86/include/asm/kvm_host.h |  7 ---
  arch/x86/kvm/lapic.c| 13 ++---
  arch/x86/kvm/lapic.h|  2 +-
  arch/x86/kvm/svm.c  |  4 ++--
  arch/x86/kvm/vmx.c  | 30 +-
  arch/x86/kvm/x86.c  | 35 +++
  include/linux/kvm_host.h|  7 ++-
  7 files changed, 63 insertions(+), 35 deletions(-)
 

Applying patches 2-4, thanks.

Paolo


Re: [PATCH v3 7/7] vhost: feature to set the vring endianness

2015-04-07 Thread Cornelia Huck
On Tue, 07 Apr 2015 14:19:31 +0200
Greg Kurz gk...@linux.vnet.ibm.com wrote:

 This patch brings cross-endian support to vhost when used to implement
 legacy virtio devices. Since it is a relatively rare situation, the
 feature availability is controlled by a kernel config option (not set
 by default).
 
 The ioctls introduced by this patch are for legacy only: they return
 EPERM for virtio 1.0 devices.
 
 Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
 ---
  drivers/vhost/Kconfig  |   10 
  drivers/vhost/vhost.c  |   55 
 
  drivers/vhost/vhost.h  |   17 +-
  include/uapi/linux/vhost.h |5 
  4 files changed, 86 insertions(+), 1 deletion(-)

 +#ifdef CONFIG_VHOST_SET_ENDIAN_LEGACY
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s;
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;
 +
 + if (copy_from_user(&s, argp, sizeof(s)))
 + return -EFAULT;
 +
 + vq->legacy_is_little_endian = !!s.num;
 + return 0;
 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s = {
 + .index = idx,
 + .num = vq->legacy_is_little_endian
 + };
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;
 +
 + if (copy_to_user(argp, &s, sizeof(s)))
 + return -EFAULT;
 +
 + return 0;
 +}
 +#else
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + return 0;

I'm wondering whether this handler should return an error if the
feature is not configured for the kernel? How can the userspace caller
find out whether it has successfully prompted the kernel to handle the
endianness correctly?

 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + return 0;
 +}
 +#endif /* CONFIG_VHOST_SET_ENDIAN_LEGACY */



Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 02:10, Nadav Amit wrote:
 x86 architecture defines differences between the reset and INIT sequences.
 INIT does not initialize the FPU (including MMX, XMM, YMM, etc.), TSC, PMU,
 MSRs (in general), MTRRs, machine-check, APIC ID, APIC arbitration ID and BSP.
 
 EFER is supposed NOT to be reset according to the SDM, but leaving the LMA and
 LME untouched causes failed VM-entry.  Therefore we reset EFER (although it is
 unclear whether the rest of the EFER bits should be reset).

Do you get failed VM-entry even if LME=1, LMA=0?  LMA obviously should
be reset, but LME=1/PG=0/PAE=0 is shown as valid in Figure 4-1 of the SDM.

Paolo


Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Paolo Bonzini


On 02/04/2015 04:17, Bandan Das wrote:
  x86 architecture defines differences between the reset and INIT sequences.
  INIT does not initialize the FPU (including MMX, XMM, YMM, etc.), TSC, PMU,
  MSRs (in general), MTRRs, machine-check, APIC ID, APIC arbitration ID and BSP.
 
  EFER is supposed NOT to be reset according to the SDM, but leaving the LMA and
  LME untouched causes failed VM-entry.  Therefore we reset EFER (although it is
  unclear whether the rest of the EFER bits should be reset).
 Thanks! This was actually on my todo list. #INIT and #RESET are actually
 separate pins on the processor. So, shouldn't we differentiate between the two
 too by having (*vcpu_init) and (*vcpu_reset) separate?

I think a bool argument is good enough.  QEMU has different functions,
and init ends up doing save/reset/restore which is pretty ugly.

Paolo


Re: [PATCH v3 7/7] vhost: feature to set the vring endianness

2015-04-07 Thread Michael S. Tsirkin
On Tue, Apr 07, 2015 at 02:19:31PM +0200, Greg Kurz wrote:
 This patch brings cross-endian support to vhost when used to implement
 legacy virtio devices. Since it is a relatively rare situation, the
 feature availability is controlled by a kernel config option (not set
 by default).
 
 The ioctls introduced by this patch are for legacy only: they return
 EPERM for virtio 1.0 devices.
 
 Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com

EINVAL probably makes more sense?

 ---
  drivers/vhost/Kconfig  |   10 
  drivers/vhost/vhost.c  |   55 
 
  drivers/vhost/vhost.h  |   17 +-
  include/uapi/linux/vhost.h |5 
  4 files changed, 86 insertions(+), 1 deletion(-)
 
 Changes since v2:
 - fixed typos in Kconfig description
 - renamed vq-legacy_big_endian to vq-legacy_is_little_endian
 - vq-legacy_is_little_endian reset to default in vhost_vq_reset()
 - dropped VHOST_F_SET_ENDIAN_LEGACY feature
 - dropped struct vhost_vring_endian from the user API (re-use
   struct vhost_vring_state instead)
 - added VHOST_GET_VRING_ENDIAN_LEGACY ioctl
 - introduced more helpers and stubs to avoid polluting the code with ifdefs
 
 
 diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
 index 017a1e8..0aec88c 100644
 --- a/drivers/vhost/Kconfig
 +++ b/drivers/vhost/Kconfig
 @@ -32,3 +32,13 @@ config VHOST
   ---help---
 This option is selected by any driver which needs to access
 the core of vhost.
 +
 +config VHOST_SET_ENDIAN_LEGACY
 + bool "Cross-endian support for host kernel accelerator"
 + default n
 + ---help---
 +   This option allows vhost to support guests with a different byte
 +   ordering from host. It is disabled by default since it adds overhead
 +   and it is only needed by a few platforms (powerpc and arm).
 +
 +   If unsure, say N.
 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index 2ee2826..3529a3c 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -199,6 +199,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
   vq-call = NULL;
   vq-log_ctx = NULL;
   vq-memory = NULL;
 + vq->legacy_is_little_endian = virtio_legacy_is_little_endian();
  }
  
  static int vhost_worker(void *data)
 @@ -630,6 +631,54 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
   return 0;
  }
  
 +#ifdef CONFIG_VHOST_SET_ENDIAN_LEGACY
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s;
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;

EINVAL probably makes more sense? But I'm not sure this
is helpful: one can set VIRTIO_F_VERSION_1 afterwards,
and your patch does not seem to detect this.



 +
 + if (copy_from_user(&s, argp, sizeof(s)))
 + return -EFAULT;
 +
 + vq->legacy_is_little_endian = !!s.num;
 + return 0;
 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s = {
 + .index = idx,
 + .num = vq->legacy_is_little_endian
 + };
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;
 +
 + if (copy_to_user(argp, &s, sizeof(s)))
 + return -EFAULT;
 +
 + return 0;
 +}
 +#else
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + return 0;
 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + return 0;
 +}

Should be -ENOIOCTLCMD?

 +#endif /* CONFIG_VHOST_SET_ENDIAN_LEGACY */
 +
  long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
  {
   struct file *eventfp, *filep = NULL;
 @@ -806,6 +855,12 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
   } else
   filep = eventfp;
   break;
 + case VHOST_SET_VRING_ENDIAN_LEGACY:
 + r = vhost_set_vring_endian_legacy(vq, argp);
 + break;
 + case VHOST_GET_VRING_ENDIAN_LEGACY:
 + r = vhost_get_vring_endian_legacy(vq, idx, argp);
 + break;
   default:
   r = -ENOIOCTLCMD;
   }

I think we also want to forbid this with a running backend.

 diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
 index 4e9a186..981ba06 100644
 --- a/drivers/vhost/vhost.h
 +++ b/drivers/vhost/vhost.h
 @@ -106,6 +106,9 @@ struct vhost_virtqueue {
   /* Log write descriptors */
   void __user *log_base;
   struct vhost_log *log;
 +
 + /* We need to know the device endianness with legacy virtio. */

Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Bandan Das
Paolo Bonzini pbonz...@redhat.com writes:

 On 07/04/2015 18:17, Bandan Das wrote:
  I think a bool argument is good enough.  QEMU has different functions,
  and init ends up doing save/reset/restore which is pretty ugly.
 
 Right, I meant that init could just be a wrapper so that it at least shows up
 in a backtrace - could be helpful for debugging.

 I suspect that the compiler would inline any sensible implementation and
 it wouldn't show up in the backtraces. :(

noinline? :) Anyway, it's probably not worth the trouble; that could be easily
figured out.

Thanks,
Bandan

 Paolo


Re: [PATCH v3 6/7] virtio: add explicit big-endian support to memory accessors

2015-04-07 Thread Michael S. Tsirkin
On Tue, Apr 07, 2015 at 02:15:52PM +0200, Greg Kurz wrote:
 The current memory accessors logic is:
 - little endian if little_endian
 - native endian (i.e. no byteswap) if !little_endian
 
 If we want to fully support cross-endian vhost, we also need to be
 able to convert to big endian.
 
 Instead of changing the little_endian argument to some 3-value enum, this
 patch changes the logic to:
 - little endian if little_endian
 - big endian if !little_endian
 
 The native endian case is handled by all users with a trivial helper. This
 patch doesn't change any functionality, nor does it add overhead.
 
 Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
 ---
  drivers/net/macvtap.c|4 +++-
  drivers/net/tun.c|4 +++-
  drivers/vhost/vhost.h|4 +++-
  include/linux/virtio_byteorder.h |   24 ++--
  include/linux/virtio_config.h|4 +++-
  include/linux/vringh.h   |4 +++-
  6 files changed, 29 insertions(+), 15 deletions(-)
 
 diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
 index a2f2958..0a03a66 100644
 --- a/drivers/net/macvtap.c
 +++ b/drivers/net/macvtap.c
 @@ -51,7 +51,9 @@ struct macvtap_queue {
  
  static inline bool macvtap_is_little_endian(struct macvtap_queue *q)
  {
 - return q->flags & MACVTAP_VNET_LE;
 + if (q->flags & MACVTAP_VNET_LE)
 + return true;
 + return virtio_legacy_is_little_endian();
  }
  
  static inline u16 macvtap16_to_cpu(struct macvtap_queue *q, __virtio16 val)

Hmm I'm not sure how well this will work once you
actually make it dynamic.
Remains to be seen.

 diff --git a/drivers/net/tun.c b/drivers/net/tun.c
 index 3c3d6c0..053f9b6 100644
 --- a/drivers/net/tun.c
 +++ b/drivers/net/tun.c
 @@ -208,7 +208,9 @@ struct tun_struct {
  
  static inline bool tun_is_little_endian(struct tun_struct *tun)
  {
 - return tun->flags & TUN_VNET_LE;
 + if (tun->flags & TUN_VNET_LE)
 + return true;
 + return virtio_legacy_is_little_endian();
  }
  
  static inline u16 tun16_to_cpu(struct tun_struct *tun, __virtio16 val)
 diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
 index 6a49960..4e9a186 100644
 --- a/drivers/vhost/vhost.h
 +++ b/drivers/vhost/vhost.h
 @@ -175,7 +175,9 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
  
  static inline bool vhost_is_little_endian(struct vhost_virtqueue *vq)
  {
 - return vhost_has_feature(vq, VIRTIO_F_VERSION_1);
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return true;
 + return virtio_legacy_is_little_endian();
  }
  
  /* Memory accessors */
 diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
 index 51865d0..ce63a2c 100644
 --- a/include/linux/virtio_byteorder.h
 +++ b/include/linux/virtio_byteorder.h
 @@ -3,17 +3,21 @@
  #include <linux/types.h>
  #include <uapi/linux/virtio_types.h>
  
 -/*
 - * Low-level memory accessors for handling virtio in modern little endian and in
 - * compatibility native endian format.
 - */
 +static inline bool virtio_legacy_is_little_endian(void)
 +{
 +#ifdef __LITTLE_ENDIAN
 + return true;
 +#else
 + return false;
 +#endif
 +}
  
  static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
  {
   if (little_endian)
   return le16_to_cpu((__force __le16)val);
   else
 - return (__force u16)val;
 + return be16_to_cpu((__force __be16)val);
  }
  
  static inline __virtio16 __cpu_to_virtio16(bool little_endian, u16 val)
 @@ -21,7 +25,7 @@ static inline __virtio16 __cpu_to_virtio16(bool little_endian, u16 val)
   if (little_endian)
   return (__force __virtio16)cpu_to_le16(val);
   else
 - return (__force __virtio16)val;
 + return (__force __virtio16)cpu_to_be16(val);
  }
  
  static inline u32 __virtio32_to_cpu(bool little_endian, __virtio32 val)
 @@ -29,7 +33,7 @@ static inline u32 __virtio32_to_cpu(bool little_endian, __virtio32 val)
   if (little_endian)
   return le32_to_cpu((__force __le32)val);
   else
 - return (__force u32)val;
 + return be32_to_cpu((__force __be32)val);
  }
  
  static inline __virtio32 __cpu_to_virtio32(bool little_endian, u32 val)
 @@ -37,7 +41,7 @@ static inline __virtio32 __cpu_to_virtio32(bool little_endian, u32 val)
   if (little_endian)
   return (__force __virtio32)cpu_to_le32(val);
   else
 - return (__force __virtio32)val;
 + return (__force __virtio32)cpu_to_be32(val);
  }
  
  static inline u64 __virtio64_to_cpu(bool little_endian, __virtio64 val)
 @@ -45,7 +49,7 @@ static inline u64 __virtio64_to_cpu(bool little_endian, __virtio64 val)
   if (little_endian)
   return le64_to_cpu((__force __le64)val);
   else
 - return (__force u64)val;
 + return be64_to_cpu((__force __be64)val);
  }
  
  static inline 
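
To make the dispatch above concrete: with this patch, byteswapping is decided
entirely by the little_endian flag, and a caller that wants the old native
behaviour passes virtio_legacy_is_little_endian(). A sketch under that
assumption (the example_* names are illustrative, not from the patch):

    #include <linux/virtio_byteorder.h>

    /* VERSION_1 devices are always little endian; legacy devices follow
     * the host byte order unless explicitly overridden. */
    static bool example_is_little_endian(bool version_1, bool legacy_le)
    {
            if (version_1)
                    return true;
            return legacy_le;
    }

    static u16 example_read16(bool version_1, bool legacy_le, __virtio16 raw)
    {
            /* The byteswap, if any, happens inside the accessor. */
            return __virtio16_to_cpu(example_is_little_endian(version_1,
                                                              legacy_le), raw);
    }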

Re: [GIT PULL 0/5] arm/arm64: KVM: Fixes for KVM for 4.0-rc5

2015-04-07 Thread Paolo Bonzini


On 16/03/2015 13:54, Christoffer Dall wrote:
 Hi Marcelo and Paolo,
 
 Please pull the following fixes for KVM/ARM for 4.0-rc5.
 
 They fix page refcounting issues in our Stage-2 page table management code, a
 missing unlock in a gicv3 error path, and a race that can cause lost 
 interrupts
 if signals are pending just prior to entering the guest.
 
 The following changes since commit bfb8fb4775d3397908ae3a7ff65807097d81d713:
 
   Merge tag 'kvm-s390-master-20150303' of 
 git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux (2015-03-05 
 14:42:48 -0300)
 
 are available in the git repository at:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git 
 kvm-arm-fixes-4.0-rc5
 
 for you to fetch changes up to ae705930fca6322600690df9dc1c7d0516145a93:
 
   arm/arm64: KVM: Keep elrsr/aisr in sync with software model (2015-03-14 
 13:42:07 +0100)
 
 
 Thanks,
 -Christoffer
 
 ---
 
 Christoffer Dall (1):
   arm/arm64: KVM: Keep elrsr/aisr in sync with software model
 
 Marc Zyngier (3):
   arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting
   arm64: KVM: Do not use pgd_index to index stage-2 pgd
   arm64: KVM: Fix outdated comment about VTCR_EL2.PS
 
 Wei Yongjun (1):
   arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()
 
  arch/arm/include/asm/kvm_mmu.h   | 13 ---
  arch/arm/kvm/mmu.c   | 75 
 
  arch/arm64/include/asm/kvm_arm.h |  5 +--
  arch/arm64/include/asm/kvm_mmu.h | 48 -
  include/kvm/arm_vgic.h   |  1 +
  virt/kvm/arm/vgic-v2.c   |  8 +
  virt/kvm/arm/vgic-v3.c   |  8 +
  virt/kvm/arm/vgic.c  | 22 ++--
  8 files changed, 105 insertions(+), 75 deletions(-)
 

I pulled this into kvm/next now.  The conflict in virt/kvm/arm/vgic-v3.c
is internal to the KVM tree, so I'll take care of the resolution myself.

Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] kvm: mmu: lazy collapse small sptes into large sptes

2015-04-07 Thread Paolo Bonzini


On 03/04/2015 09:40, Wanpeng Li wrote:
 There are two scenarios that require collapsing small sptes into large sptes.
 - dirty logging tracks sptes at 4k granularity, so large sptes are split;
   the large sptes will be reallocated on the destination machine and the
   guest on the source machine will be destroyed when live migration completes
   successfully. However, the guest on the source machine will continue to run
   if live migration fails for some reason, and the sptes stay small, which
   leads to bad performance.
 - our customers write tools to track the dirty speed of guests by EPT D bit/PML
   in order to determine the most appropriate one to be live migrated; however,
   the sptes will still stay small after the dirty speed has been tracked.
 
 This patch introduces lazy collapsing of small sptes into large sptes: the
 memory region is scanned in the ioctl context when dirty logging is stopped,
 the sptes which can be collapsed into large pages are dropped during the scan,
 and it relies on later #PFs to reallocate all the large sptes.
 
 Reviewed-by: Xiao Guangrong guangrong.x...@linux.intel.com
 Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
 ---
 v2 -> v3:
  * update comments 
  * fix infinite for loop
 v1 -> v2:
  * use 'bool' instead of 'int'
  * add more comments
  * fix failure to get the next spte after dropping the current spte
 
  arch/x86/include/asm/kvm_host.h |  2 ++
  arch/x86/kvm/mmu.c  | 73 +
  arch/x86/kvm/x86.c  | 19 +++
  3 files changed, 94 insertions(+)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index 30b28dc..91b5bdb 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -854,6 +854,8 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
  void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
  void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 struct kvm_memory_slot *memslot);
 +void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 + struct kvm_memory_slot *memslot);
  void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
  void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index cee7592..ba002a0 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -4465,6 +4465,79 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
   kvm_flush_remote_tlbs(kvm);
  }
  
 +static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 + unsigned long *rmapp)
 +{
 + u64 *sptep;
 + struct rmap_iterator iter;
 + int need_tlb_flush = 0;
 + pfn_t pfn;
 + struct kvm_mmu_page *sp;
 +
 + for (sptep = rmap_get_first(*rmapp, &iter); sptep;) {
 + BUG_ON(!(*sptep & PT_PRESENT_MASK));
 +
 + sp = page_header(__pa(sptep));
 + pfn = spte_to_pfn(*sptep);
 +
 + /*
 +  * Let's support EPT only for now; we still need to figure
 +  * out an efficient way to make this code aware of what
 +  * mapping level the guest uses.
 +  */
 + if (sp->role.direct &&
 + !kvm_is_reserved_pfn(pfn) &&
 + PageTransCompound(pfn_to_page(pfn))) {
 + drop_spte(kvm, sptep);
 + sptep = rmap_get_first(*rmapp, &iter);
 + need_tlb_flush = 1;
 + } else
 + sptep = rmap_get_next(&iter);
 + }
 +
 + return need_tlb_flush;
 +}
 +
 +void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 + struct kvm_memory_slot *memslot)
 +{
 + bool flush = false;
 + unsigned long *rmapp;
 + unsigned long last_index, index;
 + gfn_t gfn_start, gfn_end;
 +
 + spin_lock(&kvm->mmu_lock);
 +
 + gfn_start = memslot->base_gfn;
 + gfn_end = memslot->base_gfn + memslot->npages - 1;
 +
 + if (gfn_start >= gfn_end)
 + goto out;
 +
 + rmapp = memslot->arch.rmap[0];
 + last_index = gfn_to_index(gfn_end, memslot->base_gfn,
 + PT_PAGE_TABLE_LEVEL);
 +
 + for (index = 0; index <= last_index; ++index, ++rmapp) {
 + if (*rmapp)
 + flush |= kvm_mmu_zap_collapsible_spte(kvm, rmapp);
 +
 + if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
 + if (flush) {
 + kvm_flush_remote_tlbs(kvm);
 + flush = false;
 + }
 + cond_resched_lock(&kvm->mmu_lock);
 + }
 + }
 +
 + if (flush)
 + kvm_flush_remote_tlbs(kvm);
 +
 +out:
 + spin_unlock(&kvm->mmu_lock);
 +}
 +
  void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
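
    /*
     * The arch/x86/kvm/x86.c hunk is cut off above.  Going by the commit
     * message ("scanned in the ioctl context when dirty logging is
     * stopped"), the call site plausibly looks like this sketch -- the
     * hook name and flag test are assumptions, not the quoted patch:
     */
    static void example_commit_memory_region(struct kvm *kvm,
                                             const struct kvm_memory_slot *old,
                                             struct kvm_memory_slot *new)
    {
            /* Dirty logging just turned off: collapse small sptes once. */
            if ((old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
                !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
                    kvm_mmu_zap_collapsible_sptes(kvm, new);
    }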
  

Re: [PATCH v3 0/7] vhost: support for cross endian guests

2015-04-07 Thread Michael S. Tsirkin
On Tue, Apr 07, 2015 at 02:09:29PM +0200, Greg Kurz wrote:
 Hi,
 
 This patchset allows vhost to be used with legacy virtio when guest and host
 have a different endianness.
 
 Patches 1-6 remain the same as the previous post. Patch 7 was heavily changed
 according to MST's comments.

This still doesn't actually work, right?
tun and macvtap need new ioctls too ...

 ---
 
 Greg Kurz (7):
   virtio: introduce virtio_is_little_endian() helper
   tun: add tun_is_little_endian() helper
   macvtap: introduce macvtap_is_little_endian() helper
   vringh: introduce vringh_is_little_endian() helper
   vhost: introduce vhost_is_little_endian() helper
   virtio: add explicit big-endian support to memory accessors
   vhost: feature to set the vring endianness
 
 
  drivers/net/macvtap.c|   11 ++--
  drivers/net/tun.c|   11 ++--
  drivers/vhost/Kconfig|   10 +++
  drivers/vhost/vhost.c|   55 ++
  drivers/vhost/vhost.h|   34 +++
  include/linux/virtio_byteorder.h |   24 ++---
  include/linux/virtio_config.h|   19 +
  include/linux/vringh.h   |   19 +
  include/uapi/linux/vhost.h   |5 +++
  9 files changed, 156 insertions(+), 32 deletions(-)
 
 --
 Greg
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Bandan Das
Paolo Bonzini pbonz...@redhat.com writes:

 On 02/04/2015 04:17, Bandan Das wrote:
  x86 architecture defines differences between the reset and INIT sequences.
  INIT does not initialize the FPU (including MMX, XMM, YMM, etc.), TSC, 
  PMU,
  MSRs (in general), MTRRs, machine-check, APIC ID, APIC arbitration ID and 
  BSP.
 
  EFER is supposed NOT to be reset according to the SDM, but leaving the 
  LMA and
  LME untouched causes failed VM-entry.  Therefore we reset EFER (although 
  it is
  unclear whether the rest of EFER bits should be reset).
 Thanks! This was actually on my to-do list. #INIT and #RESET are actually
 separate pins on the processor. So, shouldn't we also differentiate between
 the two by having (*vcpu_init) and (*vcpu_reset) separate?

 I think a bool argument is good enough.  QEMU has different functions,
 and init ends up doing save/reset/restore which is pretty ugly.

 Right, I meant that init could just be a wrapper so that it at least shows up in
a backtrace - could be helpful for debugging.

 Paolo
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] vhost: support for cross endian guests

2015-04-07 Thread Greg Kurz
On Tue, 7 Apr 2015 17:55:08 +0200
Michael S. Tsirkin m...@redhat.com wrote:

 On Tue, Apr 07, 2015 at 02:09:29PM +0200, Greg Kurz wrote:
  Hi,
  
  This patchset allows vhost to be used with legacy virtio when guest and host
  have a different endianness.
  
  Patches 1-6 remain the same as the previous post. Patch 7 was heavily 
  changed
  according to MST's comments.
 
 This still doesn't actually work, right?
 tun and macvtap need new ioctls too ...
 

Yes they do. I already have a patch but I wasn't sure if I should send it
along with this series... Since it looks like there will be a v4, I'll add the
tun/macvtap patch.

Thanks.

--
Greg

  ---
  
  Greg Kurz (7):
virtio: introduce virtio_is_little_endian() helper
tun: add tun_is_little_endian() helper
macvtap: introduce macvtap_is_little_endian() helper
vringh: introduce vringh_is_little_endian() helper
vhost: introduce vhost_is_little_endian() helper
virtio: add explicit big-endian support to memory accessors
vhost: feature to set the vring endianness
  
  
   drivers/net/macvtap.c|   11 ++--
   drivers/net/tun.c|   11 ++--
   drivers/vhost/Kconfig|   10 +++
   drivers/vhost/vhost.c|   55 ++
   drivers/vhost/vhost.h|   34 +++
   include/linux/virtio_byteorder.h |   24 ++---
   include/linux/virtio_config.h|   19 +
   include/linux/vringh.h   |   19 +
   include/uapi/linux/vhost.h   |5 +++
   9 files changed, 156 insertions(+), 32 deletions(-)
  
  --
  Greg
 

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] KVM: x86: INIT and reset sequences are different

2015-04-07 Thread Paolo Bonzini


On 07/04/2015 18:17, Bandan Das wrote:
  I think a bool argument is good enough.  QEMU has different functions,
  and init ends up doing save/reset/restore which is pretty ugly.
 
 Right, I meant that init could just be a wrapper so that it at least shows up 
 in
 a backtrace - could be helpful for debugging.

I suspect that the compiler would inline any sensible implementation and
it wouldn't show up in the backtraces. :(

Paolo
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 7/7] vhost: feature to set the vring endianness

2015-04-07 Thread Michael S. Tsirkin
On Tue, Apr 07, 2015 at 02:19:31PM +0200, Greg Kurz wrote:
 This patch brings cross-endian support to vhost when used to implement
 legacy virtio devices. Since it is a relatively rare situation, the
 feature availability is controlled by a kernel config option (not set
 by default).
 
 The ioctls introduced by this patch are for legacy devices only: for
 virtio 1.0 devices they return EPERM.
 
 Signed-off-by: Greg Kurz gk...@linux.vnet.ibm.com
 ---
  drivers/vhost/Kconfig  |   10 
  drivers/vhost/vhost.c  |   55 
 
  drivers/vhost/vhost.h  |   17 +-
  include/uapi/linux/vhost.h |5 
  4 files changed, 86 insertions(+), 1 deletion(-)
 
 Changes since v2:
 - fixed typos in Kconfig description
 - renamed vq->legacy_big_endian to vq->legacy_is_little_endian
 - vq->legacy_is_little_endian reset to default in vhost_vq_reset()
 - dropped VHOST_F_SET_ENDIAN_LEGACY feature
 - dropped struct vhost_vring_endian from the user API (re-use
   struct vhost_vring_state instead)
 - added VHOST_GET_VRING_ENDIAN_LEGACY ioctl
 - introduced more helpers and stubs to avoid polluting the code with ifdefs
 
 
 diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
 index 017a1e8..0aec88c 100644
 --- a/drivers/vhost/Kconfig
 +++ b/drivers/vhost/Kconfig
 @@ -32,3 +32,13 @@ config VHOST
   ---help---
 This option is selected by any driver which needs to access
 the core of vhost.
 +
 +config VHOST_SET_ENDIAN_LEGACY
 + bool "Cross-endian support for host kernel accelerator"
 + default n
 + ---help---
 +   This option allows vhost to support guests with a different byte
 +   ordering from host. It is disabled by default since it adds overhead
 +   and it is only needed by a few platforms (powerpc and arm).
 +
 +   If unsure, say N.
 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index 2ee2826..3529a3c 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -199,6 +199,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
  vq->call = NULL;
  vq->log_ctx = NULL;
  vq->memory = NULL;
 + vq->legacy_is_little_endian = virtio_legacy_is_little_endian();
  }
  
  static int vhost_worker(void *data)
 @@ -630,6 +631,54 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
   return 0;
  }
  
 +#ifdef CONFIG_VHOST_SET_ENDIAN_LEGACY
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s;
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;
 +
 + if (copy_from_user(&s, argp, sizeof(s)))
 + return -EFAULT;
 +
 + vq->legacy_is_little_endian = !!s.num;
 + return 0;
 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + struct vhost_vring_state s = {
 + .index = idx,
 + .num = vq->legacy_is_little_endian
 + };
 +
 + if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
 + return -EPERM;
 +
 + if (copy_to_user(argp, &s, sizeof(s)))
 + return -EFAULT;
 +
 + return 0;
 +}
 +#else
 +static long vhost_set_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   void __user *argp)
 +{
 + return 0;
 +}
 +
 +static long vhost_get_vring_endian_legacy(struct vhost_virtqueue *vq,
 +   u32 idx,
 +   void __user *argp)
 +{
 + return 0;
 +}
 +#endif /* CONFIG_VHOST_SET_ENDIAN_LEGACY */
 +
  long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
  {
   struct file *eventfp, *filep = NULL;
 @@ -806,6 +855,12 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
   } else
   filep = eventfp;
   break;
 + case VHOST_SET_VRING_ENDIAN_LEGACY:
 + r = vhost_set_vring_endian_legacy(vq, argp);
 + break;
 + case VHOST_GET_VRING_ENDIAN_LEGACY:
 + r = vhost_get_vring_endian_legacy(vq, idx, argp);
 + break;
   default:
   r = -ENOIOCTLCMD;
   }
 diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
 index 4e9a186..981ba06 100644
 --- a/drivers/vhost/vhost.h
 +++ b/drivers/vhost/vhost.h
 @@ -106,6 +106,9 @@ struct vhost_virtqueue {
   /* Log write descriptors */
   void __user *log_base;
   struct vhost_log *log;
 +
 + /* We need to know the device endianness with legacy virtio. */
 + bool legacy_is_little_endian;
  };
  
  struct vhost_dev {
 @@ -173,11 +176,23 @@ static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
  return vq->acked_features & (1ULL << bit);
  }
  
 +#ifdef 
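
For reference, a minimal userspace sketch of driving the proposed ioctl on a
vhost device fd, assuming the VHOST_SET_VRING_ENDIAN_LEGACY definition from
this patch's uapi change (not quoted above). It must run before
VIRTIO_F_VERSION_1 is negotiated, or the kernel side above returns -EPERM,
and, per the review comments, should also happen before the backend runs:

    #include <stdbool.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    /* E.g. a little-endian legacy guest on a big-endian ppc64 host. */
    static int set_legacy_vring_le(int vhost_fd, unsigned int idx, bool is_le)
    {
            struct vhost_vring_state s = {
                    .index = idx,
                    .num = is_le,
            };

            return ioctl(vhost_fd, VHOST_SET_VRING_ENDIAN_LEGACY, &s);
    }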

[PATCH] KVM: dirty all pages in kvm_write_guest_cached()

2015-04-07 Thread Radim Krčmář
We dirtied only one page because writes originally couldn't span more.
Use improved syntax for '>> PAGE_SHIFT' while at it.

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář rkrc...@redhat.com
---
 The function handles cross memslot writes in a different path.

 I think we should dirty pages after partial writes too (r < len),
 but it probably won't happen and I already started refactoring :)

 virt/kvm/kvm_main.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aadef264bed1..863df9dcab6f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1665,6 +1665,7 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 {
struct kvm_memslots *slots = kvm_memslots(kvm);
int r;
+   gfn_t gfn;
 
	BUG_ON(len > ghc->len);
 
@@ -1680,7 +1681,10 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
	r = __copy_to_user((void __user *)ghc->hva, data, len);
	if (r)
		return -EFAULT;
-	mark_page_dirty_in_slot(kvm, ghc->memslot, ghc->gpa >> PAGE_SHIFT);
+
+	for (gfn = gpa_to_gfn(ghc->gpa);
+	     gfn <= gpa_to_gfn(ghc->gpa + len - 1); gfn++)
+		mark_page_dirty_in_slot(kvm, ghc->memslot, gfn);
 
return 0;
 }
-- 
2.3.4
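
A quick standalone illustration of why the loop is needed: with 4 KiB pages, a
write that starts just before a page boundary touches two gfns, and the old
single mark_page_dirty_in_slot() call only marked the first (plain userspace
arithmetic, not kernel code):

    #include <inttypes.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
            uint64_t gpa = 0x1ff8;  /* 8 bytes before a page boundary */
            uint64_t len = 16;      /* the write spans into the next page */

            /* Same bounds as the patched loop: gfns 1 and 2 get dirtied. */
            for (uint64_t gfn = gpa >> PAGE_SHIFT;
                 gfn <= (gpa + len - 1) >> PAGE_SHIFT; gfn++)
                    printf("dirty gfn %" PRIu64 "\n", gfn);
            return 0;
    }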

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/20] MIPS: Clear [MSA]FPE CSR.Cause after notify_die()

2015-04-07 Thread Maciej W. Rozycki
On Wed, 11 Mar 2015, James Hogan wrote:

 When handling floating point exceptions (FPEs) and MSA FPEs the Cause
 bits of the appropriate control and status register (FCSR for FPEs and
 MSACSR for MSA FPEs) are read and cleared before enabling interrupts,
 presumably so that it doesn't have to go through the pain of restoring
 those bits if the process is pre-empted, since writing those bits would
 cause another immediate exception while still in the kernel.

 Another reason is that MIPS I processors (and for the record I believe the 
R6000 MIPS II implementation as well) signal FPA exceptions using an 
ordinary interrupt, seen on one of the IP[7:2] bits in the CP0 Cause 
register.  Consequently reenabling interrupts without first clearing at 
least all the unmasked FCSR.Cause bits would retrigger the interrupt and 
cause an infinite loop.

 We don't ever mask the FPA interrupt with the relevant IM[7:2] bit in the 
CP0 Status register, because for simplicity we reuse the whole of the FPE 
exception handling path for FPA interrupts as well.  Except that we enter 
it via the `handle_fpe_int' alternative entry point rather than the usual 
`handle_fpe' one, bypassing the register save sequence as it was already 
done by the interrupt handler.

  Maciej
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3.13.y-ckt 022/156] KVM: MIPS: Fix trace event to save PC directly

2015-04-07 Thread Kamal Mostafa
3.13.11-ckt19 -stable review patch.  If anyone has any objections, please let 
me know.

--

From: James Hogan james.ho...@imgtec.com

commit b3cffac04eca9af46e1e23560a8ee22b1bd36d43 upstream.

Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.

Let's save the actual PC in the structure so that the correct value is
accessible later.

Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan james.ho...@imgtec.com
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Ralf Baechle r...@linux-mips.org
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Gleb Natapov g...@kernel.org
Cc: Steven Rostedt rost...@goodmis.org
Cc: Ingo Molnar mi...@redhat.com
Cc: linux-m...@linux-mips.org
Cc: kvm@vger.kernel.org
Acked-by: Steven Rostedt rost...@goodmis.org
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Kamal Mostafa ka...@canonical.com
---
 arch/mips/kvm/trace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/kvm/trace.h b/arch/mips/kvm/trace.h
index bc9e0f4..e51621e 100644
--- a/arch/mips/kvm/trace.h
+++ b/arch/mips/kvm/trace.h
@@ -26,18 +26,18 @@ TRACE_EVENT(kvm_exit,
TP_PROTO(struct kvm_vcpu *vcpu, unsigned int reason),
TP_ARGS(vcpu, reason),
TP_STRUCT__entry(
-   __field(struct kvm_vcpu *, vcpu)
+   __field(unsigned long, pc)
__field(unsigned int, reason)
),
 
TP_fast_assign(
-	__entry->vcpu = vcpu;
+	__entry->pc = vcpu->arch.pc;
 	__entry->reason = reason;
),
 
	TP_printk("[%s]PC: 0x%08lx",
		  kvm_mips_exit_types_str[__entry->reason],
-		  __entry->vcpu->arch.pc)
+		  __entry->pc)
 );
 
 #endif /* _TRACE_KVM_H */
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Correct way to enable BusMaster with VFIO?

2015-04-07 Thread Wei Hu
Hi Alex,

With your change "Release devices with BusMaster disabled", I've found
that my VFIO device driver is no longer receiving MSI interrupts.
After reviewing the code I think it makes sense. But I had two
questions below while debugging my issue.

1.  If I had set the bus master bit in the command register by hand
before opening the vfio device, the kernel would actually leave
BusMaster enabled. This seems to contradict the call to
pci_clear_master() from vfio_pci_enable(). What's going on here, is
something else enabling BusMaster?

2. What's the recommended way to enable BusMaster with your change
now? Should my driver map the config space region and set the
BusMaster bit? Or should I have a separate command to enable the bit
before opening the device?

Thank you,
Wei
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
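
For question 2, a sketch of enabling Bus Master through the VFIO config region
on the already-open device fd (standard VFIO uapi calls; error handling
trimmed, and whether vfio-pci passes the write through to hardware is exactly
what the question above is probing):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/pci_regs.h>
    #include <linux/vfio.h>

    static int vfio_enable_busmaster(int device_fd)
    {
            struct vfio_region_info reg = {
                    .argsz = sizeof(reg),
                    .index = VFIO_PCI_CONFIG_REGION_INDEX,
            };
            uint16_t cmd;

            /* Locate the config space region within the device fd. */
            if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
                    return -1;

            /* Read-modify-write the PCI command register. */
            if (pread(device_fd, &cmd, sizeof(cmd),
                      reg.offset + PCI_COMMAND) != sizeof(cmd))
                    return -1;
            cmd |= PCI_COMMAND_MASTER;
            if (pwrite(device_fd, &cmd, sizeof(cmd),
                       reg.offset + PCI_COMMAND) != sizeof(cmd))
                    return -1;
            return 0;
    }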