About preemption timer

2013-12-17 Thread Arthur Chunqi Li
Hi Jan and Paolo,

I've tried to use the preemption timer in KVM to trap vcpus regularly,
but I see something unexpected. I run a VM with 4 vcpus and give them
the same preemption timer value (e.g. 100) with all bits set
(activate/save bits), then reset the value in the preemption time-out
handler.

Thus I expected these vcpus to trap regularly, in some fixed pattern.
But I found that when the VM is not busy, some vcpus are trapped much
less frequently than others. In the Intel SDM, I noticed that the
preemption timer is only related to the TSC, so I think all the vcpus
should trap at a similar frequency.

Could you help me explain this phenomenon?

Thanks,
Arthur

-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem after update windows VirtIO drivers

2013-12-17 Thread Carlos Rodrigues
The problem is still the same.

Regards,

-- 
Carlos Rodrigues 

Senior Software Engineer

Eurotux Informática, S.A. | www.eurotux.com

(t) +351 253 680 300 (m) +351 911 926 110


On Sat, 2013-12-14 at 08:30 +1100, Vadim Rozenfeld wrote:
 On Fri, 2013-12-13 at 14:35 +, Carlos Rodrigues wrote:
  Another test that I ran: with 1 vCPU the problem is
  reproducible, but if I increase to 2 vCPUs, the Windows Server reboots
  without any problem.
  
  Regards,
   
 Can you try 1 vCPU without virtio-serial?
 
 Thanks,
 Vadim.
 


Updated Elvis Upstreaming Roadmap

2013-12-17 Thread Razya Ladelsky
Hi,

Thank you all for your comments.
I'm sorry for taking this long to reply; I was away on vacation.

It was a good, long discussion, and many issues were raised. We'd like
to address them with the following proposed roadmap for the Elvis patches.
In general, we believe it would be best to start with patches that are
as simple as possible, providing the basic Elvis functionality,
and attend to the more complicated issues in subsequent patches.

Here's the road map for Elvis patches: 

1. Shared vhost thread for multiple devices.

The way to go here, we believe, is to start with a patch that has a
shared vhost thread for multiple devices of the SAME VM.
The next step/patch may be handling VMs belonging to the same cgroup.

Finally, we need to extend the functionality so that the shared vhost
thread serves multiple VMs (not necessarily belonging to the same cgroup).

There was a lot of discussion about how to enforce cgroup policies, and
we will consider the various solutions in a future patch.

2. Creation of vhost threads

We suggested two ways of controlling the creation and removal of vhost
threads:
- statically: determining the maximum number of virtio devices per worker
via a kernel module parameter
- dynamically: a sysfs mechanism to add and remove vhost threads

It seems that it would be simplest to take the static approach as
a first stage. At a second stage (next patch), we'll advance to
dynamically changing the number of vhost threads, using the static
module parameter only as a default value.

Regarding cmwq, it is an interesting mechanism, which we need to explore
further.
At the moment we prefer not to change the vhost model to use cmwq, as some
of the issues that were discussed, such as cgroups, are not supported by
cmwq, and this would add more complexity.
However, we'll look further into it, and consider it at a later stage.

3. Adding polling mode to vhost 

Making polling adaptive is a good idea, based on various factors such as
the I/O rate, the guest kick overhead (which is the tradeoff of polling),
or the amount of wasted cycles (cycles spent polling while no new work was
added).
However, as a first polling patch, we would prefer a naive
polling approach, which could be tuned with later patches.

4. vhost statistics 

The issue that was raised for the vhost statistics was using ftrace 
instead of the debugfs mechanism.
However, looking further into the kvm stat mechanism, we learned that 
ftrace didn't replace the plain debugfs mechanism, but was used in 
addition to it.
 
We propose to continue using debugfs for statistics, in a manner similar
to kvm, and at some point in the future ftrace can be added to vhost as
well.
 
Does this plan look o.k.?
If there are no further comments, I'll start preparing the patches 
according to what we've agreed on thus far.
Thank you,
Razya



Re: [PATCH 04/15] emulator: Exclude test_lgdt_lidt from building

2013-12-17 Thread Paolo Bonzini
On 16/12/2013 10:57, Jan Kiszka wrote:
 See commit 47c1461a5.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  x86/emulator.c | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/x86/emulator.c b/x86/emulator.c
 index 68d2b93..4e70e8f 100644
 --- a/x86/emulator.c
 +++ b/x86/emulator.c
 @@ -843,6 +843,7 @@ static void test_string_io_mmio(volatile uint8_t *mem)
   report("string_io_mmio", mmio[1023] == 0x99);
  }
  
 +/* kvm doesn't allow lidt/lgdt from mmio, so the test is disabled
  static void test_lgdt_lidt(volatile uint8_t *mem)
  {
  struct descriptor_table_ptr orig, fresh = {};
 @@ -871,6 +872,7 @@ static void test_lgdt_lidt(volatile uint8_t *mem)
  sti();
  report("lidt (long address)", orig.limit == fresh.limit && orig.base == fresh.base);
  }
 +*/

I'm changing this to #if 0 ... #endif when applying.

Paolo
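
A minimal sketch of one likely reason to prefer #if 0 ... #endif over a block comment for disabling a test. The thread does not state the reason, so this is an assumption: C block comments do not nest, so a comment inside the disabled function would end the outer comment early, while the preprocessor guard is unaffected. All names below are hypothetical stand-ins, not the kvm-unit-tests code.

```c
#include <stdio.h>

/* Set to 1 only if the disabled demo test is ever compiled in and run. */
static int disabled_test_ran = 0;

#if 0
/* A block comment inside the disabled region is harmless here, but it
 * would terminate an enclosing C comment at its closing delimiter. */
static void test_lgdt_lidt_demo(void)
{
    disabled_test_ran = 1;
}
#endif

/* Runs every test that is compiled in and returns how many ran. */
static int run_enabled_tests(void)
{
    int count = 0;

#if 0
    test_lgdt_lidt_demo();
    count++;
#endif

    puts("enabled demo test ran"); /* stands in for a test that stays on */
    count++;
    return count;
}
```

Calling run_enabled_tests() from a test driver runs only the code outside the #if 0 regions; flipping the guards to #if 1 re-enables the test without touching its body.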


Re: [PATCH 06/15] lib/x86/smp: Fix compiler warnings

2013-12-17 Thread Paolo Bonzini
On 16/12/2013 10:57, Jan Kiszka wrote:
 Add missing include of desc.h for prototypes of setup_idt and
 set_idt_entry and cast away the volatile of ipi_data - it's not volatile
 while we run the IPI handler.

Why not?

Paolo

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  lib/x86/smp.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/lib/x86/smp.c b/lib/x86/smp.c
 index d4c8106..75ac081 100644
 --- a/lib/x86/smp.c
 +++ b/lib/x86/smp.c
 @@ -3,6 +3,7 @@
  #include "smp.h"
  #include "apic.h"
  #include "fwcfg.h"
 +#include "desc.h"
  
  #define IPI_VECTOR 0x20
  
 @@ -18,7 +19,7 @@ static int _cpu_count;
  static __attribute__((used)) void ipi()
  {
  void (*function)(void *data) = ipi_function;
 -void *data = ipi_data;
 +void *data = (void *)ipi_data;
  bool wait = ipi_wait;
  
  if (!wait) {
 



Re: [PATCH 06/15] lib/x86/smp: Fix compiler warnings

2013-12-17 Thread Paolo Bonzini
On 16/12/2013 10:57, Jan Kiszka wrote:
 Add missing include of desc.h for prototypes of setup_idt and
 set_idt_entry and cast away the volatile of ipi_data - it's not volatile
 while we run the IPI handler.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com

The right fix is to change the declaration from

  static volatile void *ipi_data;  // Pointer to volatile void

to

  static void *volatile ipi_data;  // Volatile pointer to void

Paolo
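
A small sketch of the difference between the two declarations, written as a standalone illustration rather than the actual lib/x86/smp.c code. The variable and function names are hypothetical. Reading a `void *volatile` yields a plain `void *`, so no cast is needed; reading a `volatile void *` yields a pointer whose pointee is still volatile, so assigning it to `void *` discards a qualifier.

```c
#include <stddef.h>

/* Pointer to volatile data: the pointee is volatile, the pointer is not. */
static volatile void *ptr_to_volatile;

/* Volatile pointer to plain data: the pointer variable itself is volatile,
 * so every read of it must really be performed. */
static void *volatile volatile_ptr;

static int payload = 42;

/* Stores the same address through both declarations. */
static void publish(void *p)
{
    volatile_ptr = p;
    ptr_to_volatile = p;
}

/* Mirrors the handler's read after the fix: no cast is required. */
static void *read_volatile_ptr(void)
{
    void *data = volatile_ptr;
    return data;
}

/* With the old declaration the assignment would drop the pointee's
 * volatile qualifier, so a cast is needed to keep the compiler quiet. */
static void *read_ptr_to_volatile(void)
{
    void *data = (void *)ptr_to_volatile;
    return data;
}
```

The second form is the one that matches the intent described in the thread: the pointer value is what another CPU changes, not the data it points to.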


 ---
  lib/x86/smp.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/lib/x86/smp.c b/lib/x86/smp.c
 index d4c8106..75ac081 100644
 --- a/lib/x86/smp.c
 +++ b/lib/x86/smp.c
 @@ -3,6 +3,7 @@
  #include "smp.h"
  #include "apic.h"
  #include "fwcfg.h"
 +#include "desc.h"
  
  #define IPI_VECTOR 0x20
  
 @@ -18,7 +19,7 @@ static int _cpu_count;
  static __attribute__((used)) void ipi()
  {
  void (*function)(void *data) = ipi_function;
 -void *data = ipi_data;
 +void *data = (void *)ipi_data;
  bool wait = ipi_wait;
  
  if (!wait) {
 



[PATCH kvm-unit-tests] svm: fix warning

2013-12-17 Thread Paolo Bonzini
x86/svm.c:534:18: warning: variable ‘data’ set but not used

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 x86/svm.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/x86/svm.c b/x86/svm.c
index d51e7ec..9d910ae 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -11,7 +11,7 @@ u64 *pml4e;
 u64 *pdpe;
 u64 *pde[4];
 u64 *pte[2048];
-u64 *scratch_page;
+void *scratch_page;
 
 #define LATENCY_RUNS 100
 
@@ -531,9 +531,7 @@ static void npt_us_prepare(struct test *test)
 
 static void npt_us_test(struct test *test)
 {
-volatile u64 data;
-
-data = *scratch_page;
+(void) *(volatile u64 *)scratch_page;
 }
 
 static bool npt_us_check(struct test *test)
-- 
1.8.4.2



Re: How virtio-blk latency can be measured

2013-12-17 Thread Stefan Hajnoczi
On Sun, Dec 15, 2013 at 04:20:05PM +0800, Qingshu Chen wrote:
 Hi, I want to calculate the latency of virtio in kvm. I read the document at
 http://www.linux-kvm.org/page/Virtio/Block/Latency and ran into a problem.
 1. When calculating latency in kvm, the document says kvm_pio and
 kvm_set_irq can be filtered using the commands 'lspci -vv -nn' and 'cat
 /proc/interrupts', but I don't know how to use the results to filter kvm_pio
 and kvm_set_irq.

Hi Qingshu,
I wrote that wiki page.  The commands on the wiki are:

  cd /sys/kernel/debug/tracing
  echo 'port == 0xc090' > events/kvm/kvm_pio/filter
  echo 'gsi == 26' > events/kvm/kvm_set_irq/filter
  echo 1 > events/kvm/kvm_pio/enable
  echo 1 > events/kvm/kvm_set_irq/enable
  cat trace_pipe > /tmp/trace

Here is how you can find the right port and gsi values to filter:

1. The 'kvm_pio' event is used to trace guest->host virtqueue
   notification.  Looking at drivers/virtio/virtio_pci.c reveals that
   the port is ioaddr + VIRTIO_PCI_QUEUE_NOTIFY (16).

   Here is an example:
   # lspci -vv -nn
   00:04.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device [1af4:1001]
   [...]
           Region 0: I/O ports at c080 [size=64]

   Therefore you need to filter on 0xc080 + 0x10 = 0xc090 to trace
   VIRTIO_PCI_QUEUE_NOTIFY accesses.

2. The 'kvm_set_irq' event is used to trace host->guest virtqueue
   notifications.

   Find the virtio driver instance:
   # ls /sys/block/vda/device/driver/
   bind  module  unbind  virtio1

   Now look up the virtqueue interrupt for 'virtio1':
   # cat /proc/interrupts
   [...]
   185:  35129  0  0  0  0  0  8760  0   PCI-MSI-X  virtio1-requests

   So gsi is 185.
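
The arithmetic in the two steps above can be sketched as a small helper that builds the filter strings from the values read out of lspci and /proc/interrupts. This is only an illustration: the function names are hypothetical, and the offset 16 is the VIRTIO_PCI_QUEUE_NOTIFY value quoted in step 1.

```c
#include <stdint.h>
#include <stdio.h>

/* Offset of the queue-notify register within the virtio-pci I/O region,
 * as quoted in step 1 above. */
#define VIRTIO_PCI_QUEUE_NOTIFY 16

/* I/O port to filter on, given the base of "Region 0" from lspci. */
static uint16_t notify_port(uint16_t ioaddr)
{
    return (uint16_t)(ioaddr + VIRTIO_PCI_QUEUE_NOTIFY);
}

/* Writes the kvm_pio filter expression into buf; returns snprintf's result. */
static int format_pio_filter(char *buf, size_t len, uint16_t ioaddr)
{
    return snprintf(buf, len, "port == 0x%x", (unsigned)notify_port(ioaddr));
}

/* Writes the kvm_set_irq filter expression into buf. */
static int format_irq_filter(char *buf, size_t len, unsigned gsi)
{
    return snprintf(buf, len, "gsi == %u", gsi);
}
```

For the example output above, the helpers would be called with ioaddr 0xc080 and gsi 185, and the resulting strings written to the two filter files under /sys/kernel/debug/tracing.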

Stefan


[PATCH v2 06/15] lib/x86/smp: Fix compiler warnings

2013-12-17 Thread Jan Kiszka
On 2013-12-17 11:35, Paolo Bonzini wrote:
 On 16/12/2013 10:57, Jan Kiszka wrote:
 Add missing include of desc.h for prototypes of setup_idt and
 set_idt_entry and cast away the volatile of ipi_data - it's not volatile
 while we run the IPI handler.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 
 The right fix is to change the declaration from
 
   static volatile void *ipi_data;  // Pointer to volatile void
 
 to
 
   static void *volatile ipi_data;  // Volatile pointer to void

Indeed, find v2 below.

Jan

---8<---

Add missing include of desc.h for prototypes of setup_idt and
set_idt_entry and adjust type of ipi_data to what it should actually be:
a volatile pointer to void.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 lib/x86/smp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/x86/smp.c b/lib/x86/smp.c
index d4c8106..1eb49f2 100644
--- a/lib/x86/smp.c
+++ b/lib/x86/smp.c
@@ -3,6 +3,7 @@
 #include "smp.h"
 #include "apic.h"
 #include "fwcfg.h"
+#include "desc.h"

 #define IPI_VECTOR 0x20

@@ -10,7 +11,7 @@ typedef void (*ipi_function_type)(void *data);

 static struct spinlock ipi_lock;
 static volatile ipi_function_type ipi_function;
-static volatile void *ipi_data;
+static void *volatile ipi_data;
 static volatile int ipi_done;
 static volatile bool ipi_wait;
 static int _cpu_count;
-- 
1.8.1.1.298.ge7eed54



Re: About preemption timer

2013-12-17 Thread Paolo Bonzini
On 17/12/2013 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,
 
 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.
 
 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

Does the preemption timer testcase pass on your machine?  The preemption
timer is known to have bugs.

Paolo


Re: [PATCH v5] ARM/KVM: save and restore generic timer registers

2013-12-17 Thread Marc Zyngier
On 13/12/13 20:35, Andre Przywara wrote:
 On 12/13/2013 09:10 PM, Christoffer Dall wrote:
 On Fri, Dec 13, 2013 at 02:23:26PM +0100, Andre Przywara wrote:
 For migration to work we need to save (and later restore) the state of
 each core's virtual generic timer.
 Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
 the three needed registers (control, counter, compare value).
 Though they live in cp15 space, we don't use the existing list, since
 they need special accessor functions and the arch timer is optional.

 Signed-off-by: Andre Przywara andre.przyw...@linaro.org
 Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
 ---
 Changes from v1:
 - move code out of coproc.c and into guest.c and arch_timer.c
 - present the registers with their native CP15 addresses, but without
using space in the VCPU's cp15 array
 - do the user space copying in the accessor functions

 Changes from v2:
 - fix compilation without CONFIG_ARCH_TIMER
 - fix compilation for arm64 by defining the appropriate registers there
 - move userspace access out of arch_timer.c into coproc.c
 - Christoffer: removed whitespace in function declaration

 Changes from v3:
 - adapted Marc's SYSREG macro magic from kvmtool for nicer looking code

 Changes from v4:
 - remove ARM64-REG32 type, the ARM ARM defines no 32-bit system registers

   arch/arm/include/asm/kvm_host.h   |  3 ++
   arch/arm/include/uapi/asm/kvm.h   | 20 +
   arch/arm/kvm/guest.c  | 92 ++-
   arch/arm64/include/uapi/asm/kvm.h | 18 
   virt/kvm/arm/arch_timer.c | 34 +++
   5 files changed, 166 insertions(+), 1 deletion(-)

 diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
 index 8a6f6db..098f7dd 100644
 --- a/arch/arm/include/asm/kvm_host.h
 +++ b/arch/arm/include/asm/kvm_host.h
 @@ -225,4 +225,7 @@ static inline int kvm_arch_dev_ioctl_check_extension(long ext)
   int kvm_perf_init(void);
   int kvm_perf_teardown(void);

 +u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 +int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 +
   #endif /* __ARM_KVM_HOST_H__ */
 diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
 index c498b60..835b867 100644
 --- a/arch/arm/include/uapi/asm/kvm.h
 +++ b/arch/arm/include/uapi/asm/kvm.h
 @@ -119,6 +119,26 @@ struct kvm_arch_memory_slot {
   #define KVM_REG_ARM_32_CRN_MASK   0x7800
   #define KVM_REG_ARM_32_CRN_SHIFT  11

 +#define ARM_CP15_REG_SHIFT_MASK(x,n) \
 +   (((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK)
 +
 +#define __ARM_CP15_REG(op1,crn,crm,op2) \
 +   (KVM_REG_ARM | (15 << KVM_REG_ARM_COPROC_SHIFT) | \
 +   ARM_CP15_REG_SHIFT_MASK(op1, OPC1) | \
 +   ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN) | \
 +   ARM_CP15_REG_SHIFT_MASK(crm, CRM) | \
 +   ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))
 +
 +#define ARM_CP15_REG32(...) (__ARM_CP15_REG(__VA_ARGS__) | KVM_REG_SIZE_U32)
 +
 +#define __ARM_CP15_REG64(op1,crm) \
 +   (__ARM_CP15_REG(op1, 0, crm, 0) | KVM_REG_SIZE_U64)
 +#define ARM_CP15_REG64(...) __ARM_CP15_REG64(__VA_ARGS__)
 +
 +#define KVM_REG_ARM_TIMER_CTL  ARM_CP15_REG32(0, 14, 3, 1)
 +#define KVM_REG_ARM_TIMER_CNT  ARM_CP15_REG64(1, 14)
 +#define KVM_REG_ARM_TIMER_CVAL ARM_CP15_REG64(3, 14)
 +
   /* Normal registers are mapped as coprocessor 16. */
   #define KVM_REG_ARM_CORE  (0x0010 << KVM_REG_ARM_COPROC_SHIFT)
   #define KVM_REG_ARM_CORE_REG(name) (offsetof(struct kvm_regs, name) / 4)
 diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
 index 20f8d97..2786eae 100644
 --- a/arch/arm/kvm/guest.c
 +++ b/arch/arm/kvm/guest.c
 @@ -109,6 +109,83 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 return -EINVAL;
   }

 +#ifndef CONFIG_KVM_ARM_TIMER
 +
 +#define NUM_TIMER_REGS 0
 +
 +static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 +{
 +   return 0;
 +}
 +
 +static bool is_timer_reg(u64 index)
 +{
 +   return false;
 +}
 +
 +int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
 +{
 +   return 0;
 +}
 +
 +u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 +{
 +   return 0;
 +}
 +
 +#else
 +
 +#define NUM_TIMER_REGS 3
 +
 +static bool is_timer_reg(u64 index)
 +{
 +   switch (index) {
 +   case KVM_REG_ARM_TIMER_CTL:
 +   case KVM_REG_ARM_TIMER_CNT:
 +   case KVM_REG_ARM_TIMER_CVAL:
 +   return true;
 +   }
 +   return false;
 +}
 +
 +static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 +{
 +   if (put_user(KVM_REG_ARM_TIMER_CTL, uindices))
 +   return -EFAULT;
 +   uindices++;
 +   if (put_user(KVM_REG_ARM_TIMER_CNT, uindices))
 +   return -EFAULT;
 +   uindices++;
 +   if (put_user(KVM_REG_ARM_TIMER_CVAL, uindices))
 +   return -EFAULT;
 +
 +   return 0;
 +}
 +
 +#endif
 +
 +static int set_timer_reg(struct 

Re: How to get to know vcpu status from outside

2013-12-17 Thread Paolo Bonzini
On 17/12/2013 07:11, Arthur Chunqi Li wrote:
 Hi Paolo,
 
 Since VCPU is managed the same as a process in kernel, how can I know
 the status (running, sleeping etc.) of a vcpu in kernel? Is there a
 variant in struct kvm_vcpu or something else indicate this?

waitqueue_active(&vcpu->wq) means that the VCPU is sleeping in the
kernel (i.e. in a halted state).

vcpu->mode == IN_GUEST_MODE means that the VCPU is running.

Anything else means that the host is running some kind of glue code
(either kernel or userspace).

 Besides, if vcpu1 is running on pcpu1, and a kernel thread running on
 pcpu0. Can the kernel thread send a message to force vcpu1 trap to
 VMM? How can I do this?

Yes, with kvm_vcpu_kick.  KVM tracks internally which pcpu will run the
vcpu in vcpu->cpu, and kvm_vcpu_kick sends either a wakeup (if the vcpu
is sleeping) or an IPI (if it is running).

Paolo
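
The three cases described above can be sketched as a small user-space model. This is not kernel code: the struct and its fields are hypothetical stand-ins for the results of waitqueue_active() on the vcpu's wait queue and for the vcpu's mode field, and only the two mode names are borrowed from KVM.

```c
#include <stdbool.h>

/* Stand-in for the kernel's vcpu mode values (only two are modeled). */
enum demo_vcpu_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

/* Hypothetical snapshot of the two pieces of state named in the reply. */
struct demo_vcpu {
    bool waitqueue_active;     /* models waitqueue_active() on the vcpu wq */
    enum demo_vcpu_mode mode;  /* models the vcpu's mode field */
};

enum demo_vcpu_state {
    DEMO_HALTED,     /* sleeping in the kernel */
    DEMO_IN_GUEST,   /* running guest code */
    DEMO_HOST_GLUE   /* host kernel or userspace glue code */
};

/* Applies the three rules in the order they are given in the reply. */
static enum demo_vcpu_state classify(const struct demo_vcpu *v)
{
    if (v->waitqueue_active)
        return DEMO_HALTED;
    if (v->mode == IN_GUEST_MODE)
        return DEMO_IN_GUEST;
    return DEMO_HOST_GLUE;
}
```

In real kernel code both values can change right after they are sampled, so a result like this is only a momentary observation, not something to synchronize on.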



Re: About preemption timer

2013-12-17 Thread Jan Kiszka
On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,
 
 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.
 
 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.
 
 Could u help me explain this phenomenon?

Are you on a CPU that has non-broken preemption timer support? Anything
prior to Haswell is known to tick with arbitrary frequencies.

BTW, we will have to re-implement preemption timer support with the help
of a regular host timer due to the breakage when halting L2 (see my test
case).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


Re: About preemption timer

2013-12-17 Thread Arthur Chunqi Li
Hi Jan,

On Tue, Dec 17, 2013 at 7:21 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,

 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.

 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

 Could u help me explain this phenomenon?

 Are you on a CPU that has non-broken preemption timer support? Anything
 prior Haswell is known to tick with arbitrary frequencies.

My CPU is Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz.

Besides, what do you mean by arbitrary frequencies?

Arthur

 BTW, we will have to re-implement preemption timer support with the help
 of a regular host timer due to the breakage when halting L2 (see my test
 case).

 Jan

 --
 Siemens AG, Corporate Technology, CT RTC ITP SES-DE
 Corporate Competence Center Embedded Linux



-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


Re: How to get to know vcpu status from outside

2013-12-17 Thread Arthur Chunqi Li
Hi Paolo,

Thanks very much. And...(see below)

On Tue, Dec 17, 2013 at 7:21 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 17/12/2013 07:11, Arthur Chunqi Li wrote:
 Hi Paolo,

 Since VCPU is managed the same as a process in kernel, how can I know
 the status (running, sleeping etc.) of a vcpu in kernel? Is there a
 variant in struct kvm_vcpu or something else indicate this?

 waitqueue_active(&vcpu->wq) means that the VCPU is sleeping in the
 kernel (i.e. in a halted state).

 vcpu->mode == IN_GUEST_MODE means that the VCPU is running.

 Anything else means that the host is running some kind of glue code
 (either kernel or userspace).

Another question, about the scheduler. When I have 4 vcpus and the
workload of the VM is low, I noticed that it tends to activate only 1
or 2 vcpus. Does this mean the other 2 vcpus are scheduled out or put
to sleep?


 Besides, if vcpu1 is running on pcpu1, and a kernel thread running on
 pcpu0. Can the kernel thread send a message to force vcpu1 trap to
 VMM? How can I do this?

 Yes, with kvm_vcpu_kick.  KVM tracks internally which pcpu will run the
 vcpu in vcpu->cpu, and kvm_vcpu_kick sends either a wakeup (if the vcpu
 is sleeping) or an IPI (if it is running).

What does the vcpu do when kvm_vcpu_kick(vcpu) is called? What is the
exit_reason of the kicked vcpu?


 Paolo


Besides, can I pin a vcpu to a pcpu? That is to say, can I dedicate a
pcpu to a single vcpu, so that the pcpu runs only this vcpu?


Thanks,
Arthur

-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


Re: How to get to know vcpu status from outside

2013-12-17 Thread Paolo Bonzini
On 17/12/2013 12:43, Arthur Chunqi Li wrote:
 Hi Paolo,
 
 Thanks very much. And...(see below)
 
 On Tue, Dec 17, 2013 at 7:21 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 17/12/2013 07:11, Arthur Chunqi Li wrote:
 Hi Paolo,

 Since VCPU is managed the same as a process in kernel, how can I know
 the status (running, sleeping etc.) of a vcpu in kernel? Is there a
 variant in struct kvm_vcpu or something else indicate this?

 waitqueue_active(&vcpu->wq) means that the VCPU is sleeping in the
 kernel (i.e. in a halted state).

 vcpu->mode == IN_GUEST_MODE means that the VCPU is running.

 Anything else means that the host is running some kind of glue code
 (either kernel or userspace).
 
 Another question about scheduler. When I have 4 vcpus and the workload
 of VM is low, and I noticed that it tends to activate only 1 or 2
 vcpus. Does this mean the other 2 vcpus are scheduled out or into
 sleeping status?

This depends on what the guest scheduler is doing.  The other 2 VCPUs
are probably running for so little time (a few microseconds every
1/100th of a second) that you do not see them, and they stay halted the
rest of the time.

Remember that KVM has no scheduler of its own.  What you see is the
combined result of the guest and host schedulers.

 Besides, if vcpu1 is running on pcpu1, and a kernel thread running on
 pcpu0. Can the kernel thread send a message to force vcpu1 trap to
 VMM? How can I do this?

 Yes, with kvm_vcpu_kick.  KVM tracks internally which pcpu will run the
 vcpu in vcpu->cpu, and kvm_vcpu_kick sends either a wakeup (if the vcpu
 is sleeping) or an IPI (if it is running).
 
 What is vcpu's action if kvm_vcpu_kick(vcpu)? What is the exit_reason
 of the kicked vcpu?

No exit reason, you just get a lightweight exit to the host kernel.  If
you want a userspace exit, you'd need to set a bit in vcpu->requests
before kvm_vcpu_kick (which you can do best with kvm_make_request), and
change that to a userspace exit in vcpu_enter_guest.  There's already an
example of that, search arch/x86/kvm/x86.c for KVM_REQ_TRIPLE_FAULT.

Paolo
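
The request-then-kick pattern described above can be modeled in user space as follows. This is a sketch under stated assumptions, not the KVM implementation: every name here is hypothetical, the request bit number is made up, and the real kick involves a wakeup or IPI rather than a flag.

```c
#include <stdbool.h>

/* Hypothetical request bit; KVM defines its own KVM_REQ_* numbers. */
#define DEMO_REQ_USER_EXIT 0

struct req_vcpu {
    unsigned long requests;  /* models the vcpu's pending-request bitmask */
    bool kicked;             /* models "a kick is pending" */
};

/* Models setting a request bit before the kick. */
static void demo_make_request(int req, struct req_vcpu *v)
{
    v->requests |= 1UL << req;
}

/* Models the kick: forces the vcpu back through its entry path. */
static void demo_kick(struct req_vcpu *v)
{
    v->kicked = true;
}

/* Models the check done on the way back into the guest. Returns true when
 * the pending request should be turned into an exit to userspace, and false
 * for a lightweight exit that simply re-enters the guest. */
static bool demo_enter_guest(struct req_vcpu *v)
{
    bool to_userspace = false;

    if (v->requests & (1UL << DEMO_REQ_USER_EXIT)) {
        v->requests &= ~(1UL << DEMO_REQ_USER_EXIT);
        to_userspace = true;
    }
    v->kicked = false;
    return to_userspace;
}
```

A kick on its own leaves no trace for userspace; only the combination of a request bit and a kick produces the userspace exit in this model, which is the distinction the reply draws.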


Re: About preemption timer

2013-12-17 Thread Jan Kiszka
On 2013-12-17 12:31, Arthur Chunqi Li wrote:
 Hi Jan,
 
 On Tue, Dec 17, 2013 at 7:21 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,

 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.

 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

 Could u help me explain this phenomenon?

 Are you on a CPU that has non-broken preemption timer support? Anything
 prior Haswell is known to tick with arbitrary frequencies.
 
 My CPU is Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz.

Hmm, this one seems unaffected. Didn't find a specification update.
Just like Paolo asked: Your original test case passes?

 
 Besides, what do you mean by arbitrary frequencies?

On older CPUs, the tick rate of the preemption timer does not correlate
with the TSC, definitely not in the way the spec defined.

Back to your original question: Are we talking about native use of the
preemption timer via a patched KVM or nested use inside a KVM virtual
machine?

Jan
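
For reference, the relation the spec defines can be sketched as below. The Intel SDM states that the preemption timer counts down by 1 each time bit X of the TSC changes, with X reported in bits 4:0 of the IA32_VMX_MISC MSR, so a timer value expires after value * 2^X TSC ticks. The function names are hypothetical, and the shift of 5 and the 2.4 GHz TSC rate used in the note afterwards are illustrative assumptions, not values measured on the machine in this thread.

```c
#include <stdint.h>

/* TSC ticks until the timer reaches zero: one decrement per 2^shift ticks,
 * where shift is the value read from IA32_VMX_MISC bits 4:0. */
static uint64_t preempt_timer_tsc_ticks(uint32_t timer_value, unsigned shift)
{
    return (uint64_t)timer_value << shift;
}

/* Expected time to expiry in seconds for a TSC running at tsc_hz. */
static double preempt_timer_seconds(uint32_t timer_value, unsigned shift,
                                    double tsc_hz)
{
    return (double)preempt_timer_tsc_ticks(timer_value, shift) / tsc_hz;
}
```

With a timer value of 100, an assumed shift of 5 and an assumed 2.4 GHz TSC, this gives 3200 TSC ticks, or roughly 1.3 microseconds of guest run time per expiry on hardware that follows the spec.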

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


Re: About preemption timer

2013-12-17 Thread Arthur Chunqi Li
On Tue, Dec 17, 2013 at 8:43 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 12:31, Arthur Chunqi Li wrote:
 Hi Jan,

 On Tue, Dec 17, 2013 at 7:21 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,

 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.

 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

 Could u help me explain this phenomenon?

 Are you on a CPU that has non-broken preemption timer support? Anything
 prior Haswell is known to tick with arbitrary frequencies.

 My CPU is Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz.

 Hmm, this one seems unaffected. Didn't find a specification update.
 Just like Paolo asked: Your original test case passes?


 Besides, what do you mean by arbitrary frequencies?

 On older CPUs, the tick rate of the preemption timer does not correlate
 with the TSC, definitely not in the way the spec defined.

 Back to your original question: Are we talking about native use of the
 preemption timer via a patched KVM or nested use inside a KVM virtual
 machine?

It is about the native use. I think it may be due to scheduling. When
a vcpu is scheduled out of its pcpu, does the preemption timer still run?

Oh, another thing: I am using the released 3.11 kernel, not the latest
one. Does this matter?

Arthur


 Jan

 --
 Siemens AG, Corporate Technology, CT RTC ITP SES-DE
 Corporate Competence Center Embedded Linux



-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


Re: [RFC][PATCH] KVM: nVMX: Leave VMX mode on apparent CPU reset

2013-12-17 Thread Paolo Bonzini
On 16/12/2013 10:32, Jan Kiszka wrote:
 As long as we do not expose all the VMX related states to user space,
 there is no way to properly reset a VCPU when VMX is enabled. Emulate
 this for now by catching host-side clearings of the feature control MSR.
 This allows to reboot a VM while it is running some hypervisor code.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
 
 Better ideas? Or continue to leave it as it is?

The final vmx_vcpu_reset is the only really ugly part, but it is
_really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
vmx_reset_nested to just

    if (is_guest_mode(vcpu))
        nested_vmx_vmexit(vcpu);

    free_nested(vmx);

?

Paolo


Re: [RFC][PATCH] KVM: nVMX: Leave VMX mode on apparent CPU reset

2013-12-17 Thread Jan Kiszka
On 2013-12-17 14:25, Paolo Bonzini wrote:
 On 16/12/2013 10:32, Jan Kiszka wrote:
 As long as we do not expose all the VMX related states to user space,
 there is no way to properly reset a VCPU when VMX is enabled. Emulate
 this for now by catching host-side clearings of the feature control MSR.
 This allows to reboot a VM while it is running some hypervisor code.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---

 Better ideas? Or continue to leave it as it is?
 
 The final vmx_vcpu_reset is the only really ugly part, but it is
 _really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
 vmx_reset_nested to just
 
   if (is_guest_mode(vcpu))
   nested_vmx_vmexit(vcpu);
 
   free_nested(vmx);
 
 ?

Well, I could make setting MSR_IA32_FEATURE_CONTROL to 0 an official
"clear VMX" interface. Then QEMU would have to issue this MSR set
request before doing any other CPU state manipulation. Is that what you
have in mind?
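
The ordering constraint discussed here can be illustrated with a toy model. This is only a sketch under stated assumptions: the struct and the toy_* functions are hypothetical and do not exist in KVM or QEMU, and the model simply assumes that clearing the feature control MSR while in nested mode leaves the register state undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a VCPU; field names are illustrative, not KVM's. */
struct toy_vcpu {
    bool in_nested_mode;
    bool regs_valid;
    uint64_t feature_control;
    uint64_t rip;
};

/* Models the kernel side: clearing the MSR forces the VCPU out of
 * nested mode and leaves its register state undefined. */
static void toy_set_feature_control(struct toy_vcpu *v, uint64_t data)
{
    v->feature_control = data;
    if (data == 0 && v->in_nested_mode) {
        v->in_nested_mode = false;
        v->regs_valid = false;
    }
}

/* Models userspace writing the remaining register state. */
static void toy_set_regs(struct toy_vcpu *v, uint64_t rip)
{
    v->rip = rip;
    v->regs_valid = true;
}

/* Reset in the proposed order: feature control MSR first, then the rest. */
static void toy_reset(struct toy_vcpu *v, uint64_t reset_rip)
{
    toy_set_feature_control(v, 0);
    toy_set_regs(v, reset_rip);
}

/* Entering the guest requires a defined register state. */
static uint64_t toy_run(const struct toy_vcpu *v)
{
    assert(v->regs_valid);
    return v->rip;
}
```

Writing the registers first and the MSR second would leave regs_valid false in this model, which is the state loss the proposed ordering avoids.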

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


Re: [RFC][PATCH] KVM: nVMX: Leave VMX mode on apparent CPU reset

2013-12-17 Thread Paolo Bonzini
On 17/12/2013 15:40, Jan Kiszka wrote:
  The final vmx_vcpu_reset is the only really ugly part, but it is
  _really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
  vmx_reset_nested to just
  
 if (is_guest_mode(vcpu))
 nested_vmx_vmexit(vcpu);
  
 free_nested(vmx);
  
  ?
 Well, I could make setting of MSR_IA32_FEATURE_CONTROL to 0 an official
 clear VMX interface. Then QEMU would have to issue this MSR set
 request before doing any other CPU state manipulation. Is that what you
 have in mind?

Yes, that was the idea.

Paolo


Re: About preemption timer

2013-12-17 Thread Jan Kiszka
On 2013-12-17 13:59, Arthur Chunqi Li wrote:
 On Tue, Dec 17, 2013 at 8:43 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 12:31, Arthur Chunqi Li wrote:
 Hi Jan,

 On Tue, Dec 17, 2013 at 7:21 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,

 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.

 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

 Could u help me explain this phenomenon?

 Are you on a CPU that has non-broken preemption timer support? Anything
 prior Haswell is known to tick with arbitrary frequencies.

 My CPU is Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz.

 Hmm, this one seems unaffected. Didn't find a specification update.
 Just like Paolo asked: Your original test case passes?


 Besides, what do you mean by arbitrary frequencies?

 On older CPUs, the tick rate of the preemption timer does not correlate
 with the TSC, definitely not in the way the spec defined.

 Back to your original question: Are we talking about native use of the
 preemption timer via a patched KVM or nested use inside a KVM virtual
 machine?
 
 It is about the native use. I think it may due to the scheduling. When
 vcpu is scheduled out of pcpu, will the preemption timer work still?

The preemption timer ticks only while the guest is running; it should be
specified like this as well. So your KVM patch needs to account for this
if you want the timer to expire based on real time rather than guest
time. That is in fact similar to the adjustments you implemented for the
emulation of the preemption timer.
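
A minimal sketch of that adjustment, assuming the timer counts down once every 2^shift TSC cycles (the shift being the value reported in IA32_VMX_MISC bits 4:0) and only while the guest runs. The struct and function names are hypothetical and not taken from KVM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vCPU bookkeeping; not a KVM structure. */
struct ptimer {
    uint32_t value;   /* remaining preemption-timer ticks */
    unsigned shift;   /* timer ticks once every 2^shift TSC cycles */
};

/* Number of timer ticks that correspond to tsc_delta TSC cycles. */
static uint64_t ptimer_ticks(const struct ptimer *t, uint64_t tsc_delta)
{
    assert(t->shift < 32);
    return tsc_delta >> t->shift;
}

/*
 * Charge TSC cycles spent outside the guest (for example while the vCPU
 * was scheduled out) against the timer, so that it follows real time
 * rather than guest time. Returns 1 if the timer has expired.
 */
static int ptimer_charge_host_time(struct ptimer *t, uint64_t tsc_delta)
{
    uint64_t ticks = ptimer_ticks(t, tsc_delta);

    if (ticks >= t->value) {
        t->value = 0;
        return 1;
    }
    t->value -= (uint32_t)ticks;
    return 0;
}
```

The value left in the struct would then be what gets written back to the timer field before the next guest entry.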

Jan

 
 Oh, another problem, I use the released kernel 3.11, not the latest
 one. Does this matter?
 
 Arthur
 

 Jan

 --
 Siemens AG, Corporate Technology, CT RTC ITP SES-DE
 Corporate Competence Center Embedded Linux
 
 
 

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


[PATCH 0/2] kvmtool: Enable overriding Generic Timer frequency

2013-12-17 Thread Robin Murphy
This patch series allows (but discourages) overriding the Generic Timer
frequency for device tree-based guest OSes, to work around systems with
broken secure firmware that fails to program CNTFRQ correctly.

Robin Murphy (2):
  kvmtool: Support unsigned int options
  kvmtool/arm: Add option to override Generic Timer frequency

 tools/kvm/arm/include/arm-common/kvm-config-arch.h |   15 ++-
 tools/kvm/arm/timer.c  |2 ++
 tools/kvm/include/kvm/parse-options.h  |9 +
 3 files changed, 21 insertions(+), 5 deletions(-)

-- 
1.7.9.5




[PATCH 1/2] kvmtool: Support unsigned int options

2013-12-17 Thread Robin Murphy
Add support for unsigned int command-line options by implementing the
OPT_UINTEGER macro.
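
For illustration, a sketch of how an unsigned int option value could be parsed and range-checked. The helper parse_uint_option is hypothetical and is not kvmtool's actual option parser; it only shows the kind of validation such an option type needs.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse arg as an unsigned int (decimal, octal or
 * hex). Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int parse_uint_option(const char *arg, unsigned int *value)
{
    char *end;
    unsigned long parsed;

    assert(value != NULL);

    /* Require a leading digit: rejects "", "-1" and leading whitespace. */
    if (arg == NULL || !isdigit((unsigned char)*arg))
        return -1;

    errno = 0;
    parsed = strtoul(arg, &end, 0);
    if (errno != 0 || *end != '\0' || parsed > UINT_MAX)
        return -1;

    *value = (unsigned int)parsed;
    return 0;
}
```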

Signed-off-by: Robin Murphy robin.mur...@arm.com
Acked-by: Will Deacon will.dea...@arm.com
---
 tools/kvm/include/kvm/parse-options.h |9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/kvm/include/kvm/parse-options.h b/tools/kvm/include/kvm/parse-options.h
index 09a5fca..b03f151 100644
--- a/tools/kvm/include/kvm/parse-options.h
+++ b/tools/kvm/include/kvm/parse-options.h
@@ -109,6 +109,15 @@ struct option {
 	.help = (h)			\
 }
 
+#define OPT_UINTEGER(s, l, v, h)	\
+{					\
+	.type = OPTION_UINTEGER,	\
+	.short_name = (s),		\
+	.long_name = (l),		\
+	.value = check_vtype(v, unsigned int *), \
+	.help = (h)			\
+}
+
 #define OPT_U64(s, l, v, h)		\
 {					\
 	.type = OPTION_U64,		\
-- 
1.7.9.5




[PATCH uq/master] kvm: x86: Separately write feature control MSR on reset

2013-12-17 Thread Jan Kiszka
If the guest is running in nested mode on system reset, clearing the
feature MSR signals the kernel to leave this mode. Recent kernels
process this properly but leave the VCPU state undefined afterwards. It
is the job of userspace to bring it into a proper shape. Therefore, write
this specific MSR first so that no state transfer gets lost.

This allows a guest with VMX in use to be reset cleanly.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/kvm.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1188482..ec51447 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1104,6 +1104,25 @@ static int kvm_put_tscdeadline_msr(X86CPU *cpu)
     return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, &msr_data);
 }
 
+/*
+ * Provide a separate write service for the feature control MSR in order to
+ * kick the VCPU out of VMXON or even guest mode on reset. This has to be done
+ * before writing any other state because forcibly leaving nested mode
+ * invalidates the VCPU state.
+ */
+static int kvm_put_msr_feature_control(X86CPU *cpu)
+{
+    struct {
+        struct kvm_msrs info;
+        struct kvm_msr_entry entry;
+    } msr_data;
+
+    kvm_msr_entry_set(&msr_data.entry, MSR_IA32_FEATURE_CONTROL,
+                      cpu->env.msr_ia32_feature_control);
+    msr_data.info.nmsrs = 1;
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, &msr_data);
+}
+
 static int kvm_put_msrs(X86CPU *cpu, int level)
 {
     CPUX86State *env = &cpu->env;
@@ -1204,10 +1223,8 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
         if (cpu->hyperv_vapic) {
             kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
         }
-        if (has_msr_feature_control) {
-            kvm_msr_entry_set(&msrs[n++], MSR_IA32_FEATURE_CONTROL,
-                              env->msr_ia32_feature_control);
-        }
+        /* Note: MSR_IA32_FEATURE_CONTROL is written separately, see
+         *       kvm_put_msr_feature_control. */
     }
     if (env->mcg_cap) {
         int i;
@@ -1801,6 +1818,13 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+    if (level >= KVM_PUT_RESET_STATE && has_msr_feature_control) {
+        ret = kvm_put_msr_feature_control(x86_cpu);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     ret = kvm_getput_regs(x86_cpu, 1);
     if (ret < 0) {
         return ret;
-- 
1.8.1.1.298.ge7eed54


[PATCH] KVM: nVMX: Leave VMX mode on clearing of feature control MSR

2013-12-17 Thread Jan Kiszka
When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave
root and non-root mode, fully disabling VMX. The register state of the
VCPU is undefined after this step, so userspace has to set it to a
proper state afterward.

This makes it possible to reboot a VM while it is running some hypervisor code.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---

Even without a QEMU patch, this already enables system reset: the guest
is left in such a broken state that it simply triple-faults and resets
twice. Nevertheless, a QEMU patch will follow.

 arch/x86/kvm/vmx.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f90320b..6a0c2fa 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2455,6 +2455,8 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	return 1;
 }
 
+static void vmx_leave_nested(struct kvm_vcpu *vcpu);
+
 static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	u32 msr_index = msr_info->index;
@@ -2470,6 +2472,8 @@ static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 				& FEATURE_CONTROL_LOCKED)
 			return 0;
 		to_vmx(vcpu)->nested.msr_ia32_feature_control = data;
+		if (host_initialized && data == 0)
+			vmx_leave_nested(vcpu);
 		return 1;
 	}
 
@@ -8488,6 +8492,16 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu)
 }
 
 /*
+ * Forcibly leave nested mode in order to be able to reset the VCPU later on.
+ */
+static void vmx_leave_nested(struct kvm_vcpu *vcpu)
+{
+	if (is_guest_mode(vcpu))
+		nested_vmx_vmexit(vcpu);
+	free_nested(to_vmx(vcpu));
+}
+
+/*
  * L1's failure to enter L2 is a subset of a normal exit, as explained in
  * 23.7 "VM-entry failures during or after loading guest state" (this also
  * lists the acceptable exit-reason and exit-qualification parameters).
-- 
1.8.1.1.298.ge7eed54


[PATCH 2/2] kvmtool/arm: Add option to override Generic Timer frequency

2013-12-17 Thread Robin Murphy
Some platforms have secure firmware which does not correctly set the
CNTFRQ register on boot, preventing the use of the Generic Timer.
This patch allows mirroring the necessary host workaround by specifying
the clock-frequency property in the guest DT.

This should only be considered a means of KVM bring-up on such systems,
such that vendors may be convinced to properly implement their firmware
to support the virtualisation capabilities of their hardware.
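
As an illustration of what the override amounts to: a device tree property cell is a 32-bit big-endian value, and the property is emitted only when a non-zero frequency was requested. The sketch below uses hypothetical helper names (dt_encode_cell, dt_clock_frequency_prop); it is not kvmtool or libfdt code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one 32-bit device tree cell in big-endian byte order. */
static void dt_encode_cell(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/*
 * Hypothetical helper: produce the payload of a "clock-frequency"
 * property only if an override was given. Returns the number of bytes
 * written to out, i.e. 0 when no override was requested.
 */
static size_t dt_clock_frequency_prop(uint32_t force_cntfrq, uint8_t out[4])
{
    assert(out != NULL);

    if (force_cntfrq == 0)
        return 0;

    dt_encode_cell(force_cntfrq, out);
    return 4;
}
```

With an override of 24000000 Hz, for example, the payload bytes would be 01 6e 36 00.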

Signed-off-by: Robin Murphy robin.mur...@arm.com
Acked-by: Will Deacon will.dea...@arm.com
---
 tools/kvm/arm/include/arm-common/kvm-config-arch.h |   15 ++-
 tools/kvm/arm/timer.c  |2 ++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/kvm/arm/include/arm-common/kvm-config-arch.h b/tools/kvm/arm/include/arm-common/kvm-config-arch.h
index 7ac6f6e..f3baf39 100644
--- a/tools/kvm/arm/include/arm-common/kvm-config-arch.h
+++ b/tools/kvm/arm/include/arm-common/kvm-config-arch.h
@@ -5,13 +5,18 @@
 
 struct kvm_config_arch {
 	const char *dump_dtb_filename;
+	unsigned int force_cntfrq;
 	bool aarch32_guest;
 };
 
-#define OPT_ARCH_RUN(pfx, cfg) \
-	pfx,							\
-	ARM_OPT_ARCH_RUN(cfg)					\
-	OPT_STRING('\0', "dump-dtb", &(cfg)->dump_dtb_filename,	\
-		   ".dtb file", "Dump generated .dtb to specified file"),
+#define OPT_ARCH_RUN(pfx, cfg)							\
+	pfx,									\
+	ARM_OPT_ARCH_RUN(cfg)							\
+	OPT_STRING('\0', "dump-dtb", &(cfg)->dump_dtb_filename,			\
+		   ".dtb file", "Dump generated .dtb to specified file"),	\
+	OPT_UINTEGER('\0', "override-bad-firmware-cntfrq", &(cfg)->force_cntfrq,\
+		     "Specify Generic Timer frequency in guest DT to "		\
+		     "work around buggy secure firmware *Firmware should be "	\
+		     "updated to program CNTFRQ correctly*"),
 
 #endif /* ARM_COMMON__KVM_CONFIG_ARCH_H */
diff --git a/tools/kvm/arm/timer.c b/tools/kvm/arm/timer.c
index bd6a0bb..d757c1d 100644
--- a/tools/kvm/arm/timer.c
+++ b/tools/kvm/arm/timer.c
@@ -33,6 +33,8 @@ void timer__generate_fdt_nodes(void *fdt, struct kvm *kvm, int *irqs)
 	_FDT(fdt_begin_node(fdt, "timer"));
 	_FDT(fdt_property(fdt, "compatible", compatible, sizeof(compatible)));
 	_FDT(fdt_property(fdt, "interrupts", irq_prop, sizeof(irq_prop)));
+	if (kvm->cfg.arch.force_cntfrq > 0)
+		_FDT(fdt_property_cell(fdt, "clock-frequency", kvm->cfg.arch.force_cntfrq));
 	_FDT(fdt_end_node(fdt));
 }
 
 
-- 
1.7.9.5




Re: [PATCH v5] ARM/KVM: save and restore generic timer registers

2013-12-17 Thread Christoffer Dall
On Tue, Dec 17, 2013 at 11:20:20AM +, Marc Zyngier wrote:
 On 13/12/13 20:35, Andre Przywara wrote:
  On 12/13/2013 09:10 PM, Christoffer Dall wrote:
  On Fri, Dec 13, 2013 at 02:23:26PM +0100, Andre Przywara wrote:
  For migration to work we need to save (and later restore) the state of
  each core's virtual generic timer.
  Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
  the three needed registers (control, counter, compare value).
  Though they live in cp15 space, we don't use the existing list, since
  they need special accessor functions and the arch timer is optional.
 
  Signed-off-by: Andre Przywara andre.przyw...@linaro.org
  Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
  ---
  Changes from v1:
  - move code out of coproc.c and into guest.c and arch_timer.c
  - present the registers with their native CP15 addresses, but without
 using space in the VCPU's cp15 array
  - do the user space copying in the accessor functions
 
  Changes from v2:
  - fix compilation without CONFIG_ARCH_TIMER
  - fix compilation for arm64 by defining the appropriate registers there
  - move userspace access out of arch_timer.c into coproc.c
  - Christoffer: removed whitespace in function declaration
 
  Changes from v3:
  - adapted Marc's SYSREG macro magic from kvmtool for nicer looking code
 
  Changes from v4:
  - remove ARM64-REG32 type, the ARM ARM defines no 32-bit system registers
 
arch/arm/include/asm/kvm_host.h   |  3 ++
arch/arm/include/uapi/asm/kvm.h   | 20 +
arch/arm/kvm/guest.c  | 92 
  ++-
arch/arm64/include/uapi/asm/kvm.h | 18 
virt/kvm/arm/arch_timer.c | 34 +++
5 files changed, 166 insertions(+), 1 deletion(-)
 
  diff --git a/arch/arm/include/asm/kvm_host.h 
  b/arch/arm/include/asm/kvm_host.h
  index 8a6f6db..098f7dd 100644
  --- a/arch/arm/include/asm/kvm_host.h
  +++ b/arch/arm/include/asm/kvm_host.h
  @@ -225,4 +225,7 @@ static inline int 
  kvm_arch_dev_ioctl_check_extension(long ext)
int kvm_perf_init(void);
int kvm_perf_teardown(void);
 
  +u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
  +int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
  +
#endif /* __ARM_KVM_HOST_H__ */
  diff --git a/arch/arm/include/uapi/asm/kvm.h 
  b/arch/arm/include/uapi/asm/kvm.h
  index c498b60..835b867 100644
  --- a/arch/arm/include/uapi/asm/kvm.h
  +++ b/arch/arm/include/uapi/asm/kvm.h
  @@ -119,6 +119,26 @@ struct kvm_arch_memory_slot {
#define KVM_REG_ARM_32_CRN_MASK 0x7800
#define KVM_REG_ARM_32_CRN_SHIFT11
 
  +#define ARM_CP15_REG_SHIFT_MASK(x,n) \
  + (((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK)
  +
  +#define __ARM_CP15_REG(op1,crn,crm,op2) \
  + (KVM_REG_ARM | (15 << KVM_REG_ARM_COPROC_SHIFT) | \
  + ARM_CP15_REG_SHIFT_MASK(op1, OPC1) | \
  + ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN) | \
  + ARM_CP15_REG_SHIFT_MASK(crm, CRM) | \
  + ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))
  +
  +#define ARM_CP15_REG32(...) (__ARM_CP15_REG(__VA_ARGS__) | KVM_REG_SIZE_U32)
  +
  +#define __ARM_CP15_REG64(op1,crm) \
  + (__ARM_CP15_REG(op1, 0, crm, 0) | KVM_REG_SIZE_U64)
  +#define ARM_CP15_REG64(...) __ARM_CP15_REG64(__VA_ARGS__)
  +
  +#define KVM_REG_ARM_TIMER_CTL  ARM_CP15_REG32(0, 14, 3, 1)
  +#define KVM_REG_ARM_TIMER_CNT  ARM_CP15_REG64(1, 14)
  +#define KVM_REG_ARM_TIMER_CVAL ARM_CP15_REG64(3, 14)
  +
   /* Normal registers are mapped as coprocessor 16. */
   #define KVM_REG_ARM_CORE (0x0010 << KVM_REG_ARM_COPROC_SHIFT)
   #define KVM_REG_ARM_CORE_REG(name) (offsetof(struct kvm_regs, name) / 4)
  diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
  index 20f8d97..2786eae 100644
  --- a/arch/arm/kvm/guest.c
  +++ b/arch/arm/kvm/guest.c
  @@ -109,6 +109,83 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu 
  *vcpu, struct kvm_regs *regs)
return -EINVAL;
}
 
  +#ifndef CONFIG_KVM_ARM_TIMER
  +
  +#define NUM_TIMER_REGS 0
  +
  +static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user 
  *uindices)
  +{
  + return 0;
  +}
  +
  +static bool is_timer_reg(u64 index)
  +{
  + return false;
  +}
  +
  +int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
  +{
  + return 0;
  +}
  +
  +u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
  +{
  + return 0;
  +}
  +
  +#else
  +
  +#define NUM_TIMER_REGS 3
  +
  +static bool is_timer_reg(u64 index)
  +{
  + switch (index) {
  + case KVM_REG_ARM_TIMER_CTL:
  + case KVM_REG_ARM_TIMER_CNT:
  + case KVM_REG_ARM_TIMER_CVAL:
  + return true;
  + }
  + return false;
  +}
  +
  +static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user 
  *uindices)
  +{
  + if (put_user(KVM_REG_ARM_TIMER_CTL, uindices))
  + return -EFAULT;
  + uindices++;
  + if (put_user(KVM_REG_ARM_TIMER_CNT, uindices))
  + return 

Re: [PATCH 2/2] kvmtool/arm: Add option to override Generic Timer frequency

2013-12-17 Thread Alexander Graf

On 17.12.2013, at 19:31, Robin Murphy robin.mur...@arm.com wrote:

 Some platforms have secure firmware which does not correctly set the
 CNTFRQ register on boot, preventing the use of the Generic Timer.
 This patch allows mirroring the necessary host workaround by specifying
 the clock-frequency property in the guest DT.
 
 This should only be considered a means of KVM bring-up on such systems,
 such that vendors may be convinced to properly implement their firmware
 to support the virtualisation capabilities of their hardware.
 
 Signed-off-by: Robin Murphy robin.mur...@arm.com
 Acked-by: Will Deacon will.dea...@arm.com

How does it encourage a vendor to properly implement their firmware if there's 
a workaround?


Alex



Re: About preemption timer

2013-12-17 Thread R
Hi,

You must adjust the preemption timer according to the time that elapsed
while the guest was running.

Also register an hrtimer whenever the guest exits. That timer should
fire when the remaining time runs out.

2013/12/17 Jan Kiszka jan.kis...@siemens.com:
 On 2013-12-17 13:59, Arthur Chunqi Li wrote:
 On Tue, Dec 17, 2013 at 8:43 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 12:31, Arthur Chunqi Li wrote:
 Hi Jan,

 On Tue, Dec 17, 2013 at 7:21 PM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2013-12-17 10:32, Arthur Chunqi Li wrote:
 Hi Jan and Paolo,

 I've tried to use preemption timer in KVM to trap vcpu regularly, but
 there's something unexpected. I run a VM with 4 vcpus and give them
 the same preemption timer value (e.g. 100) with all bits set
 (activate/save bits), then reset the value in preemption time-out
 handler.

 Thus I expected these vcpus trap regularly in some special turns. But
 I found that when the VM is not busy, some vcpus are trapped much less
 frequently than others. In Intel SDM, I noticed that preemption timer
 is only related to TSC, and I think all the vcpus should trap in a
 similar frequency.

 Could u help me explain this phenomenon?

 Are you on a CPU that has non-broken preemption timer support? Anything
 prior Haswell is known to tick with arbitrary frequencies.

 My CPU is Intel(R) Xeon(R) CPU  E5620  @ 2.40GHz.

 Hmm, this one seems unaffected. Didn't find a specification update.
 Just like Paolo asked: Your original test case passes?


 Besides, what do you mean by arbitrary frequencies?

 On older CPUs, the tick rate of the preemption timer does not correlate
 with the TSC, definitely not in the way the spec defined.

 Back to your original question: Are we talking about native use of the
 preemption timer via a patched KVM or nested use inside a KVM virtual
 machine?

 It is about the native use. I think it may due to the scheduling. When
 vcpu is scheduled out of pcpu, will the preemption timer work still?

 The preemption timer ticks as long as the guest is running. Should be
 specified like this as well. So your KVM patch needs to take care of
 this when you want to expire it based on real-time, not based on guest
 time. That's in fact similar to adjustments you implemented for the
 emulation of the preemption timer.

 Jan


 Oh, another problem, I use the released kernel 3.11, not the latest
 one. Does this matter?

 Arthur


 Jan

 --
 Siemens AG, Corporate Technology, CT RTC ITP SES-DE
 Corporate Competence Center Embedded Linux




 --
 Siemens AG, Corporate Technology, CT RTC ITP SES-DE
 Corporate Competence Center Embedded Linux
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Thanks
Rui Wu


[PATCH v2] arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive

2013-12-17 Thread Christoffer Dall
The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
waitqueue for the corresponding CPU is not active.  This does not seem
correct, since KVM should not care what the specific thread is doing,
for example, user space may not have called KVM_RUN on this VCPU yet or
the thread may be busy looping to user space because it received a
signal; this is really up to the user space implementation.  Instead we
should check specifically that the CPU is marked as being turned off,
regardless of the VCPU thread state, and if it is, we shall
simply clear the pause flag on the CPU and wake up the thread if it
happens to be blocked for us.

Further, the implementation seems to be racy when executing multiple
VCPU threads.  There really isn't a reasonable user space programming
scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
before turning on the boot CPU.

Therefore, set the pause flag on the vcpu at VCPU init time (which can
reasonably be expected to be completed for all CPUs by user space before
running any VCPUs) and clear both this flag and the feature (in case the
feature can somehow get set again in the future) and ping the waitqueue
on turning on a VCPU using PSCI.
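
The intended semantics can be summarized with a toy model. This is only a sketch: the struct, the functions and the return codes are hypothetical stand-ins, not the kernel's definitions, and the wake-up of the VCPU thread is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative return codes; not the kernel's KVM_PSCI_RET_* values. */
enum { TOY_PSCI_SUCCESS = 0, TOY_PSCI_INVAL = -2 };

/* Toy VCPU with only the state relevant to CPU_ON. */
struct toy_psci_vcpu {
    bool pause;         /* true while the VCPU is powered off */
    unsigned long pc;   /* entry point set by CPU_ON */
};

/* VCPU init: a VCPU created with the power-off feature starts paused. */
static void toy_vcpu_init(struct toy_psci_vcpu *v, bool power_off)
{
    assert(v != NULL);
    v->pause = power_off;
    v->pc = 0;
}

/*
 * CPU_ON: valid only for an existing VCPU that is currently off. What
 * the VCPU thread happens to be doing is deliberately not consulted.
 */
static int toy_psci_cpu_on(struct toy_psci_vcpu *target, unsigned long entry)
{
    if (target == NULL || !target->pause)
        return TOY_PSCI_INVAL;

    target->pc = entry;
    target->pause = false;  /* real code would now wake the VCPU thread */
    return TOY_PSCI_SUCCESS;
}
```

A second CPU_ON on the same VCPU fails in this model because the pause flag has already been cleared, which matches the "CPU is marked as being turned off" check described above.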

Reported-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
Changes[v2]:
 - Use non-atomic version of test_and_clear_bit instead
 - Check if vcpu is paused and return KVM_PSCI_RET_INVAL if not
 - Remove unnecessary feature bit clear

 arch/arm/kvm/arm.c  | 30 +++---
 arch/arm/kvm/psci.c | 11 ++-
 2 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 2a700e0..151eb91 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -478,15 +478,6 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 			return ret;
 	}
 
-	/*
-	 * Handle the start in power-off case by calling into the
-	 * PSCI code.
-	 */
-	if (test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) {
-		*vcpu_reg(vcpu, 0) = KVM_PSCI_FN_CPU_OFF;
-		kvm_psci_call(vcpu);
-	}
-
 	return 0;
 }
 
@@ -700,6 +691,24 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
 	return -EINVAL;
 }
 
+static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
+					 struct kvm_vcpu_init *init)
+{
+	int ret;
+
+	ret = kvm_vcpu_set_target(vcpu, init);
+	if (ret)
+		return ret;
+
+	/*
+	 * Handle the start in power-off case by marking the VCPU as paused.
+	 */
+	if (__test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
+		vcpu->arch.pause = true;
+
+	return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -713,8 +722,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		if (copy_from_user(&init, argp, sizeof(init)))
 			return -EFAULT;
 
-		return kvm_vcpu_set_target(vcpu, &init);
-
+		return kvm_arch_vcpu_ioctl_vcpu_init(vcpu, &init);
 	}
 	case KVM_SET_ONE_REG:
 	case KVM_GET_ONE_REG: {
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 0881bf1..448f60e 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -54,15 +54,15 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 		}
 	}
 
-	if (!vcpu)
+	/*
+	 * Make sure the caller requested a valid CPU and that the CPU is
+	 * turned off.
+	 */
+	if (!vcpu || !vcpu->arch.pause)
 		return KVM_PSCI_RET_INVAL;
 
 	target_pc = *vcpu_reg(source_vcpu, 2);
 
-	wq = kvm_arch_vcpu_wq(vcpu);
-	if (!waitqueue_active(wq))
-		return KVM_PSCI_RET_INVAL;
-
 	kvm_reset_vcpu(vcpu);
 
 	/* Gracefully handle Thumb2 entry point */
@@ -79,6 +79,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	vcpu->arch.pause = false;
 	smp_mb();		/* Make sure the above is visible */
 
+	wq = kvm_arch_vcpu_wq(vcpu);
 	wake_up_interruptible(wq);
 
 	return KVM_PSCI_RET_SUCCESS;
-- 
1.8.5



[RFC PATCH] KVM: arm/arm64: Clarify KVM_ARM_VCPU_INIT api

2013-12-17 Thread Christoffer Dall
There is nothing technically or semantically wrong with calling
KVM_ARM_VCPU_INIT more than once, and even calling this on a VCPU after
the VCPU has been executed.  It just happens that user space will need a
way to reset the VCPU or put the VCPU back in PSCI power-off mode after
the VM has run, for example when driving a system reset from user space.

Clarify that it is perfectly fine to use this API for that purpose.

Cc: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 Documentation/virtual/kvm/api.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index aad3244..d813a61 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2341,6 +2341,11 @@ return ENOEXEC for that vcpu.
 Note that because some registers reflect machine topology, all vcpus
 should be created before this ioctl is invoked.
 
+Calling this a second time on a VCPU will reset the cpu registers to
+their initial values and can be used with the feature bits to change the
+CPU state, for example to put the CPU into power off mode from user
+space.
+
 Possible features:
- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
  Depends on KVM_CAP_ARM_PSCI.
-- 
1.8.5



Re: About preemption timer

2013-12-17 Thread Jan Kiszka
On 2013-12-18 04:27, R wrote:
 Hi,
 
 You must adjust the preemption timer according to the time that elapsed
 while the guest was running.
 
 Also register an hrtimer whenever the guest exits. That timer should
 fire when the remaining time runs out.

In fact, as we need to register an hrtimer anyway when leaving the
guest, we can simply register it always and stop using the physical
preemption timer. This will also solve the breakage on older Intel CPUs;
in fact, it will add preemption timer support unconditionally.
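
To arm an hrtimer in place of the hardware timer, the remaining timer value has to be converted into a real-time interval. A rough sketch of that conversion follows, assuming the timer ticks once every 2^shift TSC cycles and that the TSC frequency is known in kHz. The function name and parameters are hypothetical and not taken from KVM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a preemption-timer value into nanoseconds.
 *   value   - remaining timer ticks
 *   shift   - the timer ticks once every 2^shift TSC cycles
 *   tsc_khz - TSC frequency in kHz
 * The division is split in two to avoid overflowing 64 bits.
 */
static uint64_t ptimer_value_to_ns(uint32_t value, unsigned shift,
                                   uint32_t tsc_khz)
{
    uint64_t cycles;

    assert(shift < 32 && tsc_khz > 0);

    cycles = (uint64_t)value << shift;
    return (cycles / tsc_khz) * 1000000u +
           ((cycles % tsc_khz) * 1000000u) / tsc_khz;
}
```

For instance, on a 2.40 GHz TSC with an assumed shift of 5, a timer value of 100 corresponds to 3200 cycles, or roughly 1.3 microseconds.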

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux


Re: [PATCH] powerpc: book3s: kvm: Don't abuse host r2 in exit path

2013-12-17 Thread Aneesh Kumar K.V

Hi Alex,

Any update on this? We need this to go into 3.13.

-aneesh 

Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:

 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

 We don't use PACATOC for PR. Avoid updating HOST_R2 with PR
 KVM mode when both HV and PR are enabled in the kernel. Without this we
 get the below crash

 (qemu)
 Unable to handle kernel paging request for data at address 0x8310
 Faulting instruction address: 0xc001d5a4
 cpu 0x2: Vector: 300 (Data Access) at [c001dc53aef0]
 pc: c001d5a4: .vtime_delta.isra.1+0x34/0x1d0
 lr: c001d760: .vtime_account_system+0x20/0x60
 sp: c001dc53b170
msr: 80009032
dar: 8310
  dsisr: 4000
   current = 0xc001d76c62d0
   paca= 0xcfef1100   softe: 0irq_happened: 0x01
 pid   = 4472, comm = qemu-system-ppc
 enter ? for help
 [c001dc53b200] c001d760 .vtime_account_system+0x20/0x60
 [c001dc53b290] c008d050 .kvmppc_handle_exit_pr+0x60/0xa50
 [c001dc53b340] c008f51c kvm_start_lightweight+0xb4/0xc4
 [c001dc53b510] c008cdf0 .kvmppc_vcpu_run_pr+0x150/0x2e0
 [c001dc53b9e0] c008341c .kvmppc_vcpu_run+0x2c/0x40
 [c001dc53ba50] c0080af4 .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
 [c001dc53bae0] c007b4c8 .kvm_vcpu_ioctl+0x478/0x730
 [c001dc53bca0] c02140cc .do_vfs_ioctl+0x4ac/0x770
 [c001dc53bd80] c02143e8 .SyS_ioctl+0x58/0xb0
 [c001dc53be30] c0009e58 syscall_exit+0x0/0x98
 --- Exception: c00 (System Call) at 1f960160
 SP (1ecbe3c0) is in userspace

 These changes were originally part of
 http://mid.gmane.org/20130806042205.gr19...@iris.ozlabs.ibm.com

 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
  arch/powerpc/include/asm/kvm_book3s_asm.h | 1 +
  arch/powerpc/kernel/asm-offsets.c | 1 +
  arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 7 +++
  3 files changed, 5 insertions(+), 4 deletions(-)

 diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
 b/arch/powerpc/include/asm/kvm_book3s_asm.h
 index 0bd9348..69fe837 100644
 --- a/arch/powerpc/include/asm/kvm_book3s_asm.h
 +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
 @@ -79,6 +79,7 @@ struct kvmppc_host_state {
   ulong vmhandler;
   ulong scratch0;
   ulong scratch1;
 + ulong scratch2;
   u8 in_guest;
   u8 restore_hid5;
   u8 napping;
 diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
 index 8e6ede6..841a4c8 100644
 --- a/arch/powerpc/kernel/asm-offsets.c
 +++ b/arch/powerpc/kernel/asm-offsets.c
 @@ -583,6 +583,7 @@ int main(void)
   HSTATE_FIELD(HSTATE_VMHANDLER, vmhandler);
   HSTATE_FIELD(HSTATE_SCRATCH0, scratch0);
   HSTATE_FIELD(HSTATE_SCRATCH1, scratch1);
 + HSTATE_FIELD(HSTATE_SCRATCH2, scratch2);
   HSTATE_FIELD(HSTATE_IN_GUEST, in_guest);
   HSTATE_FIELD(HSTATE_RESTORE_HID5, restore_hid5);
   HSTATE_FIELD(HSTATE_NAPPING, napping);
 diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 index 339aa5e..16f7654 100644
 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 @@ -750,15 +750,14 @@ kvmppc_interrupt_hv:
    * guest CR, R12 saved in shadow VCPU SCRATCH1/0
    * guest R13 saved in SPRN_SCRATCH0
    */
 - /* abuse host_r2 as third scratch area; we get r2 from PACATOC(r13) */
 - std r9, HSTATE_HOST_R2(r13)
 + std r9, HSTATE_SCRATCH2(r13)
  
   lbz r9, HSTATE_IN_GUEST(r13)
   cmpwi   r9, KVM_GUEST_MODE_HOST_HV
   beq kvmppc_bad_host_intr
  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
   cmpwi   r9, KVM_GUEST_MODE_GUEST
 - ld  r9, HSTATE_HOST_R2(r13)
 + ld  r9, HSTATE_SCRATCH2(r13)
   beq kvmppc_interrupt_pr
  #endif
   /* We're now back in the host but in guest MMU context */
 @@ -778,7 +777,7 @@ kvmppc_interrupt_hv:
   std r6, VCPU_GPR(R6)(r9)
   std r7, VCPU_GPR(R7)(r9)
   std r8, VCPU_GPR(R8)(r9)
 - ld  r0, HSTATE_HOST_R2(r13)
 + ld  r0, HSTATE_SCRATCH2(r13)
   std r0, VCPU_GPR(R9)(r9)
   std r10, VCPU_GPR(R10)(r9)
   std r11, VCPU_GPR(R11)(r9)
 -- 
 1.8.3.2
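
For readers following the thread, here is a minimal user-space sketch in C of why parking a guest register in the host_r2 slot is a problem once PR is in the picture. It is only a model, not kernel code: the struct host_state_model and all function names are hypothetical, the register values are made up, and it assumes (as the commit message implies) that the PR exit path takes the host r2 from the saved HOST_R2 slot instead of reloading it from PACATOC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct kvmppc_host_state. */
struct host_state_model {
    uint64_t host_r2;   /* host r2 (TOC pointer) saved before entering the guest */
    uint64_t scratch2;  /* dedicated scratch slot, as added by the patch */
};

/* Pre-patch interrupt entry: guest r9 is parked in the host_r2 slot. */
static void entry_before_patch(struct host_state_model *hs, uint64_t guest_r9)
{
    hs->host_r2 = guest_r9;
}

/* Post-patch interrupt entry: guest r9 goes to its own scratch slot. */
static void entry_after_patch(struct host_state_model *hs, uint64_t guest_r9)
{
    hs->scratch2 = guest_r9;
}

/*
 * Assumption: the PR exit path takes the host r2 from the saved slot,
 * because PR does not reload it from PACATOC.
 */
static uint64_t pr_exit_host_r2(const struct host_state_model *hs)
{
    return hs->host_r2;
}

/* Runs both variants on the same input; returns 1 if the model holds. */
int check_model(void)
{
    const uint64_t host_toc = 0x1000;  /* made-up host TOC value */
    const uint64_t guest_r9 = 0x2000;  /* made-up guest register value */
    struct host_state_model before = { .host_r2 = host_toc, .scratch2 = 0 };
    struct host_state_model after = before;

    entry_before_patch(&before, guest_r9);
    entry_after_patch(&after, guest_r9);

    assert(pr_exit_host_r2(&before) == guest_r9); /* host r2 was clobbered */
    assert(pr_exit_host_r2(&after) == host_toc);  /* host r2 is intact */
    assert(after.scratch2 == guest_r9);           /* guest r9 is still recoverable */
    return 1;
}
```

In the pre-patch variant the value later read back as host r2 is really the guest's r9, which would be consistent with the bad data access in the trace above; with a separate scratch2 slot the saved host r2 is never touched.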


