Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread tiejun.chen

On 04/02/2013 06:47 AM, Scott Wood wrote:

Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it.  If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.

This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device.  Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc.  It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.

Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.

Signed-off-by: Scott Wood scottw...@freescale.com
---
  Documentation/virtual/kvm/api.txt|   70 ++
  Documentation/virtual/kvm/devices/README |1 +
  arch/powerpc/include/asm/kvm_host.h  |6 +++
  arch/powerpc/include/asm/kvm_ppc.h   |2 +
  arch/powerpc/kvm/powerpc.c   |7 +++
  include/uapi/linux/kvm.h |   27 
  virt/kvm/kvm_main.c  |   31 +
  7 files changed, 144 insertions(+)
  create mode 100644 Documentation/virtual/kvm/devices/README

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 976eb65..77328aa 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from 
the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.

+4.79 KVM_CREATE_DEVICE
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: vm ioctl
+Parameters: struct kvm_create_device (in/out)
+Returns: 0 on success, -1 on error
+Errors:
+  ENODEV: The device type is unknown or unsupported
+  EEXIST: Device already created, and this type of device may not
+  be instantiated multiple times
+  ENOSPC: Too many devices have been created
+
+  Other error conditions may be defined by individual device types.
+
+Creates an emulated device in the kernel.  The file descriptor returned
+in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
+
+If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
+device type is supported (not necessarily whether it can be created
+in the current vm).
+
+Individual devices should not define flags.  Attributes should be used
+for specifying any behavior that is not implied by the device type
+number.
+
+struct kvm_create_device {
+   __u32   type;   /* in: KVM_DEV_TYPE_xxx */
+   __u32   fd; /* out: device handle */
+   __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
+};
+
+4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+  EPERM:  The attribute cannot (currently) be accessed this way
+  (e.g. read-only attribute, or attribute that only makes
+  sense when the device is in a different state)
+
+  Other error conditions may be defined by individual device types.
+
+Gets/sets a specified piece of device configuration and/or state.  The
+semantics are device-specific.  See individual device documentation in
+the devices directory.  As with ONE_REG, the size of the data
+transferred is defined by the particular attribute.
+
+struct kvm_device_attr {
+   __u32   flags;  /* no flags currently defined */
+   __u32   group;  /* device-defined */
+   __u64   attr;   /* group-defined */
+   __u64   addr;   /* userspace address of attr data */
+};
+
+4.81 KVM_HAS_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+Tests whether a device supports a particular attribute.  A successful
+return indicates the attribute is implemented.  It does not necessarily
+indicate that the attribute can be read or written in the device's
+current state.  addr is ignored.

  4.77 KVM_ARM_VCPU_INIT

diff --git a/Documentation/virtual/kvm/devices/README 
b/Documentation/virtual/kvm/devices/README
new file mode 100644
index 000..34a6983
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/README
@@ -0,0 +1 @@
+This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..e0caae2 100644
--- 

Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-02 Thread Alexander Graf

On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

 
 
 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 28, 2013 10:06 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 
 On 21.03.2013, at 07:25, Bharat Bhushan wrote:
 
 From: Bharat Bhushan bharat.bhus...@freescale.com
 
 This patch adds the debug stub support on booke/bookehv.
 Now QEMU debug stub can use hw breakpoint, watchpoint and software
 breakpoint to debug guest.
 
 Debug registers are saved/restored on vcpu_put()/vcpu_get().
 Also the debug registers are saved restored only if guest is using
 debug resources.
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 v2:
 - save/restore in vcpu_get()/vcpu_put()
 - some more minor cleanup based on review comments.
 
 arch/powerpc/include/asm/kvm_host.h |   10 ++
 arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
 arch/powerpc/kvm/booke.c|  252 
 ---
 arch/powerpc/kvm/e500_emulate.c |   10 ++
 4 files changed, 272 insertions(+), 22 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_host.h
 b/arch/powerpc/include/asm/kvm_host.h
 index f4ba881..8571952 100644
 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
 u32 mmucfg;
 u32 epr;
 u32 crit_save;
 +   /* guest debug registers*/
 struct kvmppc_booke_debug_reg dbg_reg;
 +   /* shadow debug registers */
 +   struct kvmppc_booke_debug_reg shadow_dbg_reg;
 +   /* host debug registers*/
 +   struct kvmppc_booke_debug_reg host_dbg_reg;
 +   /*
 +* Flag indicating that debug registers are used by guest
 +* and requires save restore.
 +   */
 +   bool debug_save_restore;
 #endif
 gpa_t paddr_accessed;
 gva_t vaddr_accessed;
 diff --git a/arch/powerpc/include/uapi/asm/kvm.h
 b/arch/powerpc/include/uapi/asm/kvm.h
 index 15f9a00..d7ce449 100644
 --- a/arch/powerpc/include/uapi/asm/kvm.h
 +++ b/arch/powerpc/include/uapi/asm/kvm.h
 @@ -25,6 +25,7 @@
 /* Select powerpc specific features in linux/kvm.h */ #define
 __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
 +#define __KVM_HAVE_GUEST_DEBUG
 
 struct kvm_regs {
 __u64 pc;
 @@ -267,7 +268,24 @@ struct kvm_fpu {
 __u64 fpr[32];
 };
 
 +/*
 + * Defines for h/w breakpoint, watchpoint (read, write or both) and
 + * software breakpoint.
 + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
 + * for KVM_DEBUG_EXIT.
 + */
 +#define KVMPPC_DEBUG_NONE  0x0
 +#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
 +#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
 +#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
 struct kvm_debug_exit_arch {
 +   __u64 address;
 +   /*
 +* exiting to userspace because of h/w breakpoint, watchpoint
 +* (read, write or both) and software breakpoint.
 +*/
 +   __u32 status;
 +   __u32 reserved;
 };
 
 /* for KVM_SET_GUEST_DEBUG */
 @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
  * Type denotes h/w breakpoint, read watchpoint, write
  * watchpoint or watchpoint (both read and write).
  */
 -#define KVMPPC_DEBUG_NOTYPE0x0
 -#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
 -#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
 -#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
 __u32 type;
 __u32 reserved;
 } bp[16];
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index
 1de93a8..bf20056 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu
 *vcpu) #endif }
 
 +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
 +   /* Synchronize guest's desire to get debug interrupts into shadow
 +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
 +   vcpu-arch.shadow_msr = ~MSR_DE;
 +   vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
 +
 +   /* Force enable debug interrupts when user space wants to debug */
 +   if (vcpu-guest_debug) {
 +#ifdef CONFIG_KVM_BOOKE_HV
 +   /*
 +* Since there is no shadow MSR, sync MSR_DE into the guest
 +* visible MSR. Do not allow guest to change MSR[DE].
 +*/
 +   vcpu-arch.shared-msr |= MSR_DE;
 +   mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
 
 This mtspr should really just be a bit or in shadow_mspr when guest_debug 
 gets
 enabled. It should automatically get synchronized as soon as the next
 vpcu_load() happens.
 
 I think this is not required here as shadow_dbsr already have MSRP_DEP set.
 
 Will setup shadow_msrp when setting guest_debug and clear shadow_msrp when 
 guest_debug is cleared.
 But that will also not be sufficient as it not sure when vcpu_load() will be 
 called after the shadow_msrp is changed. So 

RE: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-02 Thread Bhushan Bharat-R65777


 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, April 02, 2013 1:57 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support
 
 
 On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:
 
 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 28, 2013 10:06 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  On 21.03.2013, at 07:25, Bharat Bhushan wrote:
 
  From: Bharat Bhushan bharat.bhus...@freescale.com
 
  This patch adds the debug stub support on booke/bookehv.
  Now QEMU debug stub can use hw breakpoint, watchpoint and software
  breakpoint to debug guest.
 
  Debug registers are saved/restored on vcpu_put()/vcpu_get().
  Also the debug registers are saved restored only if guest is using
  debug resources.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v2:
  - save/restore in vcpu_get()/vcpu_put()
  - some more minor cleanup based on review comments.
 
  arch/powerpc/include/asm/kvm_host.h |   10 ++
  arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
  arch/powerpc/kvm/booke.c|  252 
  -
 --
  arch/powerpc/kvm/e500_emulate.c |   10 ++
  4 files changed, 272 insertions(+), 22 deletions(-)
 
  diff --git a/arch/powerpc/include/asm/kvm_host.h
  b/arch/powerpc/include/asm/kvm_host.h
  index f4ba881..8571952 100644
  --- a/arch/powerpc/include/asm/kvm_host.h
  +++ b/arch/powerpc/include/asm/kvm_host.h
  @@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 epr;
u32 crit_save;
  + /* guest debug registers*/
struct kvmppc_booke_debug_reg dbg_reg;
  + /* shadow debug registers */
  + struct kvmppc_booke_debug_reg shadow_dbg_reg;
  + /* host debug registers*/
  + struct kvmppc_booke_debug_reg host_dbg_reg;
  + /*
  +  * Flag indicating that debug registers are used by guest
  +  * and requires save restore.
  + */
  + bool debug_save_restore;
  #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
  diff --git a/arch/powerpc/include/uapi/asm/kvm.h
  b/arch/powerpc/include/uapi/asm/kvm.h
  index 15f9a00..d7ce449 100644
  --- a/arch/powerpc/include/uapi/asm/kvm.h
  +++ b/arch/powerpc/include/uapi/asm/kvm.h
  @@ -25,6 +25,7 @@
  /* Select powerpc specific features in linux/kvm.h */ #define
  __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
  +#define __KVM_HAVE_GUEST_DEBUG
 
  struct kvm_regs {
__u64 pc;
  @@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
  };
 
  +/*
  + * Defines for h/w breakpoint, watchpoint (read, write or both) and
  + * software breakpoint.
  + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
  + * for KVM_DEBUG_EXIT.
  + */
  +#define KVMPPC_DEBUG_NONE0x0
  +#define KVMPPC_DEBUG_BREAKPOINT  (1UL  1)
  +#define KVMPPC_DEBUG_WATCH_WRITE (1UL  2)
  +#define KVMPPC_DEBUG_WATCH_READ  (1UL  3)
  struct kvm_debug_exit_arch {
  + __u64 address;
  + /*
  +  * exiting to userspace because of h/w breakpoint, watchpoint
  +  * (read, write or both) and software breakpoint.
  +  */
  + __u32 status;
  + __u32 reserved;
  };
 
  /* for KVM_SET_GUEST_DEBUG */
  @@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
  -#define KVMPPC_DEBUG_NOTYPE  0x0
  -#define KVMPPC_DEBUG_BREAKPOINT  (1UL  1)
  -#define KVMPPC_DEBUG_WATCH_WRITE (1UL  2)
  -#define KVMPPC_DEBUG_WATCH_READ  (1UL  3)
__u32 type;
__u32 reserved;
} bp[16];
  diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
  index
  1de93a8..bf20056 100644
  --- a/arch/powerpc/kvm/booke.c
  +++ b/arch/powerpc/kvm/booke.c
  @@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct
  kvm_vcpu
  *vcpu) #endif }
 
  +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
  + /* Synchronize guest's desire to get debug interrupts into shadow
  +MSR */ #ifndef CONFIG_KVM_BOOKE_HV
  + vcpu-arch.shadow_msr = ~MSR_DE;
  + vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
  +
  + /* Force enable debug interrupts when user space wants to debug */
  + if (vcpu-guest_debug) {
  +#ifdef CONFIG_KVM_BOOKE_HV
  + /*
  +  * Since there is no shadow MSR, sync MSR_DE into the guest
  +  * visible MSR. Do not allow guest to change MSR[DE].
  +  */
  + vcpu-arch.shared-msr |= MSR_DE;
  + mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);
 
  This mtspr should really just be a bit or in shadow_mspr when
  guest_debug gets enabled. It should automatically get synchronized as
  soon as the next
  

Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-02 Thread Alexander Graf

On 04/02/2013 04:09 PM, Bhushan Bharat-R65777 wrote:



-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Tuesday, April 02, 2013 1:57 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support


On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:




-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, March 28, 2013 10:06 PM
To: Bhushan Bharat-R65777
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421;
Bhushan
Bharat-R65777
Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
support


On 21.03.2013, at 07:25, Bharat Bhushan wrote:


From: Bharat Bhushanbharat.bhus...@freescale.com

This patch adds the debug stub support on booke/bookehv.
Now QEMU debug stub can use hw breakpoint, watchpoint and software
breakpoint to debug guest.

Debug registers are saved/restored on vcpu_put()/vcpu_get().
Also the debug registers are saved restored only if guest is using
debug resources.

Signed-off-by: Bharat Bhushanbharat.bhus...@freescale.com
---
v2:
- save/restore in vcpu_get()/vcpu_put()
- some more minor cleanup based on review comments.

arch/powerpc/include/asm/kvm_host.h |   10 ++
arch/powerpc/include/uapi/asm/kvm.h |   22 +++-
arch/powerpc/kvm/booke.c|  252 -

--

arch/powerpc/kvm/e500_emulate.c |   10 ++
4 files changed, 272 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h
b/arch/powerpc/include/asm/kvm_host.h
index f4ba881..8571952 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -504,7 +504,17 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 epr;
u32 crit_save;
+   /* guest debug registers*/
struct kvmppc_booke_debug_reg dbg_reg;
+   /* shadow debug registers */
+   struct kvmppc_booke_debug_reg shadow_dbg_reg;
+   /* host debug registers*/
+   struct kvmppc_booke_debug_reg host_dbg_reg;
+   /*
+* Flag indicating that debug registers are used by guest
+* and requires save restore.
+   */
+   bool debug_save_restore;
#endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h
b/arch/powerpc/include/uapi/asm/kvm.h
index 15f9a00..d7ce449 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -25,6 +25,7 @@
/* Select powerpc specific features inlinux/kvm.h  */ #define
__KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT
+#define __KVM_HAVE_GUEST_DEBUG

struct kvm_regs {
__u64 pc;
@@ -267,7 +268,24 @@ struct kvm_fpu {
__u64 fpr[32];
};

+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE  0x0
+#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
+#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
+#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
struct kvm_debug_exit_arch {
+   __u64 address;
+   /*
+* exiting to userspace because of h/w breakpoint, watchpoint
+* (read, write or both) and software breakpoint.
+*/
+   __u32 status;
+   __u32 reserved;
};

/* for KVM_SET_GUEST_DEBUG */
@@ -279,10 +297,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
-#define KVMPPC_DEBUG_NOTYPE0x0
-#define KVMPPC_DEBUG_BREAKPOINT(1UL  1)
-#define KVMPPC_DEBUG_WATCH_WRITE   (1UL  2)
-#define KVMPPC_DEBUG_WATCH_READ(1UL  3)
__u32 type;
__u32 reserved;
} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index
1de93a8..bf20056 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,30 @@ static void kvmppc_vcpu_sync_fpu(struct
kvm_vcpu
*vcpu) #endif }

+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu) {
+   /* Synchronize guest's desire to get debug interrupts into shadow
+MSR */ #ifndef CONFIG_KVM_BOOKE_HV
+   vcpu-arch.shadow_msr= ~MSR_DE;
+   vcpu-arch.shadow_msr |= vcpu-arch.shared-msr  MSR_DE; #endif
+
+   /* Force enable debug interrupts when user space wants to debug */
+   if (vcpu-guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+   /*
+* Since there is no shadow MSR, sync MSR_DE into the guest
+* visible MSR. Do not allow guest to change MSR[DE].
+*/
+   vcpu-arch.shared-msr |= MSR_DE;
+   mtspr(SPRN_MSRP, mfspr(SPRN_MSRP) | MSRP_DEP);

This mtspr should really just be a bit or in shadow_mspr when
guest_debug gets 

Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub support

2013-04-02 Thread Scott Wood

On 04/02/2013 09:09:34 AM, Bhushan Bharat-R65777 wrote:



 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, April 02, 2013 1:57 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421
 Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub  
support



 On 29.03.2013, at 07:04, Bhushan Bharat-R65777 wrote:

 
 
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, March 28, 2013 10:06 PM
  To: Bhushan Bharat-R65777
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood  
Scott-B07421;

  Bhushan
  Bharat-R65777
  Subject: Re: [PATCH 4/4 v2] KVM: PPC: Add userspace debug stub
  support
 
 
  How does the normal debug register switching code work in Linux?
  Can't we just reuse that? Or rely on it to restore working state  
when

  another process gets scheduled in?
 
  Good point, I can see debug registers loading in function  
__switch_to()-

 switch_booke_debug_regs() in file arch/powerpc/kernel/process.c.
  So as long as assume that host will not use debug resources we  
can rely on
 this restore. But I am not sure that this is a fare assumption. As  
Scott earlier

 mentioned someone can use debug resource for kernel debugging also.

 Someone in the kernel can also use floating point registers. But  
then it's his

 responsibility to clean up the mess he leaves behind.

I am neither convinced by what you said and nor even have much reason  
to oppose :)


Scott,
	I remember you mentioned that host can use debug resources, you  
comment on this ?


I thought the conclusion we reached was that it was OK as long as KVM  
waits until it actually needs the debug resources to mess with the  
registers.


-Scott
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread Paul Mackerras
On Mon, Apr 01, 2013 at 05:47:48PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device state,
 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.
 
 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.
 
 Both device types and individual attributes can be tested without having
 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.
 
 Signed-off-by: Scott Wood scottw...@freescale.com

Some comments below...

 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 976eb65..77328aa 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
 from the data
  written, then `n_invalid' invalid entries, invalidating any previously
  valid entries found.
  
 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL

I notice this patch doesn't add this capability; you add it in a later
patch.  Since this patch adds the KVM_CREATE_DEVICE ioctl, it probably
should add the KVM_CAP_DEVICE_CTRL capability too.


 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +  ENOSPC: Too many devices have been created

Is this still a possible error code?

 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -370,6 +370,9 @@ struct kvmppc_booke_debug_reg {
   u64 dac[KVMPPC_BOOKE_MAX_DAC];
  };
  
 +#define KVMPPC_IRQCHIP_NONE  0
 +#define KVMPPC_IRQCHIP_MPIC  1

This define should go in the patch that adds the MPIC device.

  struct kvm_vcpu_arch {
   ulong host_stack;
   u32 host_pid;
 @@ -549,6 +552,9 @@ struct kvm_vcpu_arch {
   unsigned long magic_page_pa; /* phys addr to map the magic page to */
   unsigned long magic_page_ea; /* effect. addr to map the magic page to */
  
 + int irqchip_type;
 + void *irqchip_priv;

Since you add this (irqchip_priv) only to remove it in a later patch
and replace it by a device-specific pointer, why bother adding it
here?  And why not give irqchip_type the name it ultimately ends up
with?

 diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
 index 16b4595..bdfa526 100644
 --- a/arch/powerpc/kvm/powerpc.c
 +++ b/arch/powerpc/kvm/powerpc.c
 @@ -459,6 +459,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
   tasklet_kill(vcpu-arch.tasklet);
  
   kvmppc_remove_vcpu_debugfs(vcpu);
 +
 + switch (vcpu-arch.irqchip_type) {
 + case KVMPPC_IRQCHIP_MPIC:
 + mpic_put(vcpu-arch.irqchip_priv);
 + break;
 + }

This is going to break bisection, since you don't define mpic_put() in
this patch.

 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 74d0ff3..20ce2d2 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -668,6 +668,7 @@ struct kvm_ppc_smmu_info {
  #define KVM_CAP_PPC_EPR 86
  #define KVM_CAP_ARM_PSCI 87
  #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
 +#define KVM_CAP_DEVICE_CTRL 89
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
 @@ -909,6 +910,32 @@ struct kvm_s390_ucas_mapping {
  #define KVM_ARM_SET_DEVICE_ADDR_IOW(KVMIO,  0xab, struct 
 kvm_arm_device_addr)
  
  /*
 + * Device control API, available with KVM_CAP_DEVICE_CTRL
 + */
 +#define KVM_CREATE_DEVICE_TEST   1
 +
 +struct kvm_create_device {
 + __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 + __u32   fd; /* out: device handle */
 + __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 +
 +struct kvm_device_attr {
 + __u32   flags;  /* no flags currently defined */
 + __u32   group;  /* device-defined */
 + __u64   attr;   /* group-defined */
 + __u64   addr;   /* userspace address of attr data */
 +};
 +
 +/* ioctl for vm fd */
 +#define KVM_CREATE_DEVICE  _IOWR(KVMIO,  0xe0, struct kvm_create_device)

This define should go with the other VM ioctls, otherwise the next
person to add a VM ioctl will probably miss it and reuse the 0xe0
code.

Paul.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info 

Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread Scott Wood

On 04/02/2013 08:02:39 PM, Paul Mackerras wrote:

On Mon, Apr 01, 2013 at 05:47:48PM -0500, Scott Wood wrote:
 Currently, devices that are emulated inside KVM are configured in a
 hardcoded manner based on an assumption that any given architecture
 only has one way to do it.  If there's any need to access device  
state,

 it is done through inflexible one-purpose-only IOCTLs (e.g.
 KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
 cumbersome and depletes a limited numberspace.

 This API provides a mechanism to instantiate a device of a certain
 type, returning an ID that can be used to set/get attributes of the
 device.  Attributes may include configuration parameters (e.g.
 register base address), device state, operational commands, etc.  It
 is similar to the ONE_REG API, except that it acts on devices rather
 than vcpus.

 Both device types and individual attributes can be tested without  
having

 to create the device or get/set the attribute, without the need for
 separately managing enumerated capabilities.

 Signed-off-by: Scott Wood scottw...@freescale.com

Some comments below...

 diff --git a/Documentation/virtual/kvm/api.txt  
b/Documentation/virtual/kvm/api.txt

 index 976eb65..77328aa 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with  
contents from the data
  written, then `n_invalid' invalid entries, invalidating any  
previously

  valid entries found.

 +4.79 KVM_CREATE_DEVICE
 +
 +Capability: KVM_CAP_DEVICE_CTRL

I notice this patch doesn't add this capability;


Yes, it does (see below).


you add it in a later patch.


Maybe you're thinking of KVM_CAP_IRQ_MPIC?


 +Type: vm ioctl
 +Parameters: struct kvm_create_device (in/out)
 +Returns: 0 on success, -1 on error
 +Errors:
 +  ENODEV: The device type is unknown or unsupported
 +  EEXIST: Device already created, and this type of device may not
 +  be instantiated multiple times
 +  ENOSPC: Too many devices have been created

Is this still a possible error code?


If you mean ENOSPC, probably not -- it'd be replaced with whatever  
errors can come out of creating a file descriptor.



 --- a/arch/powerpc/include/asm/kvm_host.h
 +++ b/arch/powerpc/include/asm/kvm_host.h
 @@ -370,6 +370,9 @@ struct kvmppc_booke_debug_reg {
u64 dac[KVMPPC_BOOKE_MAX_DAC];
  };

 +#define KVMPPC_IRQCHIP_NONE   0
 +#define KVMPPC_IRQCHIP_MPIC   1

This define should go in the patch that adds the MPIC device.

  struct kvm_vcpu_arch {
ulong host_stack;
u32 host_pid;
 @@ -549,6 +552,9 @@ struct kvm_vcpu_arch {
  	unsigned long magic_page_pa; /* phys addr to map the magic page  
to */
  	unsigned long magic_page_ea; /* effect. addr to map the magic  
page to */


 +  int irqchip_type;
 +  void *irqchip_priv;

Since you add this (irqchip_priv) only to remove it in a later patch
and replace it by a device-specific pointer, why bother adding it
here?  And why not give irqchip_type the name it ultimately ends up
with?


Oops... These were patch shuffling accidents and will be removed from  
the next iteration.



 diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
 index 16b4595..bdfa526 100644
 --- a/arch/powerpc/kvm/powerpc.c
 +++ b/arch/powerpc/kvm/powerpc.c
 @@ -459,6 +459,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
tasklet_kill(vcpu-arch.tasklet);

kvmppc_remove_vcpu_debugfs(vcpu);
 +
 +  switch (vcpu-arch.irqchip_type) {
 +  case KVMPPC_IRQCHIP_MPIC:
 +  mpic_put(vcpu-arch.irqchip_priv);
 +  break;
 +  }

This is going to break bisection, since you don't define mpic_put() in
this patch.


Sigh.  Something got messed up; I'll try to sort it out and resubmit.


 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 74d0ff3..20ce2d2 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -668,6 +668,7 @@ struct kvm_ppc_smmu_info {
  #define KVM_CAP_PPC_EPR 86
  #define KVM_CAP_ARM_PSCI 87
  #define KVM_CAP_ARM_SET_DEVICE_ADDR 88
 +#define KVM_CAP_DEVICE_CTRL 89


See, here's the capability. :-)


  /*
 + * Device control API, available with KVM_CAP_DEVICE_CTRL
 + */
 +#define KVM_CREATE_DEVICE_TEST1
 +
 +struct kvm_create_device {
 +  __u32   type;   /* in: KVM_DEV_TYPE_xxx */
 +  __u32   fd; /* out: device handle */
 +  __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
 +};
 +
 +struct kvm_device_attr {
 +  __u32   flags;  /* no flags currently defined */
 +  __u32   group;  /* device-defined */
 +  __u64   attr;   /* group-defined */
 +  __u64   addr;   /* userspace address of attr data */
 +};
 +
 +/* ioctl for vm fd */
 +#define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct  
kvm_create_device)


This define should go with the other VM ioctls, otherwise the next
person to add a VM ioctl will probably miss it and reuse the 0xe0
code.


That's actually why I moved it to a new 

Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread tiejun.chen

On 04/03/2013 01:30 AM, Scott Wood wrote:

On 04/02/2013 01:59:57 AM, tiejun.chen wrote:

On 04/02/2013 06:47 AM, Scott Wood wrote:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff71541..ed033c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2158,6 +2158,17 @@ out:
  }
  #endif

+static int kvm_ioctl_create_device(struct kvm *kvm,
+   struct kvm_create_device *cd)
+{
+bool test = cd-flags  KVM_CREATE_DEVICE_TEST;
+
+switch (cd-type) {
+default:
+return -ENODEV;
+}


Even after apply patch 5, looks here still misses something like:

if (test)
WARN_ON_ONCE(!cd-type);


Why?  How does userspace passing in a bad type value mean the kernel needs to
report internal badness, why is a value of zero worse than any other bad value,
and why only when the test flag is set?


I just mean we need do something here since looks the 'test' variable is defined 
but unused, right? But please correct this as you expect :)


And if the userspace can't guarantee cd-type is never zero, we should return 
-ENODEV as well after that switch().


Tiejun
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread tiejun.chen

On 04/03/2013 09:34 AM, Scott Wood wrote:

On 04/02/2013 08:28:01 PM, tiejun.chen wrote:

On 04/03/2013 01:30 AM, Scott Wood wrote:

On 04/02/2013 01:59:57 AM, tiejun.chen wrote:

On 04/02/2013 06:47 AM, Scott Wood wrote:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff71541..ed033c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2158,6 +2158,17 @@ out:
  }
  #endif

+static int kvm_ioctl_create_device(struct kvm *kvm,
+   struct kvm_create_device *cd)
+{
+bool test = cd-flags  KVM_CREATE_DEVICE_TEST;
+
+switch (cd-type) {
+default:
+return -ENODEV;
+}


Even after apply patch 5, looks here still misses something like:

if (test)
WARN_ON_ONCE(!cd-type);


Why?  How does userspace passing in a bad type value mean the kernel needs to
report internal badness, why is a value of zero worse than any other bad value,
and why only when the test flag is set?


I just mean we need do something here since looks the 'test' variable is
defined but unused, right? But please correct this as you expect :)


Yes, it's unused in this patch, but is used after patch 5 is applied.  I didn't
think it was worth adding a temporary unused annotation, since this part of the
kernel doesn't use -Werror.


Yes, its accepted in !-Werror case if we shouldn't warn something as you said.

Tiejun

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-02 Thread Scott Wood
Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 Documentation/virtual/kvm/api.txt   |8 ++
 arch/powerpc/include/asm/kvm_host.h |8 ++
 arch/powerpc/include/asm/kvm_ppc.h  |2 ++
 arch/powerpc/kvm/booke.c|4 ++-
 arch/powerpc/kvm/mpic.c |   49 +++
 arch/powerpc/kvm/powerpc.c  |   26 +++
 include/uapi/linux/kvm.h|1 +
 7 files changed, 92 insertions(+), 6 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index d52f3f9..4c326ae 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2728,3 +2728,11 @@ to receive the topmost interrupt vector.
 When disabled (args[0] == 0), behavior is as if this facility is unsupported.
 
 When this capability is enabled, KVM_EXIT_EPR can occur.
+
+6.6 KVM_CAP_IRQ_MPIC
+
+Architectures: ppc
+Parameters: args[0] is the MPIC device fd
+args[1] is the MPIC CPU number for this vcpu
+
+This capability connects the vcpu to an in-kernel MPIC device.
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 7e7aef9..2a2e235 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -375,6 +375,11 @@ struct kvmppc_booke_debug_reg {
u64 dac[KVMPPC_BOOKE_MAX_DAC];
 };
 
+#define KVMPPC_IRQ_DEFAULT 0
+#define KVMPPC_IRQ_MPIC1
+
+struct openpic;
+
 struct kvm_vcpu_arch {
ulong host_stack;
u32 host_pid;
@@ -554,6 +559,9 @@ struct kvm_vcpu_arch {
unsigned long magic_page_pa; /* phys addr to map the magic page to */
unsigned long magic_page_ea; /* effect. addr to map the magic page to */
 
+   int irq_type;   /* one of KVM_IRQ_* */
+   struct openpic *mpic;   /* KVM_IRQ_MPIC */
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
struct kvm_vcpu_arch_shared shregs;
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 3b63b97..f54707f 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -276,6 +276,8 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, 
u32 epr)
 }
 
 void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+int kvmppc_mpic_connect_vcpu(struct file *mpic_filp, struct kvm_vcpu *vcpu,
+u32 cpu);
 
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
  struct kvm_config_tlb *cfg);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cddc6b3..7d00222 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -430,8 +430,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
if (update_epr == true) {
if (vcpu-arch.epr_flags  KVMPPC_EPR_USER)
kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
-   else if (vcpu-arch.epr_flags  KVMPPC_EPR_KERNEL)
+   else if (vcpu-arch.epr_flags  KVMPPC_EPR_KERNEL) {
+   BUG_ON(vcpu-arch.irq_type != KVMPPC_IRQ_MPIC);
kvmppc_mpic_set_epr(vcpu);
+   }
}
 
new_msr = msr_mask;
diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
index 8cda2fa..caffe3b 100644
--- a/arch/powerpc/kvm/mpic.c
+++ b/arch/powerpc/kvm/mpic.c
@@ -1159,7 +1159,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct 
irq_dest *dst,
 
 void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 {
-   struct openpic *opp = vcpu-arch.irqchip_priv;
+   struct openpic *opp = vcpu-arch.mpic;
int cpu = vcpu-vcpu_id;
unsigned long flags;
 
@@ -1442,10 +1442,10 @@ static void map_mmio(struct openpic *opp)
 
 static void unmap_mmio(struct openpic *opp)
 {
-   BUG_ON(opp-mmio_mapped);
-   opp-mmio_mapped = false;
-
-   kvm_io_bus_unregister_dev(opp-kvm, KVM_MMIO_BUS, opp-mmio);
+   if (opp-mmio_mapped) {
+   opp-mmio_mapped = false;
+   kvm_io_bus_unregister_dev(opp-kvm, KVM_MMIO_BUS, opp-mmio);
+   }
 }
 
 static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
@@ -1681,6 +1681,45 @@ static const struct file_operations kvm_mpic_fops = {
.release = kvm_mpic_release,
 };
 
+int kvmppc_mpic_connect_vcpu(struct file *mpic_filp, struct kvm_vcpu *vcpu,
+u32 cpu)
+{
+   struct openpic *opp = mpic_filp-private_data;
+   int ret = 0;

[RFC PATCH v3 5/6] kvm/ppc/mpic: in-kernel MPIC emulation

2013-04-02 Thread Scott Wood
Hook the MPIC code up to the KVM interfaces, add locking, etc.

TODO: irqfd support, split up into multiple patches, KVM_IRQ_LINE
support

Signed-off-by: Scott Wood scottw...@freescale.com
---
v3: mpic_put - kvmppc_mpic_put

 Documentation/virtual/kvm/devices/mpic.txt |   37 ++
 arch/powerpc/include/asm/kvm_host.h|8 +-
 arch/powerpc/include/asm/kvm_ppc.h |7 +
 arch/powerpc/kvm/Kconfig   |5 +
 arch/powerpc/kvm/Makefile  |2 +
 arch/powerpc/kvm/booke.c   |   10 +-
 arch/powerpc/kvm/mpic.c|  814 +---
 arch/powerpc/kvm/powerpc.c |   12 +-
 include/linux/kvm_host.h   |2 +
 include/uapi/linux/kvm.h   |9 +
 virt/kvm/kvm_main.c|9 +
 11 files changed, 714 insertions(+), 201 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt

diff --git a/Documentation/virtual/kvm/devices/mpic.txt 
b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 000..79e000a
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,37 @@
+MPIC interrupt controller
+=
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20 Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42 Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated.  The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+  Base address of the 256 KiB MPIC register space.  Must be
+  naturally aligned.  A value of zero disables the mapping.
+  Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+Access an MPIC register, as if the access were made from the guest. 
+attr is the byte offset into the MPIC register space.  Accesses
+must be 4-byte aligned.
+
+MSIs may be signaled by using this attribute group to write
+to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+IRQ input line for each standard openpic source.  0 is inactive and 1
+is active, regardless of interrupt sense.
+
+For edge-triggered interrupts:  Writing 1 is considered an activating
+edge, and writing 0 is ignored.  Reading returns 1 if a previously
+signaled edge has not been acknowledged, and 0 otherwise.
+
+attr is the IRQ number.  IRQ numbers for standard sources are the
+byte offset of the relevant IVPR from EIVPR0, divided by 32.
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index e34f8fe..7e7aef9 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -359,6 +359,11 @@ struct kvmppc_slb {
 #define KVMPPC_BOOKE_MAX_IAC   4
 #define KVMPPC_BOOKE_MAX_DAC   2
 
+/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
+#define KVMPPC_EPR_NONE0 /* EPR not supported */
+#define KVMPPC_EPR_USER1 /* exit to userspace to fill EPR */
+#define KVMPPC_EPR_KERNEL  2 /* in-kernel irqchip */
+
 struct kvmppc_booke_debug_reg {
u32 dbcr0;
u32 dbcr1;
@@ -522,7 +527,7 @@ struct kvm_vcpu_arch {
u8 sane;
u8 cpu_type;
u8 hcall_needed;
-   u8 epr_enabled;
+   u8 epr_flags; /* KVMPPC_EPR_xxx */
u8 epr_needed;
 
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
@@ -589,5 +594,6 @@ struct kvm_vcpu_arch {
 #define KVM_MMIO_REG_FQPR  0x0060
 
 #define __KVM_HAVE_ARCH_WQP
+#define __KVM_HAVE_CREATE_DEVICE
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index f589307..3b63b97 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -164,6 +164,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
 
+int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
@@ -245,6 +247,9 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union 
kvmppc_one_reg *);
 
 void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
 
+struct openpic;
+void kvmppc_mpic_put(struct openpic *opp);
+
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
 {
@@ -270,6 +275,8 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, 
u32 epr)
 #endif
 }
 
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
+
 int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
  struct kvm_config_tlb *cfg);
 int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/Kconfig 

[RFC PATCH v3 2/6] kvm/ppc/mpic: import hw/openpic.c from QEMU

2013-04-02 Thread Scott Wood
This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e (Update version for
1.4.0-rc0), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/mpic.c | 1686 +++
 1 file changed, 1686 insertions(+)
 create mode 100644 arch/powerpc/kvm/mpic.c

diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
new file mode 100644
index 000..57655b9
--- /dev/null
+++ b/arch/powerpc/kvm/mpic.c
@@ -0,0 +1,1686 @@
+/*
+ * OpenPIC emulation
+ *
+ * Copyright (c) 2004 Jocelyn Mayer
+ *   2011 Alexander Graf
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+/*
+ *
+ * Based on OpenPic implementations:
+ * - Intel GW80314 I/O companion chip developer's manual
+ * - Motorola MPC8245  MPC8540 user manuals.
+ * - Motorola MCP750 (aka Raven) programmer manual.
+ * - Motorola Harrier programmer manuel
+ *
+ * Serial interrupts, as implemented in Raven chipset are not supported yet.
+ *
+ */
+#include hw.h
+#include ppc/mac.h
+#include pci/pci.h
+#include openpic.h
+#include sysbus.h
+#include pci/msi.h
+#include qemu/bitops.h
+#include ppc.h
+
+//#define DEBUG_OPENPIC
+
+#ifdef DEBUG_OPENPIC
+static const int debug_openpic = 1;
+#else
+static const int debug_openpic = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+if (debug_openpic) { \
+printf(fmt , ## __VA_ARGS__); \
+} \
+} while (0)
+
+#define MAX_CPU 32
+#define MAX_SRC 256
+#define MAX_TMR 4
+#define MAX_IPI 4
+#define MAX_MSI 8
+#define MAX_IRQ (MAX_SRC + MAX_IPI + MAX_TMR)
+#define VID 0x03   /* MPIC version ID */
+
+/* OpenPIC capability flags */
+#define OPENPIC_FLAG_IDR_CRIT (1  0)
+#define OPENPIC_FLAG_ILR  (2  0)
+
+/* OpenPIC address map */
+#define OPENPIC_GLB_REG_START0x0
+#define OPENPIC_GLB_REG_SIZE 0x10F0
+#define OPENPIC_TMR_REG_START0x10F0
+#define OPENPIC_TMR_REG_SIZE 0x220
+#define OPENPIC_MSI_REG_START0x1600
+#define OPENPIC_MSI_REG_SIZE 0x200
+#define OPENPIC_SUMMARY_REG_START   0x3800
+#define OPENPIC_SUMMARY_REG_SIZE0x800
+#define OPENPIC_SRC_REG_START0x1
+#define OPENPIC_SRC_REG_SIZE (MAX_SRC * 0x20)
+#define OPENPIC_CPU_REG_START0x2
+#define OPENPIC_CPU_REG_SIZE 0x100 + ((MAX_CPU - 1) * 0x1000)
+
+/* Raven */
+#define RAVEN_MAX_CPU  2
+#define RAVEN_MAX_EXT 48
+#define RAVEN_MAX_IRQ 64
+#define RAVEN_MAX_TMR  MAX_TMR
+#define RAVEN_MAX_IPI  MAX_IPI
+
+/* Interrupt definitions */
+#define RAVEN_FE_IRQ (RAVEN_MAX_EXT)   /* Internal functional IRQ */
+#define RAVEN_ERR_IRQ(RAVEN_MAX_EXT + 1)   /* Error IRQ */
+#define RAVEN_TMR_IRQ(RAVEN_MAX_EXT + 2)   /* First timer IRQ */
+#define RAVEN_IPI_IRQ(RAVEN_TMR_IRQ + RAVEN_MAX_TMR)   /* First IPI 
IRQ */
+/* First doorbell IRQ */
+#define RAVEN_DBL_IRQ(RAVEN_IPI_IRQ + (RAVEN_MAX_CPU * RAVEN_MAX_IPI))
+
+typedef struct FslMpicInfo {
+   int max_ext;
+} FslMpicInfo;
+
+static FslMpicInfo fsl_mpic_20 = {
+   .max_ext = 12,
+};
+
+static FslMpicInfo fsl_mpic_42 = {
+   .max_ext = 12,
+};
+
+#define FRR_NIRQ_SHIFT16
+#define FRR_NCPU_SHIFT 8
+#define FRR_VID_SHIFT  0
+
+#define VID_REVISION_1_2   2
+#define VID_REVISION_1_3   3
+
+#define VIR_GENERIC  0x/* Generic Vendor ID */
+
+#define GCR_RESET0x8000
+#define GCR_MODE_PASS0x
+#define GCR_MODE_MIXED   0x2000
+#define GCR_MODE_PROXY   0x6000
+
+#define TBCR_CI   0x8000   /* count inhibit */
+#define TCCR_TOG  0x8000   /* toggles when decrement to zero */
+
+#define IDR_EP_SHIFT  31
+#define IDR_EP_MASK   (1  IDR_EP_SHIFT)

[RFC PATCH v3 1/6] kvm: add device control API

2013-04-02 Thread Scott Wood
Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it.  If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.

This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device.  Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc.  It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.

Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.

Signed-off-by: Scott Wood scottw...@freescale.com
---
v3: remove some changes that were merged into this patch by accident,
and fix the error documentation for KVM_CREATE_DEVICE.

NOTE: I had some difficulty figuring out what ioctl numbers I should
assign...  it seems that at one point care was taken to keep vcpu and
vm ioctls separate, but some overlap exists now (despite not exhausing
the ioctl space).  Some of that was my fault, but not all of it. :-)
I moved to a new ioctl range for device control -- please let me know
if there's something else you'd prefer I do.
---
 Documentation/virtual/kvm/api.txt|   70 ++
 Documentation/virtual/kvm/devices/README |1 +
 include/uapi/linux/kvm.h |   27 
 virt/kvm/kvm_main.c  |   31 +
 4 files changed, 129 insertions(+)
 create mode 100644 Documentation/virtual/kvm/devices/README

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 976eb65..d52f3f9 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents from 
the data
 written, then `n_invalid' invalid entries, invalidating any previously
 valid entries found.
 
+4.79 KVM_CREATE_DEVICE
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: vm ioctl
+Parameters: struct kvm_create_device (in/out)
+Returns: 0 on success, -1 on error
+Errors:
+  ENODEV: The device type is unknown or unsupported
+  EEXIST: Device already created, and this type of device may not
+  be instantiated multiple times
+
+  Other error conditions may be defined by individual device types or
+  have their standard meanings.
+
+Creates an emulated device in the kernel.  The file descriptor returned
+in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
+
+If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
+device type is supported (not necessarily whether it can be created
+in the current vm).
+
+Individual devices should not define flags.  Attributes should be used
+for specifying any behavior that is not implied by the device type
+number.
+
+struct kvm_create_device {
+   __u32   type;   /* in: KVM_DEV_TYPE_xxx */
+   __u32   fd; /* out: device handle */
+   __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
+};
+
+4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+  EPERM:  The attribute cannot (currently) be accessed this way
+  (e.g. read-only attribute, or attribute that only makes
+  sense when the device is in a different state)
+
+  Other error conditions may be defined by individual device types.
+
+Gets/sets a specified piece of device configuration and/or state.  The
+semantics are device-specific.  See individual device documentation in
+the devices directory.  As with ONE_REG, the size of the data
+transferred is defined by the particular attribute.
+
+struct kvm_device_attr {
+   __u32   flags;  /* no flags currently defined */
+   __u32   group;  /* device-defined */
+   __u64   attr;   /* group-defined */
+   __u64   addr;   /* userspace address of attr data */
+};
+
+4.81 KVM_HAS_DEVICE_ATTR
+
+Capability: KVM_CAP_DEVICE_CTRL
+Type: device ioctl
+Parameters: struct kvm_device_attr
+Returns: 0 on success, -1 on error
+Errors:
+  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+Tests whether a device supports a particular attribute.  A successful
+return indicates the attribute is implemented.  It does not necessarily
+indicate that the attribute can be read or written in the device's
+current state.  addr is ignored.
 
 4.77 KVM_ARM_VCPU_INIT
 
diff --git a/Documentation/virtual/kvm/devices/README 
b/Documentation/virtual/kvm/devices/README
new file mode 100644
index 000..34a6983

Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-02 Thread Paul Mackerras
On Tue, Apr 02, 2013 at 08:19:56PM -0500, Scott Wood wrote:
 On 04/02/2013 08:02:39 PM, Paul Mackerras wrote:
 On Mon, Apr 01, 2013 at 05:47:48PM -0500, Scott Wood wrote:
  +4.79 KVM_CREATE_DEVICE
  +
  +Capability: KVM_CAP_DEVICE_CTRL
 
 I notice this patch doesn't add this capability;
 
 Yes, it does (see below).
 
 you add it in a later patch.
 
 Maybe you're thinking of KVM_CAP_IRQ_MPIC?

No, I was referring to the addition to kvm_dev_ioctl_check_extension()
of a KVM_CAP_DEVICE_CTRL case.  Since this patch adds the code to handle
KVM_CREATE_DEVICE ioctl it should also add the code to return 1 if
userspace queries the KVM_CAP_DEVICE_CTRL capability.

  +/* ioctl for vm fd */
  +#define KVM_CREATE_DEVICE   _IOWR(KVMIO,  0xe0, struct
 kvm_create_device)
 
 This define should go with the other VM ioctls, otherwise the next
 person to add a VM ioctl will probably miss it and reuse the 0xe0
 code.
 
 That's actually why I moved it to a new section, with device control
 ioctls getting their own range, as the legacy device model and
 some other things did.  0xe0 is not the next ioctl that would be
 used for either vm or vcpu.  The ioctl numbering is actually already
 a mess, with sometimes care being taken to keep vcpu and vm ioctls
 from overlapping, but on other places overlapping does happen.  I'm
 not sure what exactly I should do here.

Well, even if you are using a new range, I still think that
KVM_CREATE_DEVICE, being a VM ioctl, should go next to the other VM
ioctls.  I guess it's ultimately up to the maintainers.

Paul.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html