Re: [RFC PATCH v3 5/6] kvm/ppc/mpic: in-kernel MPIC emulation

2013-04-04 Thread Alexander Graf

On 04.04.2013, at 00:54, Scott Wood wrote:

> On 04/03/2013 05:12:06 PM, Alexander Graf wrote:
>> Am 04.04.2013 um 00:07 schrieb Scott Wood :
>> > On 04/03/2013 04:58:56 PM, Alexander Graf wrote:
>> >> Am 03.04.2013 um 23:38 schrieb Scott Wood :
>> >> > On 04/03/2013 11:19:42 AM, Alexander Graf wrote:
>> >> >> On 03.04.2013, at 03:57, Scott Wood wrote:
>> >> >> > Hook the MPIC code up to the KVM interfaces, add locking, etc.
>> >> >> >
>> >> >> > TODO: irqfd support, split up into multiple patches, KVM_IRQ_LINE
>> >> >> > support
>> >> >> >
>> >> >> > Signed-off-by: Scott Wood 
>> >> >> > ---
>> >> >> > v3: mpic_put -> kvmppc_mpic_put
>> >> >> >
>> >> >> >
>> >> >> [...]
>> >> >> > +void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
>> >> >> > +
>> >> >> > int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
>> >> >> >  struct kvm_config_tlb *cfg);
>> >> >> > int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
>> >> >> > diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
>> >> >> > index 63c67ec..a87139b 100644
>> >> >> > --- a/arch/powerpc/kvm/Kconfig
>> >> >> > +++ b/arch/powerpc/kvm/Kconfig
>> >> >> > @@ -151,6 +151,11 @@ config KVM_E500MC
>> >> >> >
>> >> >> >  If unsure, say N.
>> >> >> >
>> >> >> > +config KVM_MPIC
>> >> >> > +bool "KVM in-kernel MPIC emulation"
>> >> >> > +depends on KVM
>> >> >> This should probably depend on FSL KVM for now, until someone adds 
>> >> >> support for other MPIC revisions.
>> >> >
>> >> > I don't see a symbol specifically for "FSL KVM".  What part of the MPIC 
>> >> > code depends on booke or any FSL-specific code?
>> >> You support only FSL mpic device IDs :). So if someone on book3s goes 
>> >> along and sees this, he'd think "yes, I want an in-kernel MPIC", enables 
>> >> the option and wastes space.
>> >
>> > "Would this waste space" is not generally the criteria for kconfig 
>> > dependencies.  Who is the kernel to get in the way of someone that wants 
>> > an FSL MPIC on a 4xx VM? :-)
>> >
>> > And again, there's no symbol for FSL KVM -- I'd have to use a list that 
>> > could get out of date.  And it would reduce build testing in 
>> > allyesconfig-type configs.
>> Ok, please indicate compatibility limitations in the Kconfig description at 
>> least then.
> 
> OK -- not really a "compatibility" limitation so much as what models are 
> supported.
> 
> Note that mpc86xx has a 74xx-derived core, but also has an FSL MPIC...
> 
> Is 74xx/e600 supported by book3s_pr?  Can't tell from the kconfig text. :-)

On book3s_pr we don't have a good compatibility check mechanism in place. 
That's really suboptimal.

What I'm saying is that Kconfig should say "In-kernel emulation of FSL MPIC 2.0 
and FSL MPIC 4.2 interrupt controllers. Say Y here if you plan to run KVM on an 
FSL system". That'd be in line with what you can actually enable using the 
ioctls and leaves the decision whether that's a good thing to the user.

Code-wise there shouldn't be any dependency on host or guest architecture of 
course.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls

2013-04-04 Thread Alexander Graf

On 04.04.2013, at 07:37, Paul Mackerras wrote:

> On Thu, Mar 21, 2013 at 09:52:16AM +0100, Alexander Graf wrote:
>>> +/* Platform specific hcalls, used by KVM */
>>> +#define H_RTAS 0xf000
>> 
>> How about you define a different hcall ID for this? Then QEMU would
>> create its "rtas entry blob" such that KVM-routed RTAS handling goes
>> to KVM directly.
> 
> QEMU can still do that, and I don't see that it would change the
> kernel side if it did.  We would still have to have agreement between
> the kernel and userspace as to what the hcall number for invoking the
> in-kernel RTAS calls was, and the kernel would still have to keep a
> list of token numbers and how they correspond to the functions it
> provides.  The only thing different would be that the in-kernel RTAS
> hcall could return to the guest if it didn't recognize the token
> number, rather than pushing the problem up to userspace.  However,
> that wouldn't make the code any simpler, and it isn't a situation
> where performance is an issue.
> 
> Do you see some kernel-side improvements or simplifications from your
> suggestion that I'm missing?  Remember, the guest gets the token
> numbers from the device tree (properties under the /rtas node), so
> they are under the control of userspace/QEMU.

The code flow with this patch:

  

  foreach (override in overrides)
ioctl(OVERRIDE_RTAS, ...);

  

  switch (hcall_id) {
  case QEMU_RTAS_ID:
foreach (override in kvm_overrides) {
  int rtas_id = ...;
  if (override.rtas_id == rtas_id) {
handle_rtas();
handled = true;
  }
}
if (!handled)
  pass_to_qemu();
break;
  default:
pass_to_qemu();
break
  }

What I'm suggesting:

  

  nothing from KVM's point of view

  

  switch (hcall_id) {
  case KVM_RTAS_ID:
handle_rtas();
break;
  default:
pass_to_qemu();
break;
  }


Which one looks easier and less error prone to you? :)

Speaking of which, how does user space know that the kernel actually supports a 
specific RTAS token?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 1/6] kvm: add device control API

2013-04-04 Thread Gleb Natapov
On Wed, Apr 03, 2013 at 07:39:52PM +0200, Alexander Graf wrote:
> 
> On 03.04.2013, at 19:37, Scott Wood wrote:
> 
> > On 04/03/2013 08:22:37 AM, Alexander Graf wrote:
> >> On 03.04.2013, at 04:17, Paul Mackerras wrote:
> >> > On Tue, Apr 02, 2013 at 08:19:56PM -0500, Scott Wood wrote:
> >> >> On 04/02/2013 08:02:39 PM, Paul Mackerras wrote:
> >> >>> On Mon, Apr 01, 2013 at 05:47:48PM -0500, Scott Wood wrote:
> >>  +4.79 KVM_CREATE_DEVICE
> >>  +
> >>  +Capability: KVM_CAP_DEVICE_CTRL
> >> >>>
> >> >>> I notice this patch doesn't add this capability;
> >> >>
> >> >> Yes, it does (see below).
> >> >>
> >> >>> you add it in a later patch.
> >> >>
> >> >> Maybe you're thinking of KVM_CAP_IRQ_MPIC?
> >> >
> >> > No, I was referring to the addition to kvm_dev_ioctl_check_extension()
> >> > of a KVM_CAP_DEVICE_CTRL case.  Since this patch adds the code to handle
> >> > KVM_CREATE_DEVICE ioctl it should also add the code to return 1 if
> >> > userspace queries the KVM_CAP_DEVICE_CTRL capability.
> >> >
> >>  +/* ioctl for vm fd */
> >>  +#define KVM_CREATE_DEVICE  _IOWR(KVMIO,  0xe0, struct
> >> >>> kvm_create_device)
> >> >>>
> >> >>> This define should go with the other VM ioctls, otherwise the next
> >> >>> person to add a VM ioctl will probably miss it and reuse the 0xe0
> >> >>> code.
> >> >>
> >> >> That's actually why I moved it to a new section, with device control
> >> >> ioctls getting their own range, as the legacy "device model" and
> >> >> some other things did.  0xe0 is not the next ioctl that would be
> >> >> used for either vm or vcpu.  The ioctl numbering is actually already
> >> >> a mess, with sometimes care being taken to keep vcpu and vm ioctls
> >> >> from overlapping, but on other places overlapping does happen.  I'm
> >> >> not sure what exactly I should do here.
> >> >
> >> > Well, even if you are using a new range, I still think that
> >> > KVM_CREATE_DEVICE, being a VM ioctl, should go next to the other VM
> >> > ioctls.  I guess it's ultimately up to the maintainers.
> >> I agree. Things get confusing for VM ioctls otherwise.
> > 
> > Things are already confusing. :-)
> > 
> > I can move KVM_CREATE_DEVICE back with the other VM ioctls, but what number 
> > should it get?  The last VM ioctl is 0xab (which is also a VCPU ioctl).  
> > Should I use 0xac (which is also a VCPU ioctl)?  Or should I try to avoid a 
> > conflict, as was sometimes done in the past -- in which case, which number 
> > should I use?
> 
> Gleb, Marcelo?
> 
> 
Yes, ioctls number assignments are a little bit of a mess :( There are 14
numbers that are used twice and some of them are used twice for the same
type of fd, but with different direction bits and there is one, 0x9b,
that is used twice and have the same IO direction and even, potentially,
same parameter size (just checked it has the same parameter size)! Good that
the second use is only on IA64 which is almost dead.

Although ioctls numbers between different types of fds should not conflict
lets try and make them unique. Scott, please put KVM_CREATE_DEVICE near
device model ioctls and give it a unique number.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 1/6] kvm: add device control API

2013-04-04 Thread Gleb Natapov
On Tue, Apr 02, 2013 at 08:57:48PM -0500, Scott Wood wrote:
> Currently, devices that are emulated inside KVM are configured in a
> hardcoded manner based on an assumption that any given architecture
> only has one way to do it.  If there's any need to access device state,
> it is done through inflexible one-purpose-only IOCTLs (e.g.
> KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
> cumbersome and depletes a limited numberspace.
> 
> This API provides a mechanism to instantiate a device of a certain
> type, returning an ID that can be used to set/get attributes of the
> device.  Attributes may include configuration parameters (e.g.
> register base address), device state, operational commands, etc.  It
> is similar to the ONE_REG API, except that it acts on devices rather
> than vcpus.
> 
> Both device types and individual attributes can be tested without having
> to create the device or get/set the attribute, without the need for
> separately managing enumerated capabilities.
> 
> Signed-off-by: Scott Wood 
> ---
> v3: remove some changes that were merged into this patch by accident,
> and fix the error documentation for KVM_CREATE_DEVICE.
> 
> NOTE: I had some difficulty figuring out what ioctl numbers I should
> assign...  it seems that at one point care was taken to keep vcpu and
> vm ioctls separate, but some overlap exists now (despite not exhausing
> the ioctl space).  Some of that was my fault, but not all of it. :-)
> I moved to a new ioctl range for device control -- please let me know
> if there's something else you'd prefer I do.
> ---
>  Documentation/virtual/kvm/api.txt|   70 
> ++
>  Documentation/virtual/kvm/devices/README |1 +
>  include/uapi/linux/kvm.h |   27 
>  virt/kvm/kvm_main.c  |   31 +
>  4 files changed, 129 insertions(+)
>  create mode 100644 Documentation/virtual/kvm/devices/README
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index 976eb65..d52f3f9 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2173,6 +2173,76 @@ header; first `n_valid' valid entries with contents 
> from the data
>  written, then `n_invalid' invalid entries, invalidating any previously
>  valid entries found.
>  
> +4.79 KVM_CREATE_DEVICE
> +
> +Capability: KVM_CAP_DEVICE_CTRL
> +Type: vm ioctl
> +Parameters: struct kvm_create_device (in/out)
> +Returns: 0 on success, -1 on error
> +Errors:
> +  ENODEV: The device type is unknown or unsupported
> +  EEXIST: Device already created, and this type of device may not
> +  be instantiated multiple times
> +
> +  Other error conditions may be defined by individual device types or
> +  have their standard meanings.
> +
> +Creates an emulated device in the kernel.  The file descriptor returned
> +in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
> +
> +If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
> +device type is supported (not necessarily whether it can be created
> +in the current vm).
> +
> +Individual devices should not define flags.  Attributes should be used
> +for specifying any behavior that is not implied by the device type
> +number.
> +
> +struct kvm_create_device {
> + __u32   type;   /* in: KVM_DEV_TYPE_xxx */
> + __u32   fd; /* out: device handle */
> + __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
> +};
> +
> +4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
> +
> +Capability: KVM_CAP_DEVICE_CTRL
> +Type: device ioctl
> +Parameters: struct kvm_device_attr
> +Returns: 0 on success, -1 on error
> +Errors:
> +  ENXIO:  The group or attribute is unknown/unsupported for this device
> +  EPERM:  The attribute cannot (currently) be accessed this way
> +  (e.g. read-only attribute, or attribute that only makes
> +  sense when the device is in a different state)
> +
> +  Other error conditions may be defined by individual device types.
> +
> +Gets/sets a specified piece of device configuration and/or state.  The
> +semantics are device-specific.  See individual device documentation in
> +the "devices" directory.  As with ONE_REG, the size of the data
> +transferred is defined by the particular attribute.
> +
> +struct kvm_device_attr {
> + __u32   flags;  /* no flags currently defined */
> + __u32   group;  /* device-defined */
> + __u64   attr;   /* group-defined */
> + __u64   addr;   /* userspace address of attr data */
> +};
> +
Since now each device has its own fd is it an advantage to enforce
common interface between different devices? If we do so though why
not handle file creation, ioctl and file descriptor lifetime in the
common code. Common code will have "struct kvm_device" with "struct
kvm_device_arch" and "struct kvm_device_ops" members. Instead of
kvm_mpic_ioctl there will be kvm_device_ioctl which will despatch ioctls
to a devi

Re: [RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-04 Thread Alexander Graf

On 03.04.2013, at 03:57, Scott Wood wrote:

> Enabling this capability connects the vcpu to the designated in-kernel
> MPIC.  Using explicit connections between vcpus and irqchips allows
> for flexibility, but the main benefit at the moment is that it
> simplifies the code -- KVM doesn't need vm-global state to remember
> which MPIC object is associated with this vm, and it doesn't need to
> care about ordering between irqchip creation and vcpu creation.
> 
> Signed-off-by: Scott Wood 
> ---
> Documentation/virtual/kvm/api.txt   |8 ++
> arch/powerpc/include/asm/kvm_host.h |8 ++
> arch/powerpc/include/asm/kvm_ppc.h  |2 ++
> arch/powerpc/kvm/booke.c|4 ++-
> arch/powerpc/kvm/mpic.c |   49 +++
> arch/powerpc/kvm/powerpc.c  |   26 +++
> include/uapi/linux/kvm.h|1 +
> 7 files changed, 92 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index d52f3f9..4c326ae 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2728,3 +2728,11 @@ to receive the topmost interrupt vector.
> When disabled (args[0] == 0), behavior is as if this facility is unsupported.
> 
> When this capability is enabled, KVM_EXIT_EPR can occur.
> +
> +6.6 KVM_CAP_IRQ_MPIC
> +
> +Architectures: ppc
> +Parameters: args[0] is the MPIC device fd
> +args[1] is the MPIC CPU number for this vcpu
> +
> +This capability connects the vcpu to an in-kernel MPIC device.
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 7e7aef9..2a2e235 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -375,6 +375,11 @@ struct kvmppc_booke_debug_reg {
>   u64 dac[KVMPPC_BOOKE_MAX_DAC];
> };
> 
> +#define KVMPPC_IRQ_DEFAULT   0
> +#define KVMPPC_IRQ_MPIC  1
> +
> +struct openpic;
> +
> struct kvm_vcpu_arch {
>   ulong host_stack;
>   u32 host_pid;
> @@ -554,6 +559,9 @@ struct kvm_vcpu_arch {
>   unsigned long magic_page_pa; /* phys addr to map the magic page to */
>   unsigned long magic_page_ea; /* effect. addr to map the magic page to */
> 
> + int irq_type;   /* one of KVM_IRQ_* */
> + struct openpic *mpic;   /* KVM_IRQ_MPIC */
> +
> #ifdef CONFIG_KVM_BOOK3S_64_HV
>   struct kvm_vcpu_arch_shared shregs;
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index 3b63b97..f54707f 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -276,6 +276,8 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, 
> u32 epr)
> }
> 
> void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
> +int kvmppc_mpic_connect_vcpu(struct file *mpic_filp, struct kvm_vcpu *vcpu,
> +  u32 cpu);
> 
> int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
> struct kvm_config_tlb *cfg);
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index cddc6b3..7d00222 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -430,8 +430,10 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
> *vcpu,
>   if (update_epr == true) {
>   if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
>   kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
> - else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL)
> + else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) {
> + BUG_ON(vcpu->arch.irq_type != KVMPPC_IRQ_MPIC);
>   kvmppc_mpic_set_epr(vcpu);
> + }
>   }
> 
>   new_msr &= msr_mask;
> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
> index 8cda2fa..caffe3b 100644
> --- a/arch/powerpc/kvm/mpic.c
> +++ b/arch/powerpc/kvm/mpic.c
> @@ -1159,7 +1159,7 @@ static uint32_t openpic_iack(struct openpic *opp, 
> struct irq_dest *dst,
> 
> void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
> {
> - struct openpic *opp = vcpu->arch.irqchip_priv;
> + struct openpic *opp = vcpu->arch.mpic;
>   int cpu = vcpu->vcpu_id;
>   unsigned long flags;
> 
> @@ -1442,10 +1442,10 @@ static void map_mmio(struct openpic *opp)
> 
> static void unmap_mmio(struct openpic *opp)
> {
> - BUG_ON(opp->mmio_mapped);
> - opp->mmio_mapped = false;
> -
> - kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
> + if (opp->mmio_mapped) {
> + opp->mmio_mapped = false;
> + kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS, &opp->mmio);
> + }
> }
> 
> static int set_base_addr(struct openpic *opp, struct kvm_device_attr *attr)
> @@ -1681,6 +1681,45 @@ static const struct file_operations kvm_mpic_fops = {
>   .release = kvm_mpic_release,
> };
> 

Re: [PATCH] bookehv: Handle debug exception on guest exit

2013-04-04 Thread Alexander Graf

On 20.03.2013, at 18:45, Bharat Bhushan wrote:

> EPCR.DUVD controls whether the debug events can come in
> hypervisor mode or not. When KVM guest is using the debug
> resource then we do not want debug events to be captured
> in guest entry/exit path. So we set EPCR.DUVD when entering
> and clears EPCR.DUVD when exiting from guest.
> 
> Debug instruction complete is a post-completion debug
> exception but debug event gets posted on the basis of MSR
> before the instruction is executed. Now if the instruction
> switches the context from guest mode (MSR.GS = 1) to hypervisor
> mode (MSR.GS = 0) then the xSRR0 points to first instruction of
> KVM handler and xSRR1 points that MSR.GS is clear
> (hypervisor context). Now as xSRR1.GS is used to decide whether
> KVM handler will be invoked to handle the exception or host
> host kernel debug handler will be invoked to handle the exception.
> This leads to host kernel debug handler handling the exception
> which should either be handled by KVM.
> 
> This is tested on e500mc in 32 bit mode
> 
> Signed-off-by: Bharat Bhushan 
> ---
> v0:
> - Do not apply this change for debug_crit as we do not know those chips have 
> issue or not.
> - corrected 64bit case branching
> 
> arch/powerpc/kernel/exceptions-64e.S |   29 -
> arch/powerpc/kernel/head_booke.h |   26 ++
> 2 files changed, 54 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64e.S 
> b/arch/powerpc/kernel/exceptions-64e.S
> index 4684e33..8b26294 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -516,6 +516,33 @@ kernel_dbg_exc:
>   andis.  r15,r14,DBSR_IC@h
>   beq+1f
> 
> +#ifdef CONFIG_KVM_BOOKE_HV
> + /*
> +  * EPCR.DUVD controls whether the debug events can come in
> +  * hypervisor mode or not. When KVM guest is using the debug
> +  * resource then we do not want debug events to be captured
> +  * in guest entry/exit path. So we set EPCR.DUVD when entering
> +  * and clears EPCR.DUVD when exiting from guest.
> +  * Debug instruction complete is a post-completion debug
> +  * exception but debug event gets posted on the basis of MSR
> +  * before the instruction is executed. Now if the instruction
> +  * switches the context from guest mode (MSR.GS = 1) to hypervisor
> +  * mode (MSR.GS = 0) then the xSRR0 points to first instruction of

Can't we just execute that code path with MSR.DE=0?


Alex

> +  * KVM handler and xSRR1 points that MSR.GS is clear
> +  * (hypervisor context). Now as xSRR1.GS is used to decide whether
> +  * KVM handler will be invoked to handle the exception or host
> +  * host kernel debug handler will be invoked to handle the exception.
> +  * This leads to host kernel debug handler handling the exception
> +  * which should either be handled by KVM.
> +  */
> + mfspr   r10, SPRN_EPCR
> + andis.  r10,r10,SPRN_EPCR_DUVD@h
> + beq+2f
> +
> + andis.  r10,r9,MSR_GS@h
> + beq+3f
> +2:
> +#endif
>   LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
>   LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
>   cmpld   cr0,r10,r14
> @@ -523,7 +550,7 @@ kernel_dbg_exc:
>   blt+cr0,1f
>   bge+cr1,1f
> 
> - /* here it looks like we got an inappropriate debug exception. */
> +3:   /* here it looks like we got an inappropriate debug exception. */
>   lis r14,DBSR_IC@h   /* clear the IC event */
>   rlwinm  r11,r11,0,~MSR_DE   /* clear DE in the DSRR1 value */
>   mtspr   SPRN_DBSR,r14
> diff --git a/arch/powerpc/kernel/head_booke.h 
> b/arch/powerpc/kernel/head_booke.h
> index 5f051ee..edc6a3b 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -285,7 +285,33 @@ label:
>   mfspr   r10,SPRN_DBSR;  /* check single-step/branch taken */  \
>   andis.  r10,r10,(DBSR_IC|DBSR_BT)@h;  \
>   beq+2f;   \
> +#ifdef CONFIG_KVM_BOOKE_HV \
> + /*\
> +  * EPCR.DUVD controls whether the debug events can come in\
> +  * hypervisor mode or not. When KVM guest is using the debug  \
> +  * resource then we do not want debug events to be captured   \
> +  * in guest entry/exit path. So we set EPCR.DUVD when entering\
> +  * and clears EPCR.DUVD when exiting from guest.  \
> +  * Debug instruction complete is a post-completion debug  \
> +  * exception but debug event gets posted on the basis of MSR  \
> +  * before the instruction is executed. Now if the instruction \
> +  * switches the context from guest mode (MSR.GS = 1) to hypervisor

RE: [PATCH 1/7 v2] KVM: PPC: e500: Expose MMU registers via ONE_REG

2013-04-04 Thread Caraman Mihai Claudiu-B02008
> -Original Message-
> From: Wood Scott-B07421
> Sent: Wednesday, March 27, 2013 12:43 AM
> To: Caraman Mihai Claudiu-B02008
> Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de
> Subject: Re: [PATCH 1/7 v2] KVM: PPC: e500: Expose MMU registers via
> ONE_REG
> 
> On 03/26/2013 05:05:06 PM, Mihai Caraman wrote:
> > +int kvmppc_set_one_reg_e500_tlb(struct kvm_vcpu *vcpu, u64 id,
> > +  union kvmppc_one_reg *val)
> > +{
> > +   int r = 0;
> > +   long int i;
> > +
> > +   switch (id) {
> > +   case KVM_REG_PPC_MAS0:
> > +   vcpu->arch.shared->mas0 = set_reg_val(id, *val);
> > +   break;
> > +   case KVM_REG_PPC_MAS1:
> > +   vcpu->arch.shared->mas1 = set_reg_val(id, *val);
> > +   break;
> > +   case KVM_REG_PPC_MAS2:
> > +   vcpu->arch.shared->mas2 = set_reg_val(id, *val);
> > +   break;
> > +   case KVM_REG_PPC_MAS7_3:
> > +   vcpu->arch.shared->mas7_3 = set_reg_val(id, *val);
> > +   break;
> > +   case KVM_REG_PPC_MAS4:
> > +   vcpu->arch.shared->mas4 = set_reg_val(id, *val);
> > +   break;
> > +   case KVM_REG_PPC_MAS6:
> > +   vcpu->arch.shared->mas6 = set_reg_val(id, *val);
> > +   break;
> > +   /* Only allow MMU registers to be set to the config supported
> > by KVM */
> > +   case KVM_REG_PPC_MMUCFG: {
> > +   if (set_reg_val(id, *val) != vcpu->arch.mmucfg)
> > +   r = -EINVAL;
> > +   break;
> > +   }
> > +   case KVM_REG_PPC_TLB0CFG:
> > +   case KVM_REG_PPC_TLB1CFG:
> > +   case KVM_REG_PPC_TLB2CFG:
> > +   case KVM_REG_PPC_TLB3CFG: {
> > +   /* MMU geometry (N_ENTRY/ASSOC) can be set only using
> > SW_TLB */
> > +   i = id - KVM_REG_PPC_TLB0CFG;
> > +   if (set_reg_val(id, *val) != vcpu->arch.tlbcfg[i])
> > +   r = -EINVAL;
> > +   break;
> > +   }
> 
> Am I the only one that finds the set_reg_val/get_reg_val naming
> confusing?  At first glance it looks like it sets TLBnCFG and then
> later tests whether it should have. :-P
> 
> Functions should be named for what they do, not for what context you
> use them in.

I also found the existing code difficult to read. If we stick to reg and
val varible names what about val_to_reg/reg_to_val.
 
-Mike

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] bookehv: Handle debug exception on guest exit

2013-04-04 Thread Bhushan Bharat-R65777


> -Original Message-
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, April 04, 2013 6:55 PM
> To: Bhushan Bharat-R65777
> Cc: linuxppc-...@lists.ozlabs.org; k...@vger.kernel.org; 
> kvm-ppc@vger.kernel.org;
> Wood Scott-B07421; Bhushan Bharat-R65777
> Subject: Re: [PATCH] bookehv: Handle debug exception on guest exit
> 
> 
> On 20.03.2013, at 18:45, Bharat Bhushan wrote:
> 
> > EPCR.DUVD controls whether the debug events can come in hypervisor
> > mode or not. When KVM guest is using the debug resource then we do not
> > want debug events to be captured in guest entry/exit path. So we set
> > EPCR.DUVD when entering and clears EPCR.DUVD when exiting from guest.
> >
> > Debug instruction complete is a post-completion debug exception but
> > debug event gets posted on the basis of MSR before the instruction is
> > executed. Now if the instruction switches the context from guest mode
> > (MSR.GS = 1) to hypervisor mode (MSR.GS = 0) then the xSRR0 points to
> > first instruction of KVM handler and xSRR1 points that MSR.GS is clear
> > (hypervisor context). Now as xSRR1.GS is used to decide whether KVM
> > handler will be invoked to handle the exception or host host kernel
> > debug handler will be invoked to handle the exception.
> > This leads to host kernel debug handler handling the exception which
> > should either be handled by KVM.
> >
> > This is tested on e500mc in 32 bit mode
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> > v0:
> > - Do not apply this change for debug_crit as we do not know those chips have
> issue or not.
> > - corrected 64bit case branching
> >
> > arch/powerpc/kernel/exceptions-64e.S |   29 -
> > arch/powerpc/kernel/head_booke.h |   26 ++
> > 2 files changed, 54 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/exceptions-64e.S
> > b/arch/powerpc/kernel/exceptions-64e.S
> > index 4684e33..8b26294 100644
> > --- a/arch/powerpc/kernel/exceptions-64e.S
> > +++ b/arch/powerpc/kernel/exceptions-64e.S
> > @@ -516,6 +516,33 @@ kernel_dbg_exc:
> > andis.  r15,r14,DBSR_IC@h
> > beq+1f
> >
> > +#ifdef CONFIG_KVM_BOOKE_HV
> > +   /*
> > +* EPCR.DUVD controls whether the debug events can come in
> > +* hypervisor mode or not. When KVM guest is using the debug
> > +* resource then we do not want debug events to be captured
> > +* in guest entry/exit path. So we set EPCR.DUVD when entering
> > +* and clears EPCR.DUVD when exiting from guest.
> > +* Debug instruction complete is a post-completion debug
> > +* exception but debug event gets posted on the basis of MSR
> > +* before the instruction is executed. Now if the instruction
> > +* switches the context from guest mode (MSR.GS = 1) to hypervisor
> > +* mode (MSR.GS = 0) then the xSRR0 points to first instruction of
> 
> Can't we just execute that code path with MSR.DE=0?

Single stepping uses DBCR0.IC (instruction complete).
Can you describe how MSR.DE = 0 will work?

> 
> 
> Alex
> 
> > +* KVM handler and xSRR1 points that MSR.GS is clear
> > +* (hypervisor context). Now as xSRR1.GS is used to decide whether
> > +* KVM handler will be invoked to handle the exception or host
> > +* host kernel debug handler will be invoked to handle the exception.
> > +* This leads to host kernel debug handler handling the exception
> > +* which should either be handled by KVM.
> > +*/
> > +   mfspr   r10, SPRN_EPCR
> > +   andis.  r10,r10,SPRN_EPCR_DUVD@h
> > +   beq+2f
> > +
> > +   andis.  r10,r9,MSR_GS@h
> > +   beq+3f
> > +2:
> > +#endif
> > LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
> > LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
> > cmpld   cr0,r10,r14
> > @@ -523,7 +550,7 @@ kernel_dbg_exc:
> > blt+cr0,1f
> > bge+cr1,1f
> >
> > -   /* here it looks like we got an inappropriate debug exception. */
> > +3: /* here it looks like we got an inappropriate debug exception. */
> > lis r14,DBSR_IC@h   /* clear the IC event */
> > rlwinm  r11,r11,0,~MSR_DE   /* clear DE in the DSRR1 value */
> > mtspr   SPRN_DBSR,r14
> > diff --git a/arch/powerpc/kernel/head_booke.h
> > b/arch/powerpc/kernel/head_booke.h
> > index 5f051ee..edc6a3b 100644
> > --- a/arch/powerpc/kernel/head_booke.h
> > +++ b/arch/powerpc/kernel/head_booke.h
> > @@ -285,7 +285,33 @@ label:
> > mfspr   r10,SPRN_DBSR;  /* check single-step/branch taken */  \
> > andis.  r10,r10,(DBSR_IC|DBSR_BT)@h;  \
> > beq+2f;   \
> > +#ifdef CONFIG_KVM_BOOKE_HV   \
> > +   /*\
> > +* EPCR.DUVD controls whether the debug events can come in\
> > +* hypervisor mode or not. When KVM guest is using the debug  \
> > +

Re: [RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-04 Thread Scott Wood

On 04/04/2013 07:54:20 AM, Alexander Graf wrote:


On 03.04.2013, at 03:57, Scott Wood wrote:

> +  if (opp->mpic_mode_mask == GCR_MODE_PROXY)

Shouldn't this be an &?


The way the mode field was originally documented was a two-bit field,  
where 0b11 was external proxy, and 0b10 was reserved.  If we use & it  
would have to be:


if ((opp->mpic_mode_mask & GCR_MODE_PROXY) == GCR_MODE_PROXY)
...

Simply testing "opp->mpic_mode_mask & GCR_MODE_PROXY" would return true  
in the case of GCR_MODE_MIXED.


In MPIC 4.3 external proxy is defined as a separate bit (GCR[CI]) that  
is ignored if the mixed-mode bit (GCR[M]) is not set, which makes it a  
bit more legitimate to view it as a bitmap.  Still, I doubt we'll see  
new mode bits.



> @@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
>tasklet_kill(&vcpu->arch.tasklet);
>
>kvmppc_remove_vcpu_debugfs(vcpu);
> +
> +  switch (vcpu->arch.irq_type) {
> +  case KVMPPC_IRQ_MPIC:
> +  kvmppc_mpic_put(vcpu->arch.mpic);

This doesn't tell the MPIC that this exact CPU is getting killed.  
What if we hotplug remove just a single CPU? Don't we have to  
deregister the CPU with the MPIC?


If we ever support hot vcpu removal, yes.  We'd probably need some MPIC  
code changes to accommodate that, and we wouldn't currently have a way  
to test it, so I'd rather make it obviously not supported for now.


-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-04 Thread Alexander Graf


Am 04.04.2013 um 20:41 schrieb Scott Wood :

> On 04/04/2013 07:54:20 AM, Alexander Graf wrote:
>> On 03.04.2013, at 03:57, Scott Wood wrote:
>> > +if (opp->mpic_mode_mask == GCR_MODE_PROXY)
>> Shouldn't this be an &?
> 
> The way the mode field was originally documented was a two-bit field, where 
> 0b11 was external proxy, and 0b10 was reserved.  If we use & it would have to 
> be:
> 
>if ((opp->mpic_mode_mask & GCR_MODE_PROXY) == GCR_MODE_PROXY)
>...
> 
> Simply testing "opp->mpic_mode_mask & GCR_MODE_PROXY" would return true in 
> the case of GCR_MODE_MIXED.
> 
> In MPIC 4.3 external proxy is defined as a separate bit (GCR[CI]) that is 
> ignored if the mixed-mode bit (GCR[M]) is not set, which makes it a bit more 
> legitimate to view it as a bitmap.  Still, I doubt we'll see new mode bits.

Ok, please add a comment about this here then :).

> 
>> > @@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
>> >tasklet_kill(&vcpu->arch.tasklet);
>> >
>> >kvmppc_remove_vcpu_debugfs(vcpu);
>> > +
>> > +switch (vcpu->arch.irq_type) {
>> > +case KVMPPC_IRQ_MPIC:
>> > +kvmppc_mpic_put(vcpu->arch.mpic);
>> This doesn't tell the MPIC that this exact CPU is getting killed. What if we 
>> hotplug remove just a single CPU? Don't we have to deregister the CPU with 
>> the MPIC?
> 
> If we ever support hot vcpu removal, yes.  We'd probably need some MPIC code 
> changes to accommodate that, and we wouldn't currently have a way to test it, 
> so I'd rather make it obviously not supported for now.

Is there any way to break heavily if user space attempts this?

Alex

> 
> -Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-04 Thread Scott Wood

On 04/04/2013 05:30:05 PM, Alexander Graf wrote:



Am 04.04.2013 um 20:41 schrieb Scott Wood :

> On 04/04/2013 07:54:20 AM, Alexander Graf wrote:
>> On 03.04.2013, at 03:57, Scott Wood wrote:
>> > +if (opp->mpic_mode_mask == GCR_MODE_PROXY)
>> Shouldn't this be an &?
>
> The way the mode field was originally documented was a two-bit  
field, where 0b11 was external proxy, and 0b10 was reserved.  If we  
use & it would have to be:

>
>if ((opp->mpic_mode_mask & GCR_MODE_PROXY) == GCR_MODE_PROXY)
>...
>
> Simply testing "opp->mpic_mode_mask & GCR_MODE_PROXY" would return  
true in the case of GCR_MODE_MIXED.

>
> In MPIC 4.3 external proxy is defined as a separate bit (GCR[CI])  
that is ignored if the mixed-mode bit (GCR[M]) is not set, which  
makes it a bit more legitimate to view it as a bitmap.  Still, I  
doubt we'll see new mode bits.


Ok, please add a comment about this here then :).


What sort of comment would you like?  Or do you want me to use the "(x  
& y) == y" version?


>> > @@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu  
*vcpu)

>> >tasklet_kill(&vcpu->arch.tasklet);
>> >
>> >kvmppc_remove_vcpu_debugfs(vcpu);
>> > +
>> > +switch (vcpu->arch.irq_type) {
>> > +case KVMPPC_IRQ_MPIC:
>> > +kvmppc_mpic_put(vcpu->arch.mpic);
>> This doesn't tell the MPIC that this exact CPU is getting killed.  
What if we hotplug remove just a single CPU? Don't we have to  
deregister the CPU with the MPIC?

>
> If we ever support hot vcpu removal, yes.  We'd probably need some  
MPIC code changes to accommodate that, and we wouldn't currently have  
a way to test it, so I'd rather make it obviously not supported for  
now.


Is there any way to break heavily if user space attempts this?


Is there any way for userspace to request this currently?  They can  
close the vcpu fd, but the vcpu won't actually be destroyed until the  
vm goes down.


-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls

2013-04-04 Thread Paul Mackerras
On Thu, Apr 04, 2013 at 11:49:55AM +0200, Alexander Graf wrote:
> 
> On 04.04.2013, at 07:37, Paul Mackerras wrote:
> 
> > On Thu, Mar 21, 2013 at 09:52:16AM +0100, Alexander Graf wrote:
> >>> +/* Platform specific hcalls, used by KVM */
> >>> +#define H_RTAS   0xf000
> >> 
> >> How about you define a different hcall ID for this? Then QEMU would
> >> create its "rtas entry blob" such that KVM-routed RTAS handling goes
> >> to KVM directly.
> > 
> > QEMU can still do that, and I don't see that it would change the
> > kernel side if it did.  We would still have to have agreement between
> > the kernel and userspace as to what the hcall number for invoking the
> > in-kernel RTAS calls was, and the kernel would still have to keep a
> > list of token numbers and how they correspond to the functions it
> > provides.  The only thing different would be that the in-kernel RTAS
> > hcall could return to the guest if it didn't recognize the token
> > number, rather than pushing the problem up to userspace.  However,
> > that wouldn't make the code any simpler, and it isn't a situation
> > where performance is an issue.
> > 
> > Do you see some kernel-side improvements or simplifications from your
> > suggestion that I'm missing?  Remember, the guest gets the token
> > numbers from the device tree (properties under the /rtas node), so
> > they are under the control of userspace/QEMU.
> 
> The code flow with this patch:
> 
>   
> 
>   foreach (override in overrides)
> ioctl(OVERRIDE_RTAS, ...);
> 
>   
> 
>   switch (hcall_id) {
>   case QEMU_RTAS_ID:
> foreach (override in kvm_overrides) {
>   int rtas_id = ...;
>   if (override.rtas_id == rtas_id) {
> handle_rtas();

Actually this is more like: override.handler();

> handled = true;
>   }
> }
> if (!handled)
>   pass_to_qemu();
> break;
>   default:
> pass_to_qemu();
> break
>   }
> 
> What I'm suggesting:
> 
>   
> 
>   nothing from KVM's point of view

Actually, this can't be "nothing".

The way the RTAS calls work is that there is a name and a "token"
(32-bit integer value) for each RTAS call.  The tokens have to be
unique for each different name.  Userspace puts the names and tokens
in the device tree under the /rtas node (a set of properties where the
property name is the RTAS function name and the property value is the
token).  The guest looks up the token for each RTAS function it wants
to use, and passes the token in the argument buffer for the RTAS call.

This means that userspace has to know the names and tokens for all
supported RTAS functions, both the ones implemented in the kernel and
the ones implemented in userspace.

Also, the token numbers are pretty arbitrary, and the token numbers
for the kernel-implemented RTAS functions could be chosen by userspace
or by the kernel.  If they're chosen by the kernel, then userspace
needs a way to discover them (so it can put them in the device tree),
and also has to avoid choosing any token numbers for its functions
that collide with a kernel-chosen token.  If userspace chooses the
token numbers, it has to tell the kernel what token numbers it has
chosen for the kernel-implemented RTAS functions.  We chose the latter
since it gives userspace more control.

So this  code has to be either (your suggestion):

foreach RTAS function possibly implemented in kernel {
query kernel token for function, by name
if that gives an error, mark function as needing to be
implemented in userspace
}
(userspace) allocate tokens for remaining functions,
avoiding collisions with kernel-chosen tokens

or else it is (my suggestion):

(userspace) allocate tokens for all RTAS functions
foreach RTAS function possibly implemented in kernel {
tell kernel the (name, token) correspondence
}

>   
> 
>   switch (hcall_id) {
>   case KVM_RTAS_ID:
> handle_rtas();

Here, you've compressed details that you expanded in your pseudo-code
above, making this a less than fair comparison.  This handle_rtas()
function has to fetch the token and branch out to the appropriate
handler routine.  Whether that's a switch statement or a loop over
registered handlers doesn't make all that much difference.

> break;
>   default:
> pass_to_qemu();
> break;
>   }
> 
> 
> Which one looks easier and less error prone to you? :)
> 
> Speaking of which, how does user space know that the kernel actually
> supports a specific RTAS token? 

It's really the names that are more important, the tokens are pretty
arbitrary.  In my scheme, userspace does a KVM_PPC_RTAS_DEFINE_TOKEN
ioctl giving the name and the (userspace-chosen) token, which gets an
error if the kernel doesn't recognize the name.  In your scheme, there
would have to be an equivalent ioctl to query the (kernel-chosen)
token for a given name, which once again would return an error if the
kernel doesn't recognize the name.  Either way the kernel 

Re: [RFC PATCH v3 5/6] kvm/ppc/mpic: in-kernel MPIC emulation

2013-04-04 Thread Scott Wood

On 04/04/2013 12:59:02 AM, Gleb Natapov wrote:

On Wed, Apr 03, 2013 at 03:58:04PM -0500, Scott Wood wrote:
> KVM_DEV_MPIC_* could go elsewhere if you want to avoid cluttering
> the main kvm.h.  The arch header would be OK, since the non-arch
> header includes the arch header, and thus it wouldn't be visible to
> userspace where it is -- if there later is a need for MPIC (or
> whatever other device follows MPIC's example) on another
> architecture, it could be moved without breaking anything.  Or, we
> could just have a header for each device type.
>
If device will be used by more then one arch it will move into  
virt/kvm

and will have its own header, like ioapic.


virt/kvm/ioapic.h is not uapi.  The ioapic uapi component (e.g. struct  
kvm_ioapic_state) is duplicated between x86 and ia64, which is the sort  
of thing I'd like to avoid.  I'm OK with putting it in the PPC header  
if, upon a later need for multi-architecture support, it could move  
into either the main uapi header or a separate uapi header that the  
main uapi header includes (i.e. no userspace-visible change in which  
header needs to be included).


-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 1/6] kvm: add device control API

2013-04-04 Thread Scott Wood

On 04/04/2013 05:41:35 AM, Gleb Natapov wrote:

On Tue, Apr 02, 2013 at 08:57:48PM -0500, Scott Wood wrote:
> +struct kvm_device_attr {
> +  __u32   flags;  /* no flags currently defined */
> +  __u32   group;  /* device-defined */
> +  __u64   attr;   /* group-defined */
> +  __u64   addr;   /* userspace address of attr data */
> +};
> +
Since now each device has its own fd is it an advantage to enforce
common interface between different devices?


I think so, even if only to avoid repeating the various pains  
surrounding adding ioctls.  Not necessarily "enforce", just enable.  If  
a device has some sort of command that does not fit neatly into the  
"set or get" model, it could still add a new ioctl.


If we do so though why not handle file creation, ioctl and file  
descriptor lifetime in the

common code. Common code will have "struct kvm_device" with "struct
kvm_device_arch" and "struct kvm_device_ops" members. Instead of
kvm_mpic_ioctl there will be kvm_device_ioctl which will despatch  
ioctls

to a device using kvm_device->ops->(set|get|has)_attr pointers.


So make it more like the pre-fd version, except for the actual fd  
usage?  It would make destruction a bit simpler (assuming there's no  
need for vcpu destruction code to access a device).  Hopefully nobody  
asks me to change it back again, though. :-)


-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 1/6] kvm: add device control API

2013-04-04 Thread Paul Mackerras
On Thu, Apr 04, 2013 at 01:41:35PM +0300, Gleb Natapov wrote:

> Since now each device has its own fd is it an advantage to enforce
> common interface between different devices? If we do so though why
> not handle file creation, ioctl and file descriptor lifetime in the
> common code. Common code will have "struct kvm_device" with "struct
> kvm_device_arch" and "struct kvm_device_ops" members. Instead of
> kvm_mpic_ioctl there will be kvm_device_ioctl which will despatch ioctls
> to a device using kvm_device->ops->(set|get|has)_attr pointers.

I thought about making the same request, but when I looked at it, the
amount of code that could be made common in this way is pretty tiny,
and doing that involves a bit of extra complexity, so I thought that
on the whole it wouldn't be worthwhile.

Paul.
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3 6/6] kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC

2013-04-04 Thread Alexander Graf


Am 05.04.2013 um 00:35 schrieb Scott Wood :

> On 04/04/2013 05:30:05 PM, Alexander Graf wrote:
>> Am 04.04.2013 um 20:41 schrieb Scott Wood :
>> > On 04/04/2013 07:54:20 AM, Alexander Graf wrote:
>> >> On 03.04.2013, at 03:57, Scott Wood wrote:
>> >> > +if (opp->mpic_mode_mask == GCR_MODE_PROXY)
>> >> Shouldn't this be an &?
>> >
>> > The way the mode field was originally documented was a two-bit field, 
>> > where 0b11 was external proxy, and 0b10 was reserved.  If we use & it 
>> > would have to be:
>> >
>> >if ((opp->mpic_mode_mask & GCR_MODE_PROXY) == GCR_MODE_PROXY)
>> >...
>> >
>> > Simply testing "opp->mpic_mode_mask & GCR_MODE_PROXY" would return true in 
>> > the case of GCR_MODE_MIXED.
>> >
>> > In MPIC 4.3 external proxy is defined as a separate bit (GCR[CI]) that is 
>> > ignored if the mixed-mode bit (GCR[M]) is not set, which makes it a bit 
>> > more legitimate to view it as a bitmap.  Still, I doubt we'll see new mode 
>> > bits.
>> Ok, please add a comment about this here then :).
> 
> What sort of comment would you like?  Or do you want me to use the "(x & y) 
> == y" version?

/* This might need to be changed if GCR gets extended */

> 
>> >> > @@ -460,6 +464,13 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
>> >> >tasklet_kill(&vcpu->arch.tasklet);
>> >> >
>> >> >kvmppc_remove_vcpu_debugfs(vcpu);
>> >> > +
>> >> > +switch (vcpu->arch.irq_type) {
>> >> > +case KVMPPC_IRQ_MPIC:
>> >> > +kvmppc_mpic_put(vcpu->arch.mpic);
>> >> This doesn't tell the MPIC that this exact CPU is getting killed. What if 
>> >> we hotplug remove just a single CPU? Don't we have to deregister the CPU 
>> >> with the MPIC?
>> >
>> > If we ever support hot vcpu removal, yes.  We'd probably need some MPIC 
>> > code changes to accommodate that, and we wouldn't currently have a way to 
>> > test it, so I'd rather make it obviously not supported for now.
>> Is there any way to break heavily if user space attempts this?
> 
> Is there any way for userspace to request this currently?  They can close the 
> vcpu fd, but the vcpu won't actually be destroyed until the vm goes down.

Are you sure? X86 does CPU hotplug today, so there has to be something :)

Alex

> 
> -Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html