On 18.04.2013, at 23:39, Scott Wood wrote:

> On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
>> Now that all the irq routing and irqfd pieces are generic, we can expose
>> real irqchip support to all of KVM's internal helpers.
>> This allows us to use irqfd with the in-kernel MPIC.
>> Signed-off-by: Alexander Graf <ag...@suse.de>
>> ---
>> arch/powerpc/include/asm/kvm_host.h |    7 ++
>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
>> arch/powerpc/kvm/Kconfig            |    3 +
>> arch/powerpc/kvm/Makefile           |    1 +
>> arch/powerpc/kvm/irq.h              |   17 ++++++
>> arch/powerpc/kvm/mpic.c             |  106 
>> +++++++++++++++++++++++++++++++++++
>> 6 files changed, 135 insertions(+), 0 deletions(-)
>> create mode 100644 arch/powerpc/kvm/irq.h
>> diff --git a/arch/powerpc/include/asm/kvm_host.h 
>> b/arch/powerpc/include/asm/kvm_host.h
>> index 36368c9..d5fbb4b 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -44,6 +44,10 @@
>> #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>> #endif
>> +/* These values are internal and can be increased later */
>> +#define KVM_NR_IRQCHIPS          1
>> +#define KVM_IRQCHIP_NUM_PINS     256
>> +
>> #if !defined(CONFIG_KVM_440)
>> #include <linux/mmu_notifier.h>
>> @@ -256,6 +260,9 @@ struct kvm_arch {
>> #ifdef CONFIG_PPC_BOOK3S_64
>>      struct list_head spapr_tce_tables;
>> #endif
>> +#ifdef CONFIG_KVM_MPIC
>> +    void *mpic;
>> +#endif
>> };
> 
> This can be "struct openpic *mpic" -- we already do this in the vcpu.

There was a reason I made it void *, but I don't remember what it was. I can 
certainly try to make it struct openpic * again :).

> 
>> diff --git a/arch/powerpc/kvm/irq.h b/arch/powerpc/kvm/irq.h
>> new file mode 100644
>> index 0000000..f1e27fd
>> --- /dev/null
>> +++ b/arch/powerpc/kvm/irq.h
>> @@ -0,0 +1,17 @@
>> +#ifndef __IRQ_H
>> +#define __IRQ_H
>> +
>> +#include <linux/kvm_host.h>
>> +
>> +static inline int irqchip_in_kernel(struct kvm *kvm)
>> +{
>> +    int ret = 0;
>> +
>> +#ifdef CONFIG_KVM_MPIC
>> +    ret = ret || (kvm->arch.mpic != NULL);
>> +#endif
>> +    smp_rmb();
>> +    return ret;
>> +}
> 
> Couldn't we just set a non-irqchip-specific bool?  Though eventually this
> shouldn't be needed at all -- instead the check would be "does the
> requested irqchip fd exist and refer to something that exposes an irqchip
> interface"?

How would we get the irqchip id?

> The rmb needs documentation.  I don't see a corresponding wmb before
> writing to kvm->arch.mpic.

I actually just copied it from the x86 code, wondering what the rmb() is 
supposed to do here. Do we need this at all?

> 
>> diff --git a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c
>> index b1ae02d..c8de2f2 100644
>> --- a/arch/powerpc/kvm/mpic.c
>> +++ b/arch/powerpc/kvm/mpic.c
>> @@ -1087,6 +1087,8 @@ static int openpic_cpu_write_internal(void *opaque, 
>> gpa_t addr,
>>              }
>>              IRQ_resetbit(&dst->servicing, s_IRQ);
>> +            /* Notify listeners that the IRQ is over */
>> +            kvm_notify_acked_irq(opp->kvm, 0, s_IRQ);
> 
> I don't think we want to call that with the lock held -- it looks like it
> can call back into kvm_set_irq.  In our old internal version I waited

Yeah, usually it would call into a non-mpic set handler though IIUC, so we're 
safe. However, if you want to use an MPIC input pin as resample fd, then you'd 
be lost here.

> until the end of the EOI code, and called the notify function with the
> lock dropped temporarily.

Sounds like a good idea.

> 
>>              /* Set up next servicing IRQ */
>>              s_IRQ = IRQ_get_next(opp, &dst->servicing);
>>              /* Check queued interrupts. */
>> @@ -1639,14 +1641,42 @@ static void mpic_destroy(struct kvm_device *dev)
>>              unmap_mmio(opp);
>>      }
>> +    dev->kvm->arch.mpic = NULL;
>>      kfree(opp);
>> }
>> +static int mpic_set_default_irq_routing(struct openpic *opp)
>> +{
>> +    int i;
>> +    struct kvm_irq_routing_entry *routing;
>> +
>> +    /* XXX be more dynamic if we ever want to support multiple MPIC chips */
>> +    routing = kzalloc((sizeof(*routing) * opp->nb_irqs), GFP_KERNEL);
>> +    if (!routing)
>> +            return -ENOMEM;
>> +
>> +    for (i = 0; i < opp->nb_irqs; i++) {
>> +            routing[i].gsi = i;
>> +            routing[i].type = KVM_IRQ_ROUTING_IRQCHIP;
>> +            routing[i].u.irqchip.irqchip = 0;
>> +            routing[i].u.irqchip.pin = i;
>> +    }
>> +
>> +    kvm_set_irq_routing(opp->kvm, routing, opp->nb_irqs, 0);
>> +
>> +    kfree(routing);
>> +    return 0;
>> +}
> 
> Do we really want any default routes?  There's no platform notion of GSI
> here, so how is userspace to know how the kernel set it up (or what GSIs
> are free to be used for new routes) -- other than the "read the code"
> answer I got when I asked about x86?  :-P

The "default routes" really are just "expose all pins 1:1 as GSI". I think it 
makes sense to have a simple default for user space that doesn't want to mess 
with irq routing.

What GSIs are free for new routes doesn't matter. Routes are always completely 
rewritten as a while from user space. So when user space goes in and wants to 
change only a single line it has to lay out the full map itself anyway.

> 
>> +int kvm_set_routing_entry(struct kvm_irq_routing_table *rt,
>> +                      struct kvm_kernel_irq_routing_entry *e,
>> +                      const struct kvm_irq_routing_entry *ue)
>> +{
>> +    int r = -EINVAL;
>> +
>> +    switch (ue->type) {
>> +    case KVM_IRQ_ROUTING_IRQCHIP:
>> +            e->set = mpic_set_irq;
>> +            e->irqchip.irqchip = ue->u.irqchip.irqchip;
>> +            e->irqchip.pin = ue->u.irqchip.pin;
>> +            if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS)
>> +                    goto out;
>> +            rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
>> +            break;
>> +    case KVM_IRQ_ROUTING_MSI:
>> +            e->set = kvm_set_msi;
>> +            e->msi.address_lo = ue->u.msi.address_lo;
>> +            e->msi.address_hi = ue->u.msi.address_hi;
>> +            e->msi.data = ue->u.msi.data;
>> +            break;
>> +    default:
>> +            goto out;
>> +    }
>> +
>> +    r = 0;
>> +out:
>> +    return r;
>> +}
> 
> How would one create a route for IPIs, timers, etc (we have had customers
> wanting to assign those to KVM guests)?  What about error interrupts on
> MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't

"pin" is basically what a "source line" is on the MPIC. That's what the 
equivalent of an IOAPIC interrupt line would be for the MPIC world.

IPIs, timer interrupts and error interrupts are special vectors. We could 
probably model them as different KVM_IRQ_ROUTING_ types if we ever need to 
support mapping those to irqfd. For simple injection we can always do an ioctl 
on the MPIC device.

However, I'd be inclined to say that it's rather unlikely you'd want to have a 
vfio device's interrupt line hooked up to the IPI interrupt of your guest...

> there be an API documentation update for this?

Hrm. I can certainly add one.

> We also need to document whach irqchip means, if we want to reserve the
> ability to use it in the future -- otherwise userspace could fill in any
> old junk there and we'd need to retain compatibility.  It should probably
> be the fd of the MPIC.

It can't, because irqchip is an index into an array. We'd have to add another 
layer of indirection if we want to find the fd to a certain irqchip. That's why 
I simply restricted it to always be "0" now.

> It looks like you only support having one type of irqchip built into
> the kernel at a time?  That may be OK for now given the existing
> restrictions for building KVM, but it should be noted as a
> limitation/TODO.

I don't think it's really hard to add support for another irqchip type. But we 
certainly have harder issues to tackle first before we end up in a situation 
where in-kernel MPIC and in-kernel XICS would make sense in the same kernel.

> BTW, thanks for taking this on -- it would have taken me a while to
> digest the existing code.

It helps to know what the x86 words mean :)


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to