Re: [kvm-devel] [PATCH] Add irqdevice interface + userint implementation

Gregory Haskins Mon, 09 Apr 2007 11:37:35 -0700

>>> On Sun, Apr 8, 2007 at  2:47 AM, in message <[EMAIL PROTECTED]>,
Avi Kivity <[EMAIL PROTECTED]> wrote: 
> Gregory Haskins wrote:
>> Hi All,
>>   The following is another patch that I broke out of the in- kernel- apic 
>> patch 
> that I sent earlier.  This focuses purely on the irqdevice abstraction 
> without introducing the concept of the in- kernel LAPIC.  It also provides an 
> implementation of a "userint" model, which should support a system with a 
> QEMU based emulation of the (A)PIC (e.g. current behavior).  I have cleaned 
> up quite a few things based on all of your comments as well, so hopefully its 
> a little more polished than last time.
>>
>> As far as testing, I have confirmed that I can boot a 64- bit linux guest 
>> with 
> no discernable difference in behavior.
>>
>> Note that this patch applies after the in- kernel- mmio.patch that I sent 
> earlier today which has not yet been accepted/commited.  This patch should 
> not depend on it.  However, if you do have problems applying it let me know 
> and I can generate another one that is not after in- kernel- mmio in the 
> series.
>>   
> 
> A patchset is fine.
> 
>> ---  a/drivers/kvm/Makefile
>> +++ b/drivers/kvm/Makefile
>> @@ - 2,7 +2,7 @@
>>  # Makefile for Kernel- based Virtual Machine module
>>  #
>>  
>> - kvm- objs := kvm_main.o mmu.o x86_emulate.o
>> +kvm- objs := kvm_main.o mmu.o x86_emulate.o kvm_userint.o
>>   
> 
> kvm_ prefixes are not needed for drivers/kvm/.  kvm_main is an 
> historical exception.


Ack


> 
>>  obj- $(CONFIG_KVM) += kvm.o
>>  kvm- intel- objs = vmx.o
>>  obj- $(CONFIG_KVM_INTEL) += kvm- intel.o
>> diff -- git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
>> index c1923df..6caf60b 100644
>> ---  a/drivers/kvm/kvm.h
>> +++ b/drivers/kvm/kvm.h
>> @@ - 13,6 +13,7 @@
>>  #include <linux/mm.h>
>>  
>>  #include "vmx.h"
>> +#include "kvm_irqdevice.h"
>>   
> 
> Ditto.
> 

Ack

>> +
>> +#ifndef __KVM_IRQDEVICE_H
>> +#define __KVM_IRQDEVICE_H
>> +
>> +#define KVM_IRQFLAGS_NMI  (1 << 0)
>> +#define KVM_IRQFLAGS_PEEK (1 << 1)
>>   
> 
> is PEEK still needed?

I believe we still need pending + read...I am not sure about PEEK anymore.  I 
will review this and get back to you.

> 
>> +
>> +struct kvm_irqdevice;
>> +
>> +struct kvm_irqsink {
>> +    void (*raise_intr)(struct kvm_irqsink *this, 
>> +                       struct kvm_irqdevice *dev);
>> +
>> +    void *private;
>> +};
>> +
>> +struct kvm_irqdevice {
>> +    int  (*pending)(struct kvm_irqdevice *this, int flags);
>> +    int  (*read_vector)(struct kvm_irqdevice *this, int flags); 
>>   
> 
> I'm a bit confused here.  The kvm_irqsink mechanism implies a push 
> mechanism.  - >pending and - >read_vector imply a pull mechanism.

What I am modeling is the concept of an PIC INTR + INTA.  INTR serves to just 
cause the processor to realize it needs to service an interrupt.  The INTA 
cycle is what then gets invoked after INTR to figure out which vector is 
appropriate.  Finally, the vector is called out of the IDT.  

However, there is no actual INTA cycle on a virtualized guest.  We emulate an 
INTA cycle by calling our pending/read routines and injecting the vector into 
the VMCS.  Whats missing from the current HEAD is that there is no concept of 
INTR.  This is whats new, and its why we have both a push and a pull mechanism. 
 The old way only had the pull.  Does this make sense?  

> 
>> +
>> +/*
>> + * kvm_irq_init()
>> + *
>> + * Description:
>> + *    Initialized the kvm_irqdevice for use.  Should be called before 
> calling
>> + *    any derived implementation init functions
>> + * 
>> + * Parameters:
>> + *    struct kvm_irqdevice *dev: This device
>> + *
>> + * Returns: (void)
>> + */
>>   
> 
> This is infinitely better than the usual kvm documentation, but it would 
> be better still to follow kerneldoc.

Ok, thats two votes for kerneldoc style.  I will change this.

> 
>> +static inline void kvm_irq_init(struct kvm_irqdevice *dev)
>> +{
>> +    memset(dev, 0, sizeof(*dev));
>> +}
>> +
>> +/*
>> + * kvm_irq_pending()
>> + *
>> + * Description:
>> + *    Efficiently determines if an interrupt is pending on an irqdevice
>> + *
>> + * Parameters:
>> + *    struct kvm_irqdevice *dev: The device
>> + *    int flags: Modifies the behavior as follows:
>> + * 
>> + *                   KVM_IRQFLAGS_NMI: Mask everything but NMIs
>> + *
>> + * Returns: (int)
>> + *   - 1 = failure
>> + *    0 = no iterrupts pending (per "flags" criteria)
>> + *   >0 = one or more interrupts are pending
>> + */
>> +static inline int kvm_irq_pending(struct kvm_irqdevice *dev, int flags)
>> +{
>> +    return dev- >pending(dev, flags);
>> +}
>>   
> 
> Usually - errno is returned for failure.  Can this reasonably fail, 
> though?  it would simplify upper layers if it couldn't.

I think we can just get rid of the error since we cant reasonably fail.

> 
>> +
>> +/*
>> + * kvm_irq_set_pin()
>> + *
>> + * Description:
>> + *    Allows the caller to assert/deassert an IRQ input pin to the device 
>> + *    according to device policy.
>> + *
>> + * Parameters:
>> + *    struct kvm_irqdevice *dev: The device
>> + *    int pin: The input pin to alter
>> + *    int level: The value to set (1 = assert, 0 = deassert)
>> + *    int flags: Reserved, must be 0
>>   
> 
> Just remove it for now.  When we find a use for it, we can add it.  It's 
> only userspace interfaces that are brittle.

Ack

> 
> 
>> +    /* walk the interrupt- bitmap and inject an IRQ for each bit found */
>> +    for (i = 0; i < NR_IRQ_WORDS; ++i) {
>> +            unsigned long word = sregs- >interrupt_bitmap[i];
>> +            while(word) {
>>   
> 
> spaceafterwhile.  Also, can just use a single loop with __test_bit(), as 
> we're not performance critical here.

Ack

> 
>> +                    int bit_index = __ffs(word);
>> +                    int irq = i * BITS_PER_LONG + bit_index;
>> +                    
>> +                    kvm_irq_set_pin(&vcpu- >irq_dev, irq, 1, 0);
>> +
>> +                    clear_bit(bit_index, &word);
>> +            }
>> +    }
>>  
>>      set_segment(vcpu, &sregs- >cs, VCPU_SREG_CS);
>>      set_segment(vcpu, &sregs- >ds, VCPU_SREG_DS);
>> @@ - 2318,6 +2317,33 @@ out1:
>>   
> 
>> diff -- git a/drivers/kvm/kvm_userint.c b/drivers/kvm/kvm_userint.c
>> new file mode 100644
>> index 0000000..ded3098
>> ---  /dev/null
>> +++ b/drivers/kvm/kvm_userint.c
>> @@ - 0,0 +1,201 @@
>> +/*
>> + * kvm_userint.c: User Interrupts IRQ device 
>> + *
>> + * This acts as an extention of an interrupt controller that exists 
> elsewhere 
>> + * (typically in userspace/QEMU).  Because this PIC is a pseudo device that
>> + * is downstream from a real emulated PIC, the "IRQ- to- vector" mapping 
>> has 
>> + * already occured.  Therefore, this PIC has the following unusal 
> properties:
>> + *
>> + * 1) It has 256 "pins" which are literal vectors (e.g.no translation)
>> + * 2) It only supports "auto- EOI" behavior since it is expected that the
>> + *    upstream emulated PIC will handle the real EOIs (if applicable)
>> + * 3) It only listens to "asserts" on the pins (deasserts are dropped) 
>> + *    because its an auto- EOI device anyway.
>>   
> 
> Wierd, but well explained.
> 
>> +static inline int bitarray_findhighest(struct bitarray *this)
>> +{
>> +    if (!this- >summary)
>> +            return - 1;
>> +    else {
>> +        int word_index = __ffs(this- >summary);
>> +        int bit_index  = __ffs(this- >pending[word_index]);
>> +
>> +        return word_index * BITS_PER_LONG + bit_index;      
>> +    }
>> +}
>>   
> 
> Um, this is the lowest?

Opps...carry over from the original code.  I forgot to reverse the polarity.

> 
>> +
>> +/*----------------------------------------------------------------------
>> + * userint interface -  provides the actual kvm_irqdevice implementation
>> + *--------------------------------------------------------------------- */
>> +
>> +typedef struct {
>> +    spinlock_t      lock;
>> +    struct bitarray irq_pending;
>> +    int             nmi_pending;
>> +}kvm_userint;
>>   
> 
> No typedefs in kernel code for ordinary structs.

Ack

> 
>> +
>> +static int userint_pending(struct kvm_irqdevice *this, int flags)
>> +{
>> +    kvm_userint *s = (kvm_userint*)this- >private;
>> +    int ret;
>> +
>> +    spin_lock_irq(&s- >lock);
>> +
>> +    if (flags & KVM_IRQFLAGS_NMI)
>> +            ret = s- >nmi_pending;
>> +    else
>> +            ret = bitarray_pending(&s- >irq_pending);
>>   
> 
> You probably want:
> 
>     ret = s- >nmi_pending;
>     if (!(flags & KVM_IRQFLAGS_NMI))
>           ret |= bitarray_pending(...);
> 
> To avoid calling this twice when interrupts are enabled.

I think there is a misunderstanding.  The return value from this function 
should just be 0 for false, and non-zero for true.  The bitarray still holds a 
bit for IRQ2 so you technically wouldn't need to call this twice when 
interrupts are enabled.  Simply calling this without the flag set will give you 
the proper behavior, so I think it is fine as written.  It also doesn't hurt to 
do it the way you wrote it either.  Let me know if I misunderstood you.


> 
>> +static int userint_read_vector(struct kvm_irqdevice *this, int flags)
>> +{
>> +    kvm_userint *s = (kvm_userint*)this- >private;
>> +    int          irq;
>> +
>> +    spin_lock_irq(&s- >lock);
>> +
>> +    /* NMIs take priority, so if there is an NMI pending, or
>> +     * if we are filtering out NMIs, only consider them 
>> +     */
>> +    if (s- >nmi_pending || (flags & KVM_IRQFLAGS_NMI))
>> +            irq = s- >nmi_pending ? 2 : - 1;
>> +    else
>> +            irq = bitarray_findhighest(&s- >irq_pending);
>>   
> 
> Same here.  Actually the comment is correct (even though it does not 
> start with a /* on a line by itself).

Ack

> 
> 
> In general, looks very good.  The interface is a bit over- rich 
> (- >pending, - >read_vector, and irq_sink) but that's not a showstopper.

Regards,
-Greg




-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] [PATCH] Add irqdevice interface + userint implementation

Reply via email to