Gregory Haskins wrote:
> Hi All,
> The following is another patch that I broke out of the in-kernel-apic patch
> that I sent earlier. This focuses purely on the irqdevice abstraction
> without introducing the concept of the in-kernel LAPIC. It also provides an
> implementation of a "userint" model, which should support a system with a
> QEMU based emulation of the (A)PIC (e.g. current behavior). I have cleaned
> up quite a few things based on all of your comments as well, so hopefully its
> a little more polished than last time.
>
> As far as testing, I have confirmed that I can boot a 64-bit linux guest with
> no discernable difference in behavior.
>
> Note that this patch applies after the in-kernel-mmio.patch that I sent
> earlier today which has not yet been accepted/commited. This patch should
> not depend on it. However, if you do have problems applying it let me know
> and I can generate another one that is not after in-kernel-mmio in the series.
>
A patchset is fine.
> --- a/drivers/kvm/Makefile
> +++ b/drivers/kvm/Makefile
> @@ -2,7 +2,7 @@
> # Makefile for Kernel-based Virtual Machine module
> #
>
> -kvm-objs := kvm_main.o mmu.o x86_emulate.o
> +kvm-objs := kvm_main.o mmu.o x86_emulate.o kvm_userint.o
>
kvm_ prefixes are not needed for drivers/kvm/. kvm_main is an
historical exception.
> obj-$(CONFIG_KVM) += kvm.o
> kvm-intel-objs = vmx.o
> obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
> diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
> index c1923df..6caf60b 100644
> --- a/drivers/kvm/kvm.h
> +++ b/drivers/kvm/kvm.h
> @@ -13,6 +13,7 @@
> #include <linux/mm.h>
>
> #include "vmx.h"
> +#include "kvm_irqdevice.h"
>
Ditto.
> +
> +#ifndef __KVM_IRQDEVICE_H
> +#define __KVM_IRQDEVICE_H
> +
> +#define KVM_IRQFLAGS_NMI (1 << 0)
> +#define KVM_IRQFLAGS_PEEK (1 << 1)
>
is PEEK still needed?
> +
> +struct kvm_irqdevice;
> +
> +struct kvm_irqsink {
> + void (*raise_intr)(struct kvm_irqsink *this,
> + struct kvm_irqdevice *dev);
> +
> + void *private;
> +};
> +
> +struct kvm_irqdevice {
> + int (*pending)(struct kvm_irqdevice *this, int flags);
> + int (*read_vector)(struct kvm_irqdevice *this, int flags);
>
I'm a bit confused here. The kvm_irqsink mechanism implies a push
mechanism. ->pending and ->read_vector imply a pull mechanism.
> +
> +/*
> + * kvm_irq_init()
> + *
> + * Description:
> + * Initialized the kvm_irqdevice for use. Should be called before calling
> + * any derived implementation init functions
> + *
> + * Parameters:
> + * struct kvm_irqdevice *dev: This device
> + *
> + * Returns: (void)
> + */
>
This is infinitely better than the usual kvm documentation, but it would
be better still to follow kerneldoc.
> +static inline void kvm_irq_init(struct kvm_irqdevice *dev)
> +{
> + memset(dev, 0, sizeof(*dev));
> +}
> +
> +/*
> + * kvm_irq_pending()
> + *
> + * Description:
> + * Efficiently determines if an interrupt is pending on an irqdevice
> + *
> + * Parameters:
> + * struct kvm_irqdevice *dev: The device
> + * int flags: Modifies the behavior as follows:
> + *
> + * KVM_IRQFLAGS_NMI: Mask everything but NMIs
> + *
> + * Returns: (int)
> + * -1 = failure
> + * 0 = no iterrupts pending (per "flags" criteria)
> + * >0 = one or more interrupts are pending
> + */
> +static inline int kvm_irq_pending(struct kvm_irqdevice *dev, int flags)
> +{
> + return dev->pending(dev, flags);
> +}
>
Usually -errno is returned for failure. Can this reasonably fail,
though? it would simplify upper layers if it couldn't.
> +
> +/*
> + * kvm_irq_set_pin()
> + *
> + * Description:
> + * Allows the caller to assert/deassert an IRQ input pin to the device
> + * according to device policy.
> + *
> + * Parameters:
> + * struct kvm_irqdevice *dev: The device
> + * int pin: The input pin to alter
> + * int level: The value to set (1 = assert, 0 = deassert)
> + * int flags: Reserved, must be 0
>
Just remove it for now. When we find a use for it, we can add it. It's
only userspace interfaces that are brittle.
> + /* walk the interrupt-bitmap and inject an IRQ for each bit found */
> + for (i = 0; i < NR_IRQ_WORDS; ++i) {
> + unsigned long word = sregs->interrupt_bitmap[i];
> + while(word) {
>
spaceafterwhile. Also, can just use a single loop with __test_bit(), as
we're not performance critical here.
> + int bit_index = __ffs(word);
> + int irq = i * BITS_PER_LONG + bit_index;
> +
> + kvm_irq_set_pin(&vcpu->irq_dev, irq, 1, 0);
> +
> + clear_bit(bit_index, &word);
> + }
> + }
>
> set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
> set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
> @@ -2318,6 +2317,33 @@ out1:
>
> diff --git a/drivers/kvm/kvm_userint.c b/drivers/kvm/kvm_userint.c
> new file mode 100644
> index 0000000..ded3098
> --- /dev/null
> +++ b/drivers/kvm/kvm_userint.c
> @@ -0,0 +1,201 @@
> +/*
> + * kvm_userint.c: User Interrupts IRQ device
> + *
> + * This acts as an extention of an interrupt controller that exists
> elsewhere
> + * (typically in userspace/QEMU). Because this PIC is a pseudo device that
> + * is downstream from a real emulated PIC, the "IRQ-to-vector" mapping has
> + * already occured. Therefore, this PIC has the following unusal properties:
> + *
> + * 1) It has 256 "pins" which are literal vectors (e.g.no translation)
> + * 2) It only supports "auto-EOI" behavior since it is expected that the
> + * upstream emulated PIC will handle the real EOIs (if applicable)
> + * 3) It only listens to "asserts" on the pins (deasserts are dropped)
> + * because its an auto-EOI device anyway.
>
Wierd, but well explained.
> +static inline int bitarray_findhighest(struct bitarray *this)
> +{
> + if (!this->summary)
> + return -1;
> + else {
> + int word_index = __ffs(this->summary);
> + int bit_index = __ffs(this->pending[word_index]);
> +
> + return word_index * BITS_PER_LONG + bit_index;
> + }
> +}
>
Um, this is the lowest?
> +
> +/*----------------------------------------------------------------------
> + * userint interface - provides the actual kvm_irqdevice implementation
> + *---------------------------------------------------------------------*/
> +
> +typedef struct {
> + spinlock_t lock;
> + struct bitarray irq_pending;
> + int nmi_pending;
> +}kvm_userint;
>
No typedefs in kernel code for ordinary structs.
> +
> +static int userint_pending(struct kvm_irqdevice *this, int flags)
> +{
> + kvm_userint *s = (kvm_userint*)this->private;
> + int ret;
> +
> + spin_lock_irq(&s->lock);
> +
> + if (flags & KVM_IRQFLAGS_NMI)
> + ret = s->nmi_pending;
> + else
> + ret = bitarray_pending(&s->irq_pending);
>
You probably want:
ret = s->nmi_pending;
if (!(flags & KVM_IRQFLAGS_NMI))
ret |= bitarray_pending(...);
To avoid calling this twice when interrupts are enabled.
> +static int userint_read_vector(struct kvm_irqdevice *this, int flags)
> +{
> + kvm_userint *s = (kvm_userint*)this->private;
> + int irq;
> +
> + spin_lock_irq(&s->lock);
> +
> + /* NMIs take priority, so if there is an NMI pending, or
> + * if we are filtering out NMIs, only consider them
> + */
> + if (s->nmi_pending || (flags & KVM_IRQFLAGS_NMI))
> + irq = s->nmi_pending ? 2 : -1;
> + else
> + irq = bitarray_findhighest(&s->irq_pending);
>
Same here. Actually the comment is correct (even though it does not
start with a /* on a line by itself).
In general, looks very good. The interface is a bit over-rich
(->pending, ->read_vector, and irq_sink) but that's not a showstopper.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel