Alex, Michael,

Thank you for the clarification.

On Tue, Mar 15, 2011 at 1:01 AM, Alex Williamson <alex.william...@redhat.com> wrote:

> On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> > On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > > Seeking clarification to the original question I posted:
> > > >>
> > > >>
> > > > This may be a novice question. I would appreciate it if you could
> > > > provide a pointer to documentation or relevant code that explains the
> > > > limitation in supporting level irqs in kvm irqfd.
> > > >
> > > >
> > > >
> > > After browsing the KVM kernel code, it does look like direct assignment
> > > of PCI devices allows level-triggered interrupts to be injected into the
> > > guest from the kernel (as opposed to the vhost irqfd mechanism, which
> > > does not support them).
> > > This occurs when the guest device supports INTx.
> > > Reference: kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
> > > calls kvm_set_irq() with the guest_irq.
> > > This function in turn invokes the assigned set function (either
> > > kvm_set_pic_irq or kvm_set_ioapic_irq), which was set up at kvm_irq_chip
> > > creation time when kvm_setup_default_irq_routing() was called while
> > > handling the KVM_CREATE_IRQCHIP ioctl.
> > >
> > > So, it isn't clear why level-triggered interrupts aren't supported for
> > > the irqfd mechanism.
> > > I would greatly appreciate clarification here.
> > >
> > > Thanks
> > > -Rukhsana
> > >
> >
> > Mostly, no one came up with an implementation so far.
> >
> > If the point is to use irqfd with vhost-net, there's also
> > a question of adding interfaces to
> > 1. pass IO read transactions directly to another kernel module
> > 2. clear the irq level
> >
> > Maybe the right thing is to combine the two somehow:
> > irqfd might get an option to set a bit in memory,
> > ioeventfd might get an option to read and clear from memory
> > and clear irqfd line at the same time.
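
To make sure I follow the combined scheme, here is a minimal sketch of how I
read it, written as a plain C model rather than KVM code. The names
(level_irq, level_irq_signal, level_irq_read_and_clear) are hypothetical and
do not exist in the kernel. The irqfd side sets a bit in a memory word and
raises the line; the ioeventfd side returns the word and clears both the word
and the line in one step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names exist in KVM. */
struct level_irq {
    uint32_t status;  /* memory word the guest reads (ISR-like register) */
    bool line;        /* current level of the interrupt line */
};

/* irqfd side: signalling sets a status bit and asserts the line. */
static void level_irq_signal(struct level_irq *irq, unsigned bit)
{
    assert(bit < 32);
    irq->status |= UINT32_C(1) << bit;
    irq->line = true;
}

/*
 * ioeventfd side: a guest read returns the pending bits, clears them,
 * and de-asserts the line at the same time.
 */
static uint32_t level_irq_read_and_clear(struct level_irq *irq)
{
    uint32_t pending = irq->status;

    irq->status = 0;
    irq->line = false;
    return pending;
}
```

In this model the line stays high across repeated signals until the guest
performs the read, which is the behaviour I would expect from a level
interrupt. Please correct me if I have misread the proposal.
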
>
> I had wanted this for VFIO too and it gets pretty complicated.  The
> first problem with level triggered interrupts is that you need to know
> which GSI your device triggers.  This means translating PCI INTA through
> bridge swizzles and chipset mapping to an IOAPIC.  Current device
> assignment does this through a complete hack in qemu.  Then you can set
> the IRQ, but being level triggered, we need to know when the guest has
> serviced the IRQ so we can de-assert it.  This requires a hook into the
> in-kernel APIC to send the EOI back out to userspace.
>
> I posted RFC patches for doing all this a while back, but they didn't go
> anywhere.  I think the feeling was that it was too intrusive for "slow"
> interrupts.  The current thinking for VFIO based device assignment is to
> use qemu for level interrupts until we find something that actually
> needs low latency in this path.  We generally consider INTx to be like
> supporting i/o port space or non-4k BARs, i.e. necessary for
> compatibility, but not necessarily a performance path.  High performance
> devices should always be using some kind of MSI because it bypasses all
> of the APIC complications and slowness.  Thanks,
>
> Alex
>
>
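
For my own notes, the two problems described above can be sketched as a small
C model. This is only an illustration under my assumptions, not the qemu or
KVM code. pci_swizzle_pin() is the standard PCI bridge swizzle for one hop;
pin_at_root, level_line, device_raise, device_serviced and guest_eoi are
hypothetical names. The chipset-specific mapping from the root-bus pin to an
IOAPIC GSI is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Standard PCI bridge swizzle for one hop; pin is 0..3 for INTA..INTD. */
static unsigned pci_swizzle_pin(unsigned slot, unsigned pin)
{
    assert(pin < 4);
    return (slot + pin) % 4;
}

/*
 * Walk up to the root bus. slots[] holds the slot number of the device,
 * then of each bridge above it that itself sits behind another bridge.
 */
static unsigned pin_at_root(const unsigned *slots, size_t n, unsigned pin)
{
    for (size_t i = 0; i < n; i++)
        pin = pci_swizzle_pin(slots[i], pin);
    return pin;
}

/* Hypothetical model of one level-triggered line. */
struct level_line {
    bool device_pending;  /* device still wants service */
    bool asserted;        /* level currently presented to the guest */
};

static void device_raise(struct level_line *l)
{
    l->device_pending = true;
    l->asserted = true;
}

static void device_serviced(struct level_line *l)
{
    l->device_pending = false;
}

/*
 * EOI hook: de-assert the line, then re-assert it only if the device
 * is still pending. Returns the new level.
 */
static bool guest_eoi(struct level_line *l)
{
    l->asserted = l->device_pending;
    return l->asserted;
}
```

The point of guest_eoi() is that the line cannot simply be pulsed: without the
EOI notification there is no way to know when it is safe to de-assert, which
is the hook into the in-kernel APIC mentioned above.
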


-- 
-Rukhsana
