Re: KVM devices assignment; PCIe AER?

2010-10-28 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

> KVM already has an internal IRQ ACK notifier (which is what current
> device assignment uses to do the same thing), it's just a matter of
> adding a callback that does a kvm_register_irq_ack_notifier that sends
> off the eventfd signal.  I've got this working and will probably send
> out the KVM patch this week.  For now the eventfd goes to userspace, but
> this is where I imagine we could steal some of the irqfd code to make
> VFIO consume the irqfd signal directly.  Thanks,


Thanks for the clarification. I must admit I was somewhat confused about
the irqfd mechanism until I realized that all it does is consume an
eventfd from kernel context (as you pointed out earlier).
So from userspace, I guess this means that the same eventfd is going to be
assigned to both VFIO and KVM, right?


Going back to the original discussion, I think that device assignment
over VFIO is a great way to support PCIe AER for assigned devices. I'm
going to spend some time in that direction for sure. In the meantime I'll
send some patches (shortly) that address the problem without any major
surgery to the current implementation.


thanks,
-Etienne






--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Michael S. Tsirkin
On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
> On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > > irqfds are currently only used by vhost afaik, since it's being
> > > > > interrupted externally, much like pass through devices are.
> > > > 
> > > > Fair enough. Thanks for the clarification.
> > > > 
> > > > > Sort of.  When the VFIO device triggers an interrupt, we get notified
> > > > > via the eventfd we've registered for that interrupt.  We can then call
> > > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > > That much works today.
> > > > 
> > > > Understood but performance wise this is no good for KVM right?
> > > 
> > > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > > to add latency.  On the interrupt path we already have irqfds, which
> > > will avoid the bounce through userspace, we just need to use them.
> > > Doing something similar with EOIs could avoid that path, giving us
> > > something comparable to current device assignment.
> > > 
> > > > > The irqfd mechanism is simply a way for KVM to
> > > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > > vector.  That's yet to be implemented for INTx on VFIO, but should
> > > > > mostly be a matter of connecting existing pieces together.  It's working
> > > > > for MSI-X.
> > > > 
> > > > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > > > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > > > 
> > > > > When VFIO sends an interrupt, it disables the physical device from
> > > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > > compliant devices for the INTx disable bit in the command register).
> > > > > When the guest services the interrupt, we can detect this by catching
> > > > > the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> > > > > the device.  Wash, rinse, repeat.
> > > > >
> > > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > > register for the interrupt they care about.  Since KVM moves the ioapic
> > > > > into the kernel, we need to extend this into KVM and have yet another
> > > > > eventfd mechanism.  It's possible that we could have the VFIO kernel
> > > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > > in much the same way as above.
> > > > 
> > > > In the case of KVM, where are you going to catch the EOI? For some
> > > > reason I'm under the impression that this is part of KVM. If so, then how are
> > > > you going to 'signal' to VFIO? Cannot use eventfd here, right?
> > > 
> > > KVM already has an internal IRQ ACK notifier (which is what current
> > > device assignment uses to do the same thing), it's just a matter of
> > > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > > off the eventfd signal.  I've got this working and will probably send
> > > out the KVM patch this week.  For now the eventfd goes to userspace, but
> > > this is where I imagine we could steal some of the irqfd code to make
> > > VFIO consume the irqfd signal directly.  Thanks,
> > > 
> > > Alex
> > 
> > BTW, how do we handle sharing the interrupt in guest?
> 
> I'm currently using flags to track whether we've asserted the interrupt
> in qemu, and only act on the eoi when the flag is set.  In my current
> setup, the guest puts the pass through device and USB on the same
> interrupt and using this filtering seems to be sufficient.  I think this
> should act just like bare metal, the device will reassert the interrupt
> if it still needs service, but we can avoid obviously gratuitous eois
> being passed down to vfio.
> 
> This will complicate having vfio intercept the eoi eventfd directly
> since it will then need to track the state too.  Another thing I've got
> working is letting vfio support older non-PCI-2.3 compliant devices so
> long as they can claim an exclusive interrupt (just like current code).
> We need to track whether the irq is enabled or disabled for this anyway
> so that we don't get unbalanced enables/disables.
> 
> Alex

Tracking state is also good for saving an extra config read
on each access.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > irqfds are currently only used by vhost afaik, since it's being
> > > > interrupted externally, much like pass through devices are.
> > > 
> > > Fair enough. Thanks for the clarification.
> > > 
> > > > Sort of.  When the VFIO device triggers an interrupt, we get notified
> > > > via the eventfd we've registered for that interrupt.  We can then call
> > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > That much works today.
> > > 
> > > Understood but performance wise this is no good for KVM right?
> > 
> > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > to add latency.  On the interrupt path we already have irqfds, which
> > will avoid the bounce through userspace, we just need to use them.
> > Doing something similar with EOIs could avoid that path, giving us
> > something comparable to current device assignment.
> > 
> > > > The irqfd mechanism is simply a way for KVM to
> > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > vector.  That's yet to be implemented for INTx on VFIO, but should
> > > > mostly be a matter of connecting existing pieces together.  It's working
> > > > for MSI-X.
> > > 
> > > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > > 
> > > > When VFIO sends an interrupt, it disables the physical device from
> > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > compliant devices for the INTx disable bit in the command register).
> > > > When the guest services the interrupt, we can detect this by catching
> > > > the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> > > > the device.  Wash, rinse, repeat.
> > > >
> > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > register for the interrupt they care about.  Since KVM moves the ioapic
> > > > into the kernel, we need to extend this into KVM and have yet another
> > > > eventfd mechanism.  It's possible that we could have the VFIO kernel
> > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > in much the same way as above.
> > > 
> > > In the case of KVM, where are you going to catch the EOI? For some
> > > reason I'm under the impression that this is part of KVM. If so, then how are
> > > you going to 'signal' to VFIO? Cannot use eventfd here, right?
> > 
> > KVM already has an internal IRQ ACK notifier (which is what current
> > device assignment uses to do the same thing), it's just a matter of
> > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > off the eventfd signal.  I've got this working and will probably send
> > out the KVM patch this week.  For now the eventfd goes to userspace, but
> > this is where I imagine we could steal some of the irqfd code to make
> > VFIO consume the irqfd signal directly.  Thanks,
> > 
> > Alex
> 
> BTW, how do we handle sharing the interrupt in guest?

I'm currently using flags to track whether we've asserted the interrupt
in qemu, and only act on the eoi when the flag is set.  In my current
setup, the guest puts the pass through device and USB on the same
interrupt and using this filtering seems to be sufficient.  I think this
should act just like bare metal, the device will reassert the interrupt
if it still needs service, but we can avoid obviously gratuitous eois
being passed down to vfio.

This will complicate having vfio intercept the eoi eventfd directly
since it will then need to track the state too.  Another thing I've got
working is letting vfio support older non-PCI-2.3 compliant devices so
long as they can claim an exclusive interrupt (just like current code).
We need to track whether the irq is enabled or disabled for this anyway
so that we don't get unbalanced enables/disables.
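
The filtering described above can be sketched as a small userspace model in Python. This is only an illustration, not the qemu or vfio code: the class `SharedLineFilter` and all of its fields are hypothetical names, and the model assumes one "asserted" flag per assigned device plus a tracked enable state so that enables and disables stay balanced.

```python
class SharedLineFilter:
    """Toy model: act on an EOI only if we asserted the interrupt ourselves."""

    def __init__(self):
        self.asserted = False     # did we raise the guest interrupt?
        self.irq_enabled = True   # tracked host-side enable state
        self.enables = 0          # how many times we re-enabled the irq
        self.disables = 0         # how many times we disabled the irq
        self.ignored_eois = 0     # EOIs that belonged to another device

    def device_interrupt(self):
        # The assigned device fired: disable it once and remember that
        # we asserted the (possibly shared) guest interrupt.
        if self.irq_enabled:
            self.irq_enabled = False
            self.disables += 1
        self.asserted = True

    def eoi(self):
        # An EOI arrived for the guest interrupt line.
        if not self.asserted:
            # Gratuitous EOI caused by another device sharing the line.
            self.ignored_eois += 1
            return False
        self.asserted = False
        if not self.irq_enabled:
            self.irq_enabled = True
            self.enables += 1
        return True
```

In this model an EOI that arrives while the flag is clear is dropped, so the enable and disable counters can never drift apart; a device that still needs service would simply fire again after the re-enable.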

Alex



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Michael S. Tsirkin
On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > irqfds are currently only used by vhost afaik, since it's being
> > > interrupted externally, much like pass through devices are.
> > 
> > Fair enough. Thanks for the clarification.
> > 
> > > Sort of.  When the VFIO device triggers an interrupt, we get notified
> > > via the eventfd we've registered for that interrupt.  We can then call
> > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > That much works today.
> > 
> > Understood but performance wise this is no good for KVM right?
> 
> Right, bouncing interrupts and EOIs through qemu via eventfds is going
> to add latency.  On the interrupt path we already have irqfds, which
> will avoid the bounce through userspace, we just need to use them.
> Doing something similar with EOIs could avoid that path, giving us
> something comparable to current device assignment.
> 
> > > The irqfd mechanism is simply a way for KVM to
> > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > vector.  That's yet to be implemented for INTx on VFIO, but should
> > > mostly be a matter of connecting existing pieces together.  It's working
> > > for MSI-X.
> > 
> > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > 
> > > When VFIO sends an interrupt, it disables the physical device from
> > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > compliant devices for the INTx disable bit in the command register).
> > > When the guest services the interrupt, we can detect this by catching
> > > the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> > > the device.  Wash, rinse, repeat.
> > >
> > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > register for the interrupt they care about.  Since KVM moves the ioapic
> > > into the kernel, we need to extend this into KVM and have yet another
> > > eventfd mechanism.  It's possible that we could have the VFIO kernel
> > > module also receive this eventfd, re-enabling interrupts on the device,
> > > in much the same way as above.
> > 
> > In the case of KVM, where are you going to catch the EOI? For some
> > reason I'm under the impression that this is part of KVM. If so, then how are
> > you going to 'signal' to VFIO? Cannot use eventfd here, right?
> 
> KVM already has an internal IRQ ACK notifier (which is what current
> device assignment uses to do the same thing), it's just a matter of
> adding a callback that does a kvm_register_irq_ack_notifier that sends
> off the eventfd signal.  I've got this working and will probably send
> out the KVM patch this week.  For now the eventfd goes to userspace, but
> this is where I imagine we could steal some of the irqfd code to make
> VFIO consume the irqfd signal directly.  Thanks,
> 
> Alex

BTW, how do we handle sharing the interrupt in guest?

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> On Wed, 27 Oct 2010, Alex Williamson wrote:
> > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > irqfds are currently only used by vhost afaik, since it's being
> > interrupted externally, much like pass through devices are.
> 
> Fair enough. Thanks for the clarification.
> 
> > Sort of.  When the VFIO device triggers an interrupt, we get notified
> > via the eventfd we've registered for that interrupt.  We can then call
> > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > That much works today.
> 
> Understood but performance wise this is no good for KVM right?

Right, bouncing interrupts and EOIs through qemu via eventfds is going
to add latency.  On the interrupt path we already have irqfds, which
will avoid the bounce through userspace, we just need to use them.
Doing something similar with EOIs could avoid that path, giving us
something comparable to current device assignment.

> > The irqfd mechanism is simply a way for KVM to
> > directly consume the eventfd and raise an interrupt via a pre-setup
> > vector.  That's yet to be implemented for INTx on VFIO, but should
> > mostly be a matter of connecting existing pieces together.  It's working
> > for MSI-X.
> 
> OK, I was under the impression you already had irqfd 'connected' to KVM from
> VFIO... This is why I was asking about the nature of the changes in VFIO.
> 
> > When VFIO sends an interrupt, it disables the physical device from
> > generating more interrupts (this is where VFIO requires PCI 2.3
> > compliant devices for the INTx disable bit in the command register).
> > When the guest services the interrupt, we can detect this by catching
> > the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> > the device.  Wash, rinse, repeat.
> >
> > To do this in qemu, I created a callback on the ioapic where drivers can
> > register for the interrupt they care about.  Since KVM moves the ioapic
> > into the kernel, we need to extend this into KVM and have yet another
> > eventfd mechanism.  It's possible that we could have the VFIO kernel
> > module also receive this eventfd, re-enabling interrupts on the device,
> > in much the same way as above.
> 
> In the case of KVM, where are you going to catch the EOI? For some
> reason I'm under the impression that this is part of KVM. If so, then how are
> you going to 'signal' to VFIO? Cannot use eventfd here, right?

KVM already has an internal IRQ ACK notifier (which is what current
device assignment uses to do the same thing), it's just a matter of
adding a callback that does a kvm_register_irq_ack_notifier that sends
off the eventfd signal.  I've got this working and will probably send
out the KVM patch this week.  For now the eventfd goes to userspace, but
this is where I imagine we could steal some of the irqfd code to make
VFIO consume the irqfd signal directly.  Thanks,

Alex



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

> No, emulated devices trigger interrupts directly with qemu_set_irq.
> irqfds are currently only used by vhost afaik, since it's being
> interrupted externally, much like pass through devices are.

Fair enough. Thanks for the clarification.

> Sort of.  When the VFIO device triggers an interrupt, we get notified
> via the eventfd we've registered for that interrupt.  We can then call
> qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> That much works today.

Understood but performance wise this is no good for KVM right?

> The irqfd mechanism is simply a way for KVM to
> directly consume the eventfd and raise an interrupt via a pre-setup
> vector.  That's yet to be implemented for INTx on VFIO, but should
> mostly be a matter of connecting existing pieces together.  It's working
> for MSI-X.

OK, I was under the impression you already had irqfd 'connected' to KVM from
VFIO... This is why I was asking about the nature of the changes in VFIO.

> When VFIO sends an interrupt, it disables the physical device from
> generating more interrupts (this is where VFIO requires PCI 2.3
> compliant devices for the INTx disable bit in the command register).
> When the guest services the interrupt, we can detect this by catching
> the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
> the device.  Wash, rinse, repeat.
>
> To do this in qemu, I created a callback on the ioapic where drivers can
> register for the interrupt they care about.  Since KVM moves the ioapic
> into the kernel, we need to extend this into KVM and have yet another
> eventfd mechanism.  It's possible that we could have the VFIO kernel
> module also receive this eventfd, re-enabling interrupts on the device,
> in much the same way as above.

In the case of KVM, where are you going to catch the EOI? For some
reason I'm under the impression that this is part of KVM. If so, then how are
you going to 'signal' to VFIO? Cannot use eventfd here, right?

> Yes, none of this requires KVM specific modifications to VFIO.  VFIO is
> still just triggering eventfds, and hopefully receiving one via an
> irqfd-like mechanism for EOI.


Thanks for your reply.
-Etienne



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Wed, 2010-10-27 at 11:23 -0700, Etienne Martineau wrote:
> On Wed, 27 Oct 2010, Alex Williamson wrote:
> > I've actually been working on porting the qemu vfio driver over to
> > qemu-kvm recently, and nearly have it working.  For MSI-X interrupts
> > I've ported to the common msix.c code, which abstracts how the interrupt
> > is actually sent to the guest.  I've also added irqfd support in the
> > msix mask notifier so MSI-X interrupts avoid bouncing through qemu.  MSI
> > support should work the same once Michael's msi.c is upstream.
> >
> > For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
> > all of the emulated drivers would break).  The problem is getting an EOI
> > back from the KVM kernel apic.  I'm currently working on code that adds
> > a new KVM ioctl to register an eventfd for the EOI, which then triggers
> > qemu-kvm to re-enable the interrupt.  My hope is that we can add irqfd
> > support to both of these paths, so INTx is injected directly from VFIO
> > into KVM, and VFIO can directly consume the KVM EOI.
> 
> OK, let me try to understand what you've done (please correct me if I'm
> wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.

No, emulated devices trigger interrupts directly with qemu_set_irq.
irqfds are currently only used by vhost afaik, since it's being
interrupted externally, much like pass through devices are.

> Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever the
> assigned device receives an IRQ it passes it directly to KVM without
> bouncing to userspace?

Sort of.  When the VFIO device triggers an interrupt, we get notified
via the eventfd we've registered for that interrupt.  We can then call
qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
That much works today.  The irqfd mechanism is simply a way for KVM to
directly consume the eventfd and raise an interrupt via a pre-setup
vector.  That's yet to be implemented for INTx on VFIO, but should
mostly be a matter of connecting existing pieces together.  It's working
for MSI-X.
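
To make the difference between the two delivery paths concrete, here is a rough userspace model in Python. It is not QEMU, KVM, or VFIO code: `EventFd`, `Guest`, `deliver_via_qemu` and `attach_irqfd` are hypothetical names made up for this sketch, and the only thing modelled is who consumes the eventfd and how many userspace wakeups that costs.

```python
class EventFd:
    """Toy stand-in for a Linux eventfd: a counter that read() drains."""

    def __init__(self):
        self.count = 0
        self.kernel_consumer = None  # set once an in-"kernel" consumer attaches

    def signal(self):
        # Producer side (VFIO in the thread): bump the counter.
        self.count += 1
        if self.kernel_consumer is not None:
            # irqfd-style: consumed in kernel context, no userspace wakeup.
            self.count = 0
            self.kernel_consumer()

    def read(self):
        # Userspace consumer: return the counter and reset it to zero.
        value, self.count = self.count, 0
        return value


class Guest:
    def __init__(self):
        self.interrupts = 0          # interrupts raised in the guest
        self.userspace_wakeups = 0   # times userspace had to run

    def raise_irq(self):
        self.interrupts += 1


def deliver_via_qemu(efd, guest):
    """Bounce path: userspace reads the eventfd, then raises the IRQ."""
    if efd.read():
        guest.userspace_wakeups += 1
        guest.raise_irq()  # stands in for the qemu_set_irq call


def attach_irqfd(efd, guest):
    """Direct path: the model 'kernel' consumes the eventfd itself."""
    efd.kernel_consumer = guest.raise_irq
```

With the bounce path every interrupt costs one userspace wakeup; once `attach_irqfd` has been called on the same `EventFd`, further signals reach the guest without touching the userspace counter.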

> I'm not sure I understand the part where VFIO signals the EOI back to KVM?

When VFIO sends an interrupt, it disables the physical device from
generating more interrupts (this is where VFIO requires PCI 2.3
compliant devices for the INTx disable bit in the command register).
When the guest services the interrupt, we can detect this by catching
the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
the device.  Wash, rinse, repeat.

To do this in qemu, I created a callback on the ioapic where drivers can
register for the interrupt they care about.  Since KVM moves the ioapic
into the kernel, we need to extend this into KVM and have yet another
eventfd mechanism.  It's possible that we could have the VFIO kernel
module also receive this eventfd, re-enabling interrupts on the device,
in much the same way as above.  I haven't tried this yet, but it should
just be a matter of creating another VFIO ioctl and stealing code from
the KVM irqfd setup.
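
The disable-on-interrupt, re-enable-on-EOI cycle described above can be sketched as a toy state machine in Python. This is an illustrative model under simplifying assumptions (a single level-triggered device, no sharing), not VFIO code; `IntxDevice` and its method names are hypothetical.

```python
class IntxDevice:
    """Toy model of a level-triggered INTx device behind VFIO."""

    def __init__(self):
        self.intx_disabled = False  # models the PCI 2.3 INTx disable bit
        self.needs_service = False  # device still has work pending
        self.delivered = 0          # interrupts forwarded to the guest

    def device_asserts(self):
        # The hardware wants attention.
        self.needs_service = True
        self._maybe_deliver()

    def _maybe_deliver(self):
        # Forward one interrupt, then mask the device until the guest EOIs.
        if self.needs_service and not self.intx_disabled:
            self.intx_disabled = True
            self.delivered += 1

    def guest_services(self):
        # The guest driver clears the interrupt condition in the device.
        self.needs_service = False

    def guest_eoi(self):
        # EOI seen at the IOAPIC: unmask; a still-pending device fires again.
        self.intx_disabled = False
        self._maybe_deliver()
```

Because the device is masked between delivery and EOI, repeated assertions collapse into one forwarded interrupt, and an EOI without servicing simply causes the interrupt to be delivered again.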

> Also, with your change do you think that VFIO can be kept generic?
> The reason I'm asking is that we are planning to use VFIO for some userspace
> drivers...

Yes, none of this requires KVM specific modifications to VFIO.  VFIO is
still just triggering eventfds, and hopefully receiving one via an
irqfd-like mechanism for EOI.

> >> Because qemu device assignment is working on VFIO I'm making the
> >> assumption that kvm iommu code can be entirely deprecated. Maybe I'm
> >> totally wrong here?
> >
> > Yes, VFIO makes no use of it.
> 
> Yes I'm wrong?

VFIO does not make any use of the KVM iommu code, or any of the KVM
device assignment ioctls for that matter.  Thanks,

Alex





Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

> I've actually been working on porting the qemu vfio driver over to
> qemu-kvm recently, and nearly have it working.  For MSI-X interrupts
> I've ported to the common msix.c code, which abstracts how the interrupt
> is actually sent to the guest.  I've also added irqfd support in the
> msix mask notifier so MSI-X interrupts avoid bouncing through qemu.  MSI
> support should work the same once Michael's msi.c is upstream.
>
> For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
> all of the emulated drivers would break).  The problem is getting an EOI
> back from the KVM kernel apic.  I'm currently working on code that adds
> a new KVM ioctl to register an eventfd for the EOI, which then triggers
> qemu-kvm to re-enable the interrupt.  My hope is that we can add irqfd
> support to both of these paths, so INTx is injected directly from VFIO
> into KVM, and VFIO can directly consume the KVM EOI.

OK, let me try to understand what you've done (please correct me if I'm
wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.
Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever the
assigned device receives an IRQ it passes it directly to KVM without
bouncing to userspace?

I'm not sure I understand the part where VFIO signals the EOI back to KVM?

Also, with your change do you think that VFIO can be kept generic?
The reason I'm asking is that we are planning to use VFIO for some userspace
drivers...

>> Because qemu device assignment is working on VFIO I'm making the
>> assumption that kvm iommu code can be entirely deprecated. Maybe I'm
>> totally wrong here?
>
> Yes, VFIO makes no use of it.

Yes I'm wrong?

Thanks,
-Etienne


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau



On Tue, 26 Oct 2010, Chris Wright wrote:


>> I would totally agree with you if the alternative implementation to
>> this legacy mode would be available in a relatively short time
>> frame. I'm not sure about that?
>
> Depends on how quickly you can help whip it into shape ;)

Hmm, this is not an easy question.
IMHO the biggest problem with going to VFIO for device assignment is
interrupt handling. VFIO knows how to signal an eventfd but doesn't
know about 'kvm_set_irq', for example. I think that VFIO will require
some adaptation for that matter.

Because qemu device assignment is working on VFIO I'm making the
assumption that kvm iommu code can be entirely deprecated. Maybe I'm
totally wrong here?

> That's why I asked about how you were implementing, for example, the AER
> extended capability exposure.  Capabilities are a problem for the current
> code (forget about extended capabilities, just regular capabilities).

We depend on the Q35 PCIe chipset for that matter. Does this answer your
question?


thanks,
Etienne



Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
> On Tue, 26 Oct 2010, Chris Wright wrote:
> >Right, and adding more to the existing KVM code which we are hoping to
> >push to legacy support mode doesn't sound like a great idea.
> 
> I would totally agree with you if the alternative implementation to
> this legacy mode would be available in a relatively short time
> frame. I'm not sure about that?

Depends on how quickly you can help whip it into shape ;)

That's why I asked about how you were implementing, for example, the AER
extended capability exposure.  Capabilities are a problem for the current
code (forget about extended capabilities, just regular capabilities).

> >>In that context, do you think it's acceptable for KVM to be the
> >>driver of the assigned devices? OR should we simply add the AER
> >>logic into existing pci-stub and relay the information to user-space
> >>through eventfd...
> >
> >I'm reluctant to add logic to pci-stub, but VFIO should be able to
> >handle this directly.
>
> I agree that VFIO should be able to do the job.

It would be great to see some effort on this.

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau


On Tue, 26 Oct 2010, Chris Wright wrote:


> Right, and adding more to the existing KVM code which we are hoping to
> push to legacy support mode doesn't sound like a great idea.

I would totally agree with you if the alternative implementation to this
legacy mode would be available in a relatively short time frame. I'm not
sure about that?

>> In that context, do you think it's acceptable for KVM to be the
>> driver of the assigned devices? OR should we simply add the AER
>> logic into existing pci-stub and relay the information to user-space
>> through eventfd...
>
> I'm reluctant to add logic to pci-stub, but VFIO should be able to
> handle this directly.


I agree that VFIO should be able to do the job.

thanks,
-Etienne




Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 03:15:58PM -0700, Chris Wright wrote:
> > I think we need to register with PCI and provide a
> > 'pci_error_handlers' callback if we want to receive AER
> > notifications.
> 
> Right, and adding more to the existing KVM code which we are hoping to
> push to legacy support mode doesn't sound like a great idea.

If it's a small patch, I think Avi mentioned that he might consider
features too.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
> 
> 
> On Tue, 26 Oct 2010, Chris Wright wrote:
> 
> >>***One of the aspects I'm not clear on is the strategy for
> >>device-assignment under KVM?
> >>A) Move to VFIO; [/dev/iommu, /dev/vfio]
> >
> >Long term, hopefully VFIO
> >
> >>B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
> >
> >Short term (i.e. current qemu-kvm tree...this is what we have now and
> >will continue to until plan A) gets more mature).
> >
> Strictly speaking, I don't really agree with 'B' being the current
> implementation. Correct me if I'm wrong, but for assigned devices,
> kvm does a lookup for the device and eventually obtains a handle to
> it (struct pci_dev*) without doing a proper 'pci_register_driver'.

OK, not a PCI driver per-se (that's pci-stub), but KVM owns the process of
registering interrupt handler, programming iommu, etc.

> I think we need to register with PCI and provide a
> 'pci_error_handlers' callback if we want to receive AER
> notifications.

Right, and adding more to the existing KVM code which we are hoping to
push to legacy support mode doesn't sound like a great idea.

> In that context, do you think it's acceptable for KVM to be the
> driver of the assigned devices? OR should we simply add the AER
> logic into existing pci-stub and relay the information to user-space
> through eventfd...

I'm reluctant to add logic to pci-stub, but VFIO should be able to
handle this directly.

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau



On Tue, 26 Oct 2010, Chris Wright wrote:

>> ***One aspect I'm not clear on is the strategy for
>> device assignment under KVM?
>> A) Move to VFIO; [/dev/iommu, /dev/vfio]
>
> Long term, hopefully VFIO
>
>> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
>
> Short term (i.e. current qemu-kvm tree...this is what we have now and
> will continue to until plan A) gets more mature).

Strictly speaking, I don't really agree with 'B' being the current
implementation. Correct me if I'm wrong, but for assigned devices KVM
looks up the device and eventually obtains a handle to it (struct
pci_dev *) without doing a proper 'pci_register_driver'.


I think we need to register with PCI and provide a 'pci_error_handlers'
callback if we want to receive AER notifications.


In that context, do you think it's acceptable for KVM to be the driver of
the assigned devices? Or should we simply add the AER logic to the
existing pci-stub and relay the information to user space through an
eventfd?


Thanks,
Etienne




Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 01:24:00PM -0700, Etienne Martineau wrote:
> ***One aspect I'm not clear on is the strategy for
> device assignment under KVM?
> A) Move to VFIO; [/dev/iommu, /dev/vfio]
> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
> C) No plan for short term
> 
> Alex, Chris, Michael thanks for your reply.
> -Etienne

We will have to keep supporting B for a long while.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
> ***Background context:
> We are using KVM on an x86-based platform. For performance reasons, we
> use device assignment. The devices we have to deal with are mostly
> custom ASICs, but we also have standard parts (e.g. 82599EB
> 10-Gigabit SR-IOV).
> 
> AER handling in the guest VM is very important to us. Most of our
> drivers assume timely notification of PCIe AER events. Failure to
> provide such support will result in very undesirable behavior at
> the other end :)
> 
> In our case, AER is not strictly a function of the hardware 'quality'
> but is also tied to the way the user can interact with the system (by
> pulling a card on the fly, for example).
> 
> ***Q35, VFIO:
> I'm aware of the PCIe Q35 chipset work that Isaku is doing. I
> have a colleague who is actively working on merging those changes into
> our qemu-kvm branch.
> 
> I'm also familiar with VFIO. I've seen the work from Alex on the
> qemu VFIO front. I'm not too sure how this is going to work on
> qemu-kvm. Or maybe I'm just confused about the long-term plan for
> the user-space [qemu vs qemu-kvm] branches?

The Plan (TM) is for qemu-kvm to be just a staging branch for qemu.
There are a few bits left in qemu-kvm (I'm sure you're aware since
you're working with them ;) that aren't yet upstream.  The device
assignment code is one of those bits.

Ideally, we could get KVM out of the business of being a device driver
for assigned host devices and move all of that functionality to VFIO.
And when looking at adding significant new functionality like AER,
scoping VFIO would be very useful.

> ***One aspect I'm not clear on is the strategy for
> device assignment under KVM?
> A) Move to VFIO; [/dev/iommu, /dev/vfio]

Long term, hopefully VFIO

> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]

Short term (i.e. current qemu-kvm tree...this is what we have now and
will continue to until plan A) gets more mature).

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau

***Background context:
We are using KVM on an x86-based platform. For performance reasons, we use
device assignment. The devices we have to deal with are mostly custom
ASICs, but we also have standard parts (e.g. 82599EB 10-Gigabit SR-IOV).


AER handling in the guest VM is very important to us. Most of our drivers
assume timely notification of PCIe AER events. Failure to provide such
support will result in very undesirable behavior at the other end :)


In our case, AER is not strictly a function of the hardware 'quality' but
is also tied to the way the user can interact with the system (by pulling a
card on the fly, for example).


***Q35, VFIO:
I'm aware of the PCIe Q35 chipset work that Isaku is doing. I have a
colleague who is actively working on merging those changes into our
qemu-kvm branch.


I'm also familiar with VFIO. I've seen the work from Alex on the qemu VFIO
front. I'm not too sure how this is going to work on qemu-kvm. Or maybe
I'm just confused about the long-term plan for the user-space [qemu vs
qemu-kvm] branches?


***One aspect I'm not clear on is the strategy for device assignment
under KVM?

A) Move to VFIO; [/dev/iommu, /dev/vfio]
B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
C) No plan for short term

Alex, Chris, Michael thanks for your reply.
-Etienne

On Tue, 26 Oct 2010, Michael S. Tsirkin wrote:

> On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
>> Chris, Michael et al.
>>
>> As part of the project I'm working on, we are looking at extending the
>> device assignment capabilities to provide support for PCIe AER.
>> Ideally, the host would register for AER (for every assigned
>> device) and pass the errors up to Qemu.
>>
>> As of now, one of the problems is that KVM is not a driver for the
>> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
>> haven't seen any patches or discussion on that topic...
>>
>> On another front, I've seen the work from Michael around 'uio_pci_generic'
>> and some of his comments:
>>   " It's expected that more features of interest to virtualization will be
>>   added to this driver in the future. Possibilities are: mmap for device
>>   resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
>>
>> I think that extending 'uio_pci_generic' to support AER is
>> relatively straightforward (assuming eventfd support from UIO).
>>
>> Any thoughts?
>>
>> thanks,
>> -Etienne
>
> On the qemu side, patches for aer support are being worked on,
> pls take a look, likely there's some synergy there.
>
> On the kernel side, AER for uio sounds fine, but the features listed
> above are not yet supported, so you'd have to code them up too.
> There's a uiommu driver from Tom Lyon that does the iommu bit.
>
> Alex and Tom have been working on a VFIO driver which is not based on
> the uio framework. You can try joining forces.



Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
> Chris, Michael et al.
> 
> As part of the project I'm working on, we are looking at extending the
> device assignment capabilities to provide support for PCIe AER.

BTW, what's the motivation to support this feature?

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
> Chris, Michael et al.
> 
> As part of the project I'm working on, we are looking at extending the
> device assignment capabilities to provide support for PCIe AER.
> Ideally, the host would register for AER (for every assigned
> device) and pass the errors up to Qemu.
> 
> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
> 
> On another front, I've seen the work from Michael around 'uio_pci_generic'
> and some of his comments:
>   " It's expected that more features of interest to virtualization will be
>   added to this driver in the future. Possibilities are: mmap for device
>   resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
> 
> I think that extending 'uio_pci_generic' to support AER is
> relatively straightforward (assuming eventfd support from UIO).
> 
> Any thoughts?
> 
> thanks,
> -Etienne

On the qemu side, patches for aer support are being worked on,
pls take a look, likely there's some synergy there.

On the kernel side, AER for uio sounds fine, but the features listed
above are not yet supported, so you'd have to code them up too.
There's a uiommu driver from Tom Lyon that does the iommu bit.

Alex and Tom have been working on a VFIO driver which is not based on
the uio framework. You can try joining forces.



Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* etmartin101 (etmartin...@gmail.com) wrote:
> As part of the project I'm working on, we are looking at extending the
> device assignment capabilities to provide support for PCIe AER.

How did you plan to do this?  Right now we only provide PCI capabilities
(not Extended Capabilities).

> Ideally, the host would register for AER (for every assigned
> device) and pass the errors up to Qemu.

This should be so for devices iff they support AER to begin with.

Having proper chipset support in QEMU (PCIe + AER) is also a step.

> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
> 
> On another front, I've seen the work from Michael around 'uio_pci_generic'
> and some of his comments:
>   " It's expected that more features of interest to virtualization will be
>   added to this driver in the future. Possibilities are: mmap for device
>   resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
> 
> I think that extending 'uio_pci_generic' to support AER is
> relatively straightforward (assuming eventfd support from UIO).

Did you look at VFIO?

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Alex Williamson
On Tue, Oct 26, 2010 at 10:41 AM, etmartin101 wrote:
> Chris, Michael et al.
>
> As part of the project I'm working on, we are looking at extending the
> device assignment capabilities to provide support for PCIe AER.
> Ideally, the host would register for AER (for every assigned device) and
> pass the errors up to Qemu.
>
> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
>
> On another front, I've seen the work from Michael around 'uio_pci_generic'
> and some of his comments:
>  " It's expected that more features of interest to virtualization will be
>  added to this driver in the future. Possibilities are: mmap for device
>  resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
>
> I think that extending 'uio_pci_generic' to support AER is relatively
> straightforward (assuming eventfd support from UIO).

A lot of work has been going into a new "VFIO" driver that has taken
over in the virtualization direction that uio_pci_generic was starting
to go.  Search the mailing list for it.  That's probably where we'd
want to focus AER efforts.  I posted a qemu VFIO driver RFC a few
months ago and I'm still actively working on it.  We currently pass
the device capabilities to the guest via an emulation layer in the
VFIO kernel driver.  This layer might be going away as we try to
simplify the kernel side, which would mean more capability work in
qemu.  We also need the update to the Q35 chipset that Isaku is
working on to be able to have an express root port to signal AER
events.

Alex


KVM devices assignment; PCIe AER?

2010-10-26 Thread etmartin101

Chris, Michael et al.

As part of the project I'm working on, we are looking at extending the
device assignment capabilities to provide support for PCIe AER.
Ideally, the host would register for AER (for every assigned device) and
pass the errors up to Qemu.


As of now, one of the problems is that KVM is not a driver for the
assigned devices. I've seen Chris's slides from KVM conf 2010 but I
haven't seen any patches or discussion on that topic...

On another front, I've seen the work from Michael around 'uio_pci_generic'
and some of his comments:
  " It's expected that more features of interest to virtualization will be
  added to this driver in the future. Possibilities are: mmap for device
  resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."

I think that extending 'uio_pci_generic' to support AER is relatively
straightforward (assuming eventfd support from UIO).


Any thoughts?

thanks,
-Etienne
