Re: KVM devices assignment; PCIe AER?

2010-10-28 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

KVM already has an internal IRQ ACK notifier (which is what current
device assignment uses to do the same thing), it's just a matter of
adding a callback that does a kvm_register_irq_ack_notifier that sends
off the eventfd signal.  I've got this working and will probably send
out the KVM patch this week.  For now the eventfd goes to userspace, but
this is where I imagine we could steal some of the irqfd code to make
VFIO consume the irqfd signal directly.  Thanks,


Thanks for the clarification. I must admit I was somewhat confused about
the irqfd mechanism until I realized that all it does is consume an
eventfd from kernel context (like you pointed out earlier...).
So from userspace, I guess it means that the same eventfd is going to be
assigned to both VFIO and KVM, right?


Going back to the original discussion, I think that device assignment
over VFIO is a great way to support PCIe AER for the assigned devices. I'm
going to spend some time in that direction for sure. In the meantime I'll
send some patches (shortly) that address the problem without any major
surgery to the current implementation.


thanks,
-Etienne






--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

I've actually been working on porting the qemu vfio driver over to
qemu-kvm recently, and nearly have it working.  For MSI-X interrupts
I've ported to the common msix.c code, which abstracts how the interrupt
is actually sent to the guest.  I've also added irqfd support in the
msix mask notifier so MSI-X interrupts avoid bouncing through qemu.  MSI
support should work the same once Michael's msi.c is upstream.

For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
all of the emulated drivers would break).  The problem is getting an EOI
back from the KVM kernel apic.  I'm currently working on code that adds
a new KVM ioctl to register an eventfd for the EOI, which then triggers
qemu-kvm to re-enable the interrupt.  My hope is that we can add irqfd
support to both of these paths, so INTx is injected directly from VFIO
into KVM, and VFIO can directly consume the KVM EOI.


OK let me try to understand what you've done (please correct me if I'm
wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.
Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever the
assigned device receives an IRQ it passes it directly to KVM without
bouncing to userspace?


I'm not sure I understand the part where VFIO signals the EOI back to KVM?

Also, with your change do you think that VFIO can be kept generic?
The reason I'm asking is that we are planning to use VFIO for some userspace
drivers...



Because qemu device assignment is working on VFIO I'm making the
assumption that kvm iommu code can be entirely deprecated. Maybe I'm
totally wrong here?


Yes, VFIO makes no use of it.


Yes I'm wrong?

Thanks,
-Etienne


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Wed, 2010-10-27 at 11:23 -0700, Etienne Martineau wrote:
 On Wed, 27 Oct 2010, Alex Williamson wrote:
  I've actually been working on porting the qemu vfio driver over to
  qemu-kvm recently, and nearly have it working.  For MSI-X interrupts
  I've ported to the common msix.c code, which abstracts how the interrupt
  is actually sent to the guest.  I've also added irqfd support in the
  msix mask notifier so MSI-X interrupts avoid bouncing through qemu.  MSI
  support should work the same once Michael's msi.c is upstream.
 
  For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
  all of the emulated drivers would break).  The problem is getting an EOI
  back from the KVM kernel apic.  I'm currently working on code that adds
  a new KVM ioctl to register an eventfd for the EOI, which then triggers
  qemu-kvm to re-enable the interrupt.  My hope is that we can add irqfd
  support to both of these paths, so INTx is injected directly from VFIO
  into KVM, and VFIO can directly consume the KVM EOI.
 
 OK let me try to understand what you've done (please correct me if I'm
 wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.

No, emulated devices trigger interrupts directly with qemu_set_irq.
irqfds are currently only used by vhost afaik, since it's being
interrupted externally, much like pass through devices are.

 Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever the
 assigned device receives an IRQ it passes it directly to KVM without
 bouncing to userspace?

Sort of.  When the VFIO device triggers an interrupt, we get notified
via the eventfd we've registered for that interrupt.  We can then call
qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
That much works today.  The irqfd mechanism is simply a way for KVM to
directly consume the eventfd and raise an interrupt via a pre-setup
vector.  That's yet to be implemented for INTx on VFIO, but should
mostly be a matter of connecting existing pieces together.  It's working
for MSI-X.
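
To make the eventfd hand-off described above concrete, here is a minimal
userspace sketch in C. It only demonstrates the eventfd(2) counter
semantics on Linux. The names signal_irq, consume_irq and count_raise are
hypothetical and chosen for illustration; count_raise stands in for the
point where qemu would call qemu_set_irq. None of this is taken from the
VFIO or qemu sources.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Producer side: add 1 to the eventfd counter, as an interrupt source would. */
static int signal_irq(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical stand-in for raising the guest interrupt (qemu_set_irq in the thread). */
static void count_raise(void *opaque)
{
    ++*(int *)opaque;
}

/*
 * Consumer side: drain the counter and raise the interrupt once.
 * Returns the number of signals folded into this read, or 0 if nothing
 * was pending (the descriptor is expected to be non-blocking).
 */
static uint64_t consume_irq(int efd, void (*raise_irq)(void *), void *opaque)
{
    uint64_t pending = 0;

    assert(raise_irq);
    if (read(efd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending))
        return 0;
    raise_irq(opaque);
    return pending;
}
```

A plain (non-semaphore) eventfd read returns the whole counter and resets
it to zero, so several signals that arrive before the consumer runs are
seen as one wakeup. With irqfd the consumer side would live in the kernel
instead of in a userspace loop like this one.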

 I'm not sure I understand the part where VFIO signals the EOI back to KVM?

When VFIO sends an interrupt, it disables the physical device from
generating more interrupts (this is where VFIO requires PCI 2.3
compliant devices for the INTx disable bit in the status register).
When the guest services the interrupt, we can detect this by catching
the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
the device.  Wash, rinse, repeat.
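
The mask-on-interrupt, unmask-on-EOI cycle above can be sketched as a small
state model in C. This is an illustration under stated assumptions, not the
VFIO code: the model_* names and the struct are hypothetical, and config
space is reduced to two 16-bit fields. The bit values follow the PCI 2.3
layout, where the Interrupt Disable bit is bit 10 of the Command register
and the matching Interrupt Status bit is bit 3 of the Status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_COMMAND_INTX_DISABLE 0x0400u /* Command register, bit 10 */
#define MODEL_STATUS_INTERRUPT     0x0008u /* Status register, bit 3 */

/* Hypothetical, heavily reduced view of an assigned device. */
struct model_dev {
    uint16_t command;    /* models the PCI Command register */
    uint16_t status;     /* models the PCI Status register */
    unsigned delivered;  /* interrupts forwarded to the guest */
};

/*
 * Host-side handler: if the device is really asserting INTx and is not
 * already masked, mask it and forward one interrupt to the guest.
 * Returns true when an interrupt was forwarded.
 */
static bool model_irq_handler(struct model_dev *d)
{
    assert(d);
    if (!(d->status & MODEL_STATUS_INTERRUPT))
        return false;                 /* not ours, e.g. a shared line */
    if (d->command & MODEL_COMMAND_INTX_DISABLE)
        return false;                 /* already masked, guest not done yet */
    d->command |= MODEL_COMMAND_INTX_DISABLE;
    d->delivered++;
    return true;
}

/* Guest EOI seen: clear the disable bit so the device can interrupt again. */
static void model_guest_eoi(struct model_dev *d)
{
    assert(d);
    d->command &= (uint16_t)~MODEL_COMMAND_INTX_DISABLE;
}
```

If the device still needs service after the EOI, it simply asserts the
line again and the handler runs once more, which is the "wash, rinse,
repeat" part.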

To do this in qemu, I created a callback on the ioapic where drivers can
register for the interrupt they care about.  Since KVM moves the ioapic
into the kernel, we need to extend this into KVM and have yet another
eventfd mechanism.  It's possible that we could have the VFIO kernel
module also receive this eventfd, re-enabling interrupts on the device,
in much the same way as above.  I haven't tried this yet, but it should
just be a matter of creating another VFIO ioctl and stealing code from
the KVM irqfd setup.

 Also, with your change do you think that VFIO can be kept generic?
 The reason I'm asking is that we are planning to use VFIO for some userspace
 drivers...

Yes, none of this requires KVM specific modifications to VFIO.  VFIO is
still just triggering eventfds, and hopefully receiving one via an
irqfd-like mechanism for EOI.

  Because qemu device assignment is working on VFIO I'm making the
  assumption that kvm iommu code can be entirely deprecated. Maybe I'm
  totally wrong here?
 
  Yes, VFIO makes no use of it.
 
 Yes I'm wrong?

VFIO does not make any use of the KVM iommu code, or any of the KVM
device assignment ioctls for that matter.  Thanks,

Alex





Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Etienne Martineau


On Wed, 27 Oct 2010, Alex Williamson wrote:

No, emulated devices trigger interrupts directly with qemu_set_irq.
irqfds are currently only used by vhost afaik, since it's being
interrupted externally, much like pass through devices are.


Fair enough. Thanks for the clarification.


Sort of.  When the VFIO device triggers an interrupt, we get notified
via the eventfd we've registered for that interrupt.  We can then call
qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
That much works today.


Understood but performance wise this is no good for KVM right?


The irqfd mechanism is simply a way for KVM to
directly consume the eventfd and raise an interrupt via a pre-setup
vector.  That's yet to be implemented for INTx on VFIO, but should
mostly be a matter of connecting existing pieces together.  It's working
for MSI-X.


OK, I was under the impression you already had irqfd 'connected' to KVM from
VFIO... This is why I was asking about the nature of the changes in VFIO.



When VFIO sends an interrupt, it disables the physical device from
generating more interrupts (this is where VFIO requires PCI 2.3
compliant devices for the INTx disable bit in the status register).
When the guest services the interrupt, we can detect this by catching
the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
the device.  Wash, rinse, repeat.

To do this in qemu, I created a callback on the ioapic where drivers can
register for the interrupt they care about.  Since KVM moves the ioapic
into the kernel, we need to extend this into KVM and have yet another
eventfd mechanism.  It's possible that we could have the VFIO kernel
module also receive this eventfd, re-enabling interrupts on the device,
in much the same way as above.


In the case of KVM, where are you going to catch the EOI? For some
reason I'm under the impression that this is part of KVM. If so, then how are
you going to 'signal' to VFIO? Cannot use eventfd here, right?



Yes, none of this requires KVM specific modifications to VFIO.  VFIO is
still just triggering eventfds, and hopefully receiving one via an
irqfd-like mechanism for EOI.


Thanks for your reply.
-Etienne



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
 On Wed, 27 Oct 2010, Alex Williamson wrote:
  No, emulated devices trigger interrupts directly with qemu_set_irq.
  irqfds are currently only used by vhost afaik, since it's being
  interrupted externally, much like pass through devices are.
 
 Fair enough. Thanks for the clarification.
 
  Sort of.  When the VFIO device triggers an interrupt, we get notified
  via the eventfd we've registered for that interrupt.  We can then call
  qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
  That much works today.
 
 Understood but performance wise this is no good for KVM right?

Right, bouncing interrupts and EOIs through qemu via eventfds is going
to add latency.  On the interrupt path we already have irqfds, which
will avoid the bounce through userspace, we just need to use them.
Doing something similar with EOIs could avoid that path, giving us
something comparable to current device assignment.

  The irqfd mechanism is simply a way for KVM to
  directly consume the eventfd and raise an interrupt via a pre-setup
  vector.  That's yet to be implemented for INTx on VFIO, but should
  mostly be a matter of connecting existing pieces together.  It's working
  for MSI-X.
 
 OK, I was under the impression you already had irqfd 'connected' to KVM from
 VFIO... This is why I was asking about the nature of the changes in VFIO.
 
  When VFIO sends an interrupt, it disables the physical device from
  generating more interrupts (this is where VFIO requires PCI 2.3
  compliant devices for the INTx disable bit in the status register).
  When the guest services the interrupt, we can detect this by catching
  the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
  the device.  Wash, rinse, repeat.
 
  To do this in qemu, I created a callback on the ioapic where drivers can
  register for the interrupt they care about.  Since KVM moves the ioapic
  into the kernel, we need to extend this into KVM and have yet another
  eventfd mechanism.  It's possible that we could have the VFIO kernel
  module also receive this eventfd, re-enabling interrupts on the device,
  in much the same way as above.
 
 In the case of KVM, where are you going to catch the EOI? For some
 reason I'm under the impression that this is part of KVM. If so, then how are
 you going to 'signal' to VFIO? Cannot use eventfd here, right?

KVM already has an internal IRQ ACK notifier (which is what current
device assignment uses to do the same thing), it's just a matter of
adding a callback that does a kvm_register_irq_ack_notifier that sends
off the eventfd signal.  I've got this working and will probably send
out the KVM patch this week.  For now the eventfd goes to userspace, but
this is where I imagine we could steal some of the irqfd code to make
VFIO consume the irqfd signal directly.  Thanks,

Alex
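
A rough userspace model of the ack-notifier idea in the message above, in
C: a notifier is registered for one GSI, and its callback writes to an
eventfd when that GSI is acknowledged. The model_* names, the singly
linked list and the layout of the structs are assumptions made for
illustration. The real KVM structures, and the patch mentioned above, are
not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical notifier: called when the guest acknowledges interrupt `gsi`. */
struct model_ack_notifier {
    unsigned gsi;
    void (*irq_acked)(struct model_ack_notifier *n);
    struct model_ack_notifier *next;
};

/* A notifier bundled with the eventfd it should signal on EOI. */
struct model_eoi_eventfd {
    struct model_ack_notifier notifier; /* first member, so the cast below is valid */
    int efd;
};

static struct model_ack_notifier *model_notifiers;

static void model_register_ack_notifier(struct model_ack_notifier *n)
{
    n->next = model_notifiers;
    model_notifiers = n;
}

/* What the interrupt controller model calls on EOI for a given GSI. */
static void model_notify_acked_irq(unsigned gsi)
{
    for (struct model_ack_notifier *n = model_notifiers; n; n = n->next)
        if (n->gsi == gsi)
            n->irq_acked(n);
}

/* Callback: turn the acknowledgement into an eventfd signal. */
static void model_eoi_signal(struct model_ack_notifier *n)
{
    struct model_eoi_eventfd *e = (struct model_eoi_eventfd *)n;
    uint64_t one = 1;
    ssize_t r = write(e->efd, &one, sizeof(one));

    (void)r; /* a failed write is ignored in this sketch */
}

static void model_eoi_eventfd_init(struct model_eoi_eventfd *e,
                                   unsigned gsi, int efd)
{
    assert(e != NULL);
    e->notifier.gsi = gsi;
    e->notifier.irq_acked = model_eoi_signal;
    e->notifier.next = NULL;
    e->efd = efd;
    model_register_ack_notifier(&e->notifier);
}
```

Whoever holds the other end of the eventfd (userspace for now, possibly
the VFIO module later, as suggested above) sees the EOI as a readable
descriptor without knowing anything about the interrupt controller.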



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Michael S. Tsirkin
On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
 On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
  On Wed, 27 Oct 2010, Alex Williamson wrote:
   No, emulated devices trigger interrupts directly with qemu_set_irq.
   irqfds are currently only used by vhost afaik, since it's being
   interrupted externally, much like pass through devices are.
  
  Fair enough. Thanks for the clarification.
  
   Sort of.  When the VFIO device triggers an interrupt, we get notified
   via the eventfd we've registered for that interrupt.  We can then call
   qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
   That much works today.
  
  Understood but performance wise this is no good for KVM right?
 
 Right, bouncing interrupts and EOIs through qemu via eventfds is going
 to add latency.  On the interrupt path we already have irqfds, which
 will avoid the bounce through userspace, we just need to use them.
 Doing something similar with EOIs could avoid that path, giving us
 something comparable to current device assignment.
 
   The irqfd mechanism is simply a way for KVM to
   directly consume the eventfd and raise an interrupt via a pre-setup
   vector.  That's yet to be implemented for INTx on VFIO, but should
   mostly be a matter of connecting existing pieces together.  It's working
   for MSI-X.
  
  OK, I was under the impression you already had irqfd 'connected' to KVM from
  VFIO... This is why I was asking about the nature of the changes in VFIO.
  
   When VFIO sends an interrupt, it disables the physical device from
   generating more interrupts (this is where VFIO requires PCI 2.3
   compliant devices for the INTx disable bit in the status register).
   When the guest services the interrupt, we can detect this by catching
   the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
   the device.  Wash, rinse, repeat.
  
   To do this in qemu, I created a callback on the ioapic where drivers can
   register for the interrupt they care about.  Since KVM moves the ioapic
   into the kernel, we need to extend this into KVM and have yet another
   eventfd mechanism.  It's possible that we could have the VFIO kernel
   module also receive this eventfd, re-enabling interrupts on the device,
   in much the same way as above.
  
  In the case of KVM, where are you going to catch the EOI? For some
  reason I'm under the impression that this is part of KVM. If so, then how are
  you going to 'signal' to VFIO? Cannot use eventfd here, right?
 
 KVM already has an internal IRQ ACK notifier (which is what current
 device assignment uses to do the same thing), it's just a matter of
 adding a callback that does a kvm_register_irq_ack_notifier that sends
 off the eventfd signal.  I've got this working and will probably send
 out the KVM patch this week.  For now the eventfd goes to userspace, but
 this is where I imagine we could steal some of the irqfd code to make
 VFIO consume the irqfd signal directly.  Thanks,
 
 Alex

BTW, how do we handle sharing the interrupt in guest?

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Alex Williamson
On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
 On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
  On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
   On Wed, 27 Oct 2010, Alex Williamson wrote:
No, emulated devices trigger interrupts directly with qemu_set_irq.
irqfds are currently only used by vhost afaik, since it's being
interrupted externally, much like pass through devices are.
   
   Fair enough. Thanks for the clarification.
   
Sort of.  When the VFIO device triggers an interrupt, we get notified
via the eventfd we've registered for that interrupt.  We can then call
qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
That much works today.
   
   Understood but performance wise this is no good for KVM right?
  
  Right, bouncing interrupts and EOIs through qemu via eventfds is going
  to add latency.  On the interrupt path we already have irqfds, which
  will avoid the bounce through userspace, we just need to use them.
  Doing something similar with EOIs could avoid that path, giving us
  something comparable to current device assignment.
  
The irqfd mechanism is simply a way for KVM to
directly consume the eventfd and raise an interrupt via a pre-setup
vector.  That's yet to be implemented for INTx on VFIO, but should
mostly be a matter of connecting existing pieces together.  It's working
for MSI-X.
   
   OK, I was under the impression you already had irqfd 'connected' to KVM from
   VFIO... This is why I was asking about the nature of the changes in VFIO.
   
When VFIO sends an interrupt, it disables the physical device from
generating more interrupts (this is where VFIO requires PCI 2.3
compliant devices for the INTx disable bit in the status register).
When the guest services the interrupt, we can detect this by catching
the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
the device.  Wash, rinse, repeat.
   
To do this in qemu, I created a callback on the ioapic where drivers can
register for the interrupt they care about.  Since KVM moves the ioapic
into the kernel, we need to extend this into KVM and have yet another
eventfd mechanism.  It's possible that we could have the VFIO kernel
module also receive this eventfd, re-enabling interrupts on the device,
in much the same way as above.
   
   In the case of KVM, where are you going to catch the EOI? For some
   reason I'm under the impression that this is part of KVM. If so, then how are
   you going to 'signal' to VFIO? Cannot use eventfd here, right?
  
  KVM already has an internal IRQ ACK notifier (which is what current
  device assignment uses to do the same thing), it's just a matter of
  adding a callback that does a kvm_register_irq_ack_notifier that sends
  off the eventfd signal.  I've got this working and will probably send
  out the KVM patch this week.  For now the eventfd goes to userspace, but
  this is where I imagine we could steal some of the irqfd code to make
  VFIO consume the irqfd signal directly.  Thanks,
  
  Alex
 
 BTW, how do we handle sharing the interrupt in guest?

I'm currently using flags to track whether we've asserted the interrupt
in qemu, and only act on the eoi when the flag is set.  In my current
setup, the guest puts the pass through device and USB on the same
interrupt and using this filtering seems to be sufficient.  I think this
should act just like bare metal, the device will reassert the interrupt
if it still needs service, but we can avoid obviously gratuitous eois
being passed down to vfio.
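
The flag-based filtering described above could look roughly like this in
C. It is a sketch of the idea only: the struct and the model_* functions
are hypothetical names, and the real qemu code is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state for an INTx line shared inside the guest. */
struct model_shared_intx {
    bool asserted;      /* true while we have raised the line for our device */
    unsigned forwarded; /* EOIs actually passed down to the device backend */
};

/* Called when the assigned device's interrupt is injected into the guest. */
static void model_assert_intx(struct model_shared_intx *s)
{
    assert(s != NULL);
    s->asserted = true;
}

/*
 * Called on every EOI for the shared line. Returns true only when the EOI
 * should be passed down, i.e. when our device had the line asserted.
 * EOIs caused by other devices on the same line are dropped here.
 */
static bool model_filter_eoi(struct model_shared_intx *s)
{
    assert(s != NULL);
    if (!s->asserted)
        return false;
    s->asserted = false;
    s->forwarded++;
    return true;
}
```

An EOI that arrives while the flag is clear belongs to another device on
the line (USB in the setup described above) and never reaches vfio.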

This will complicate having vfio intercept the eoi eventfd directly
since it will then need to track the state too.  Another thing I've got
working is letting vfio support older non-PCI-2.3 compliant devices so
long as they can claim an exclusive interrupt (just like current code).
We need to track whether the irq is enabled or disabled for this anyway
so that we don't get unbalanced enables/disables.
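
A small C sketch of why the enabled/disabled state needs tracking for the
exclusive-interrupt case. It assumes the host masks the line with nesting
calls in the style of Linux disable_irq()/enable_irq(), where every disable
must be matched by exactly one enable. The struct, the depth counter and
the model_* names are hypothetical and only model that behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an exclusive interrupt line. */
struct model_irq_line {
    bool enabled; /* our own record of the line state */
    int depth;    /* models the nesting disable count kept by the host */
};

static void model_line_init(struct model_irq_line *l)
{
    assert(l != NULL);
    l->enabled = true;
    l->depth = 0;
}

/* Disable only if we believe the line is enabled, so disables never stack. */
static void model_line_disable(struct model_irq_line *l)
{
    if (!l->enabled)
        return;
    l->enabled = false;
    l->depth++;
}

/* Enable only if we disabled it, so a stray EOI cannot over-enable the line. */
static void model_line_enable(struct model_irq_line *l)
{
    if (l->enabled)
        return;
    l->enabled = true;
    l->depth--;
    assert(l->depth >= 0);
}
```

With the flag in place, a repeated interrupt or a repeated EOI is a no-op
and the depth always returns to zero.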

Alex



Re: KVM devices assignment; PCIe AER?

2010-10-27 Thread Michael S. Tsirkin
On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
 On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
  On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
   On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
On Wed, 27 Oct 2010, Alex Williamson wrote:
 No, emulated devices trigger interrupts directly with qemu_set_irq.
 irqfds are currently only used by vhost afaik, since it's being
 interrupted externally, much like pass through devices are.

Fair enough. Thanks for the clarification.

 Sort of.  When the VFIO device triggers an interrupt, we get notified
 via the eventfd we've registered for that interrupt.  We can then call
 qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
 That much works today.

Understood but performance wise this is no good for KVM right?
   
   Right, bouncing interrupts and EOIs through qemu via eventfds is going
   to add latency.  On the interrupt path we already have irqfds, which
   will avoid the bounce through userspace, we just need to use them.
   Doing something similar with EOIs could avoid that path, giving us
   something comparable to current device assignment.
   
 The irqfd mechanism is simply a way for KVM to
 directly consume the eventfd and raise an interrupt via a pre-setup
 vector.  That's yet to be implemented for INTx on VFIO, but should
 mostly be a matter of connecting existing pieces together.  It's working
 for MSI-X.

OK, I was under the impression you already had irqfd 'connected' to KVM from
VFIO... This is why I was asking about the nature of the changes in VFIO.

 When VFIO sends an interrupt, it disables the physical device from
 generating more interrupts (this is where VFIO requires PCI 2.3
 compliant devices for the INTx disable bit in the status register).
 When the guest services the interrupt, we can detect this by catching
 the EOI of the IOAPIC.  At that point, we can re-enable interrupts on
 the device.  Wash, rinse, repeat.

 To do this in qemu, I created a callback on the ioapic where drivers can
 register for the interrupt they care about.  Since KVM moves the ioapic
 into the kernel, we need to extend this into KVM and have yet another
 eventfd mechanism.  It's possible that we could have the VFIO kernel
 module also receive this eventfd, re-enabling interrupts on the device,
 in much the same way as above.

In the case of KVM, where are you going to catch the EOI? For some
reason I'm under the impression that this is part of KVM. If so, then how are
you going to 'signal' to VFIO? Cannot use eventfd here, right?
   
   KVM already has an internal IRQ ACK notifier (which is what current
   device assignment uses to do the same thing), it's just a matter of
   adding a callback that does a kvm_register_irq_ack_notifier that sends
   off the eventfd signal.  I've got this working and will probably send
   out the KVM patch this week.  For now the eventfd goes to userspace, but
   this is where I imagine we could steal some of the irqfd code to make
   VFIO consume the irqfd signal directly.  Thanks,
   
   Alex
  
  BTW, how do we handle sharing the interrupt in guest?
 
 I'm currently using flags to track whether we've asserted the interrupt
 in qemu, and only act on the eoi when the flag is set.  In my current
 setup, the guest puts the pass through device and USB on the same
 interrupt and using this filtering seems to be sufficient.  I think this
 should act just like bare metal, the device will reassert the interrupt
 if it still needs service, but we can avoid obviously gratuitous eois
 being passed down to vfio.
 
 This will complicate having vfio intercept the eoi eventfd directly
 since it will then need to track the state too.  Another thing I've got
 working is letting vfio support older non-PCI-2.3 compliant devices so
 long as they can claim an exclusive interrupt (just like current code).
 We need to track whether the irq is enabled or disabled for this anyway
 so that we don't get unbalanced enables/disables.
 
 Alex

Tracking state is also good for saving an extra config read
on each access.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Alex Williamson
On Tue, Oct 26, 2010 at 10:41 AM, etmartin101 etmartin...@gmail.com wrote:
 Chris, Michael et al.

 Part of the project I'm working on, we are looking at extending the
 device assignment capabilities to provide support for PCIe AER.
 Ideally, the host would register for AER (for every assigned device) and
 pass the events up to QEMU.

 As of now, one of the problems is that KVM is not a driver for the
 assigned devices. I've seen Chris's slides from KVM conf 2010 but I
 haven't seen any patches or discussion on that topic...

 On another front, I've seen the work from Michael around 'uio_pci_generic'
 and some of his comments:
   It's expected that more features of interest to virtualization will be
  added to this driver in the future. Possibilities are: mmap for device
  resources, MSI/MSI-X, eventfd (to interface with kvm), iommu.

 I think that extending 'uio_pci_generic' to support AER is relatively
 straightforward (assuming eventfd support from UIO).

A lot of work has been going into a new VFIO driver that has taken
over in the virtualization direction that uio_pci_generic was starting
to go.  Search the mailing list for it.  That's probably where we'd
want to focus AER efforts.  I posted a qemu VFIO driver RFC a few
months ago and I'm still actively working on it.  We currently pass
the device capabilities to the guest via an emulation layer in the
VFIO kernel driver.  This layer might be going away as we try to
simplify the kernel side, which would mean more capability work in
qemu.  We also need the update to the Q35 chipset that Isaku is
working on to be able to have an express root port to signal AER
events.

Alex


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* etmartin101 (etmartin...@gmail.com) wrote:
 Part of the project I'm working on, we are looking at extending the
 device assignment capabilities to provide support for PCIe AER.

How did you plan to do this?  Right now we only provide PCI capabilities
(not Extended Capabilities).

 Ideally, the host would register for AER (for every assigned
 device) and pass the events up to QEMU.

This should be so for devices iff they support AER to begin with.

Having proper chipset support in QEMU (PCIe + AER) is also a step.

 As of now, one of the problems is that KVM is not a driver for the
 assigned devices. I've seen Chris's slides from KVM conf 2010 but I
 haven't seen any patches or discussion on that topic...
 
 On another front, I've seen the work from Michael around 'uio_pci_generic'
 and some of his comments:
It's expected that more features of interest to virtualization will be
   added to this driver in the future. Possibilities are: mmap for device
   resources, MSI/MSI-X, eventfd (to interface with kvm), iommu.
 
 I think that extending 'uio_pci_generic' to support AER is
 relatively straightforward (assuming eventfd support from UIO).

Did you look at VFIO?

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
 Chris, Michael et al.
 
 Part of the project I'm working on, we are looking at extending the
 device assignment capabilities to provide support for PCIe AER.
 Ideally, the host would register for AER (for every assigned
 device) and pass the events up to QEMU.
 
 As of now, one of the problems is that KVM is not a driver for the
 assigned devices. I've seen Chris's slides from KVM conf 2010 but I
 haven't seen any patches or discussion on that topic...
 
 On another front, I've seen the work from Michael around 'uio_pci_generic'
 and some of his comments:
It's expected that more features of interest to virtualization will be
   added to this driver in the future. Possibilities are: mmap for device
   resources, MSI/MSI-X, eventfd (to interface with kvm), iommu.
 
 I think that extending 'uio_pci_generic' to support AER is
 relatively straightforward (assuming eventfd support from UIO).
 
 Any thoughts?
 
 thanks,
 -Etienne

On the qemu side, patches for aer support are being worked on,
pls take a look, likely there's some synergy there.

On the kernel side, AER for uio sounds fine, but the features listed
above are not yet supported, so you'd have to code them up too.
There's a uiommu driver from Tom Lyon that does the iommu bit.

Alex and Tom have been working on a VFIO driver which is not based on
the uio framework. You can try joining forces.



Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
 Chris, Michael et al.
 
 Part of the project I'm working on, we are looking at extending the
 device assignment capabilities to provide support for PCIe AER.

BTW, what's the motivation to support this feature?

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau

***Background context:
We are using KVM on an x86-based platform. For performance reasons, we use
device assignment. The devices we have to deal with are mostly
custom ASICs, but we also have standard parts (e.g. 82599EB 10-Gigabit SR-IOV).


AER handling in the guest VM is very important to us. Most of our drivers
assume timely notification in case of a PCIe AER event. Failure to provide such
support will result in very undesirable behavior at the other end :)


In our case, AER is not strictly a function of the hardware 'quality' but
is also tied to the way the user can interact with the system (by pulling a
card on the fly, for example).


***Q35, VFIO:
I'm aware of the PCIe Q35 chipset work that Isaku is working on. I have a
colleague who is actively working on merging the changes into our qemu-kvm branch.


I'm also familiar with VFIO. I've seen the work from Alex on the qemu VFIO
front. I'm not too sure how this is going to work on qemu-kvm. Or maybe it's
because I'm confused about the long-term plan for the userspace [qemu vs
qemu-kvm] branches?


***One aspect I'm not clear on is the strategy for device assignment
under KVM?

A) Move to VFIO; [/dev/iommu, /dev/vfio]
B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
C) No plan for short term

Alex, Chris, Michael, thanks for your replies.
-Etienne

On Tue, 26 Oct 2010, Michael S. Tsirkin wrote:


On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:

Chris, Michael et al.

As part of the project I'm working on, we are looking at extending the
device assignment capabilities to provide support for PCIe AER.
Ideally, the host would register for AER (for every assigned
device) and pass the errors up to Qemu.

As of now, one of the problems is that KVM is not a driver for the
assigned devices. I've seen Chris's slides from KVM conf 2010 but I
haven't seen any patches or discussion on that topic...

On another front, I've seen the work from Michael around 'uio_pci_generic'
and some of his comments:
   It's expected that more features of interest to virtualization will be
  added to this driver in the future. Possibilities are: mmap for device
  resources, MSI/MSI-X, eventfd (to interface with kvm), iommu.

I think that extending 'uio_pci_generic' to support AER is
relatively straightforward (assuming eventfd support from UIO).

Any thought?

thanks,
-Etienne


On the qemu side, patches for aer support are being worked on,
pls take a look, likely there's some synergy there.

On the kernel side, AER for uio sounds fine, but the features listed
above are not yet supported, so you'd have to code them up too.
There's a uiommu driver from Tom Lyon that does the iommu bit.

Alex and Tom have been working on a VFIO driver which is not based on
the uio framework. You can try joining forces.







Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
 ***Background context:
 We are using KVM on an x86-based platform. For performance reasons, we
 use device assignment. The devices we have to deal with are mostly
 custom ASICs, but we also have standard parts (e.g. 82599EB
 10-Gigabit SR-IOV).
 
 AER handling in the guest VM is very important to us. Most of our
 drivers assume timely notification of PCIe AER events. Failure to
 provide such support will result in very undesirable behavior at
 the other end :)
 
 In our case, AER is not strictly a function of hardware 'quality';
 it is also tied to the way the user can interact with the system (by
 pulling a card on the fly, for example).
 
 ***Q35, VFIO:
 I'm aware of the PCIe Q35 chipset work that Isaku is doing. I
 have a colleague who is actively working on merging the change into
 our qemu-kvm branch.
 
 I'm also familiar with VFIO. I've seen the work from Alex on the
 qemu VFIO front, but I'm not sure how this is going to work on
 qemu-kvm. Or maybe I'm just confused about the long-term plan for
 the user-space [qemu vs qemu-kvm] branches?

The Plan (TM) is for qemu-kvm to be just a staging branch for qemu.
There are a few bits left in qemu-kvm (I'm sure you're aware since
you're working with them ;) that aren't yet upstream.  The device
assignment code is one of those bits.

Ideally, we could get KVM out of the business of being a device driver
for assigned host devices and move all of that functionality to VFIO.
And when looking at adding significant new functionality like AER,
scoping VFIO would be very useful.

 ***One aspect I'm not clear on is the strategy for
 device assignment under KVM?
 A) Move to VFIO; [/dev/iommu, /dev/vfio]

Long term, hopefully VFIO

 B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]

Short term (i.e. current qemu-kvm tree...this is what we have now and
will continue to until plan A) gets more mature).

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 01:24:00PM -0700, Etienne Martineau wrote:
 ***One aspect I'm not clear on is the strategy for
 device assignment under KVM?
 A) Move to VFIO; [/dev/iommu, /dev/vfio]
 B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
 C) No plan for short term
 
 Alex, Chris, Michael thanks for your reply.
 -Etienne

We will have to keep supporting B for a long while.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau



On Tue, 26 Oct 2010, Chris Wright wrote:


***One aspect I'm not clear on is the strategy for
device assignment under KVM?
A) Move to VFIO; [/dev/iommu, /dev/vfio]


Long term, hopefully VFIO


B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]


Short term (i.e. current qemu-kvm tree...this is what we have now and
will continue to until plan A) gets more mature).

Strictly speaking, I don't really agree with 'B' being the current 
implementation. Correct me if I'm wrong, but for assigned devices kvm 
does a lookup for the device and eventually obtains a handle to it (struct 
pci_dev *) without doing a proper 'pci_register_driver'.


I think we need to register with PCI and provide 'pci_error_handlers' 
callbacks if we want to receive AER notifications.


In that context, do you think it's acceptable for KVM to be the driver of 
the assigned devices? Or should we simply add the AER logic to the 
existing pci-stub and relay the information to user space through an 
eventfd...
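
For illustration only, a minimal sketch of what the user-space end of such
an eventfd relay could look like. It assumes the kernel side (pci-stub, VFIO
or anything else) signals an eventfd from its error handler; no such
interface is defined in this thread. The helper names wait_for_error_event()
and demo_error_relay() are hypothetical, and the kernel side is simulated
here by plain write() calls on the same descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for an error notification.
 * Returns the number of notifications accumulated since the last read,
 * 0 on timeout, or -1 on failure.
 */
static int64_t wait_for_error_event(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count = 0;
    int ready;

    assert(efd >= 0);

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;

    /* Reading a (non-semaphore) eventfd returns its counter and resets it. */
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}

/*
 * Demo: two writes stand in for two error notifications from the kernel
 * side (assumption: the real producer would signal from an error callback).
 */
static int64_t demo_error_relay(void)
{
    const uint64_t one = 1;
    int64_t pending;
    int efd = eventfd(0, EFD_CLOEXEC);

    if (efd < 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(efd);
            return -1;
        }
    }

    pending = wait_for_error_event(efd, 1000);
    close(efd);
    return pending;
}
```

Calling demo_error_relay() from a main() returns 2: both notifications are
accumulated in the eventfd counter and picked up by a single read. An eventfd
only carries a count, so the error details (severity, affected device) would
have to reach user space through some other channel.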


Thanks,
Etienne




Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
 
 
 On Tue, 26 Oct 2010, Chris Wright wrote:
 
 ***One aspect I'm not clear on is the strategy for
 device assignment under KVM?
 A) Move to VFIO; [/dev/iommu, /dev/vfio]
 
 Long term, hopefully VFIO
 
 B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
 
 Short term (i.e. current qemu-kvm tree...this is what we have now and
 will continue to until plan A) gets more mature).
 
 Strictly speaking, I don't really agree with 'B' being the current
 implementation. Correct me if I'm wrong, but for assigned devices
 kvm does a lookup for the device and eventually obtains a handle to
 it (struct pci_dev *) without doing a proper 'pci_register_driver'.

OK, not a PCI driver per se (that's pci-stub), but KVM owns the process of
registering the interrupt handler, programming the iommu, etc.

 I think we need to register with PCI and provide
 'pci_error_handlers' callbacks if we want to receive AER
 notifications.

Right, and adding more to the existing KVM code which we are hoping to
push to legacy support mode doesn't sound like a great idea.

 In that context, do you think it's acceptable for KVM to be the
 driver of the assigned devices? OR should we simply add the AER
 logic into existing pci-stub and relay the information to user-space
 through eventfd...

I'm reluctant to add logic to pci-stub, but VFIO should be able to
handle this directly.

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Michael S. Tsirkin
On Tue, Oct 26, 2010 at 03:15:58PM -0700, Chris Wright wrote:
  I think we need to register with PCI and provide
  'pci_error_handlers' callbacks if we want to receive AER
  notifications.
 
 Right, and adding more to the existing KVM code which we are hoping to
 push to legacy support mode doesn't sound like a great idea.

If it's a small patch, I think Avi mentioned that he might consider
features too.

-- 
MST


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau


On Tue, 26 Oct 2010, Chris Wright wrote:


Right, and adding more to the existing KVM code which we are hoping to
push to legacy support mode doesn't sound like a great idea.


I would totally agree with you if the alternative to this legacy mode 
were going to be available in a relatively short time frame. I'm not 
sure that it will be.





In that context, do you think it's acceptable for KVM to be the
driver of the assigned devices? OR should we simply add the AER
logic into existing pci-stub and relay the information to user-space
through eventfd...


I'm reluctant to add logic to pci-stub, but VFIO should be able to
handle this directly.


I agree that VFIO should be able to do the job.

thanks,
-Etienne




Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Chris Wright
* Etienne Martineau (etmartin...@gmail.com) wrote:
 On Tue, 26 Oct 2010, Chris Wright wrote:
 Right, and adding more to the existing KVM code which we are hoping to
 push to legacy support mode doesn't sound like a great idea.
 
 I would totally agree with you if the alternative implementation to
 this legacy mode would be available in a relatively short time
 frame. I'm not sure about that?

Depends on how quickly you can help whip it into shape ;)

That's why I asked about how you were implementing, for example, the AER
extended capability exposure.  Capabilities are a problem for the current
code (forget about extended capabilities, just regular capabilities).

 In that context, do you think it's acceptable for KVM to be the
 driver of the assigned devices? OR should we simply add the AER
 logic into existing pci-stub and relay the information to user-space
 through eventfd...
 
 I'm reluctant to add logic to pci-stub, but VFIO should be able to
 handle this directly.

 I agree that VFIO should be able to do the job.

It would be great to see some effort on this.

thanks,
-chris


Re: KVM devices assignment; PCIe AER?

2010-10-26 Thread Etienne Martineau



On Tue, 26 Oct 2010, Chris Wright wrote:


I would totally agree with you if the alternative implementation to
this legacy mode would be available in a relatively short time
frame. I'm not sure about that?


Depends on how quickly you can help whip it into shape ;)


Hmm, this is not an easy question.
IMHO the biggest problem with going to VFIO for device assignment is 
interrupt handling. VFIO knows how to signal an eventfd but doesn't 
know about 'kvm_set_irq', for example. I think that VFIO will require 
some adaptation on that front.
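
To make the gap concrete, here is a rough user-space sketch of the "bounce"
that is needed when the producer can only signal an eventfd: something has
to drain the eventfd and then raise the guest interrupt itself. Everything
named here (struct irq_route, bounce_irq(), the inject callback, GSI 11) is
hypothetical and for illustration only; a real consumer would call into KVM
rather than a counting stub.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical hook standing in for the real guest interrupt injection. */
typedef void (*inject_irq_fn)(void *opaque, int gsi);

struct irq_route {
    int efd;               /* eventfd signalled by the device side */
    int gsi;               /* guest interrupt to raise (assumed value) */
    inject_irq_fn inject;  /* injection callback */
    void *opaque;          /* passed through to the callback */
};

/*
 * Drain the eventfd and inject once if anything was pending.
 * Returns the number of coalesced signals, 0 if none, -1 on error.
 */
static int64_t bounce_irq(const struct irq_route *r)
{
    uint64_t count = 0;
    ssize_t n = read(r->efd, &count, sizeof(count));

    if (n < 0)
        return (errno == EAGAIN) ? 0 : -1;
    if (n != (ssize_t)sizeof(count))
        return -1;

    r->inject(r->opaque, r->gsi);
    return (int64_t)count;
}

/* Stub injector for the demo: just counts how often it was called. */
static void count_injection(void *opaque, int gsi)
{
    (void)gsi;
    ++*(int *)opaque;
}

/* Demo: three signals arrive before the consumer runs. */
static int demo_bounce(void)
{
    const uint64_t one = 1;
    int injected = 0;
    int64_t first, second;
    struct irq_route r = {
        .efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC),
        .gsi = 11,
        .inject = count_injection,
        .opaque = &injected,
    };

    if (r.efd < 0)
        return -1;

    for (int i = 0; i < 3; i++) {
        if (write(r.efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(r.efd);
            return -1;
        }
    }

    first = bounce_irq(&r);   /* drains all three signals at once */
    second = bounce_irq(&r);  /* nothing left, read fails with EAGAIN */
    close(r.efd);

    return (first == 3 && second == 0) ? injected : -1;
}
```

demo_bounce() returns 1: the three signals are coalesced into a single
injection. The extra wakeup and copy through user space is the cost of this
approach, which is why consuming the eventfd directly in the kernel (as irqfd
does) is attractive.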


Because qemu device assignment works on top of VFIO, I'm assuming 
that the kvm iommu code can be entirely deprecated. Maybe I'm 
totally wrong here?



That's why I asked about how you were implementing, for example, the AER
extended capability exposure.  Capabilities are a problem for the current
code (forget about extended capabilities, just regular capabilities).


We depend on the Q35 PCIe chipset for that. Does this answer your 
question?


thanks,
Etienne
