Re: KVM devices assignment; PCIe AER?
On Wed, 27 Oct 2010, Alex Williamson wrote:

> KVM already has an internal IRQ ACK notifier (which is what current
> device assignment uses to do the same thing), it's just a matter of
> adding a callback that does a kvm_register_irq_ack_notifier that sends
> off the eventfd signal. I've got this working and will probably send
> out the KVM patch this week. For now the eventfd goes to userspace, but
> this is where I imagine we could steal some of the irqfd code to make
> VFIO consume the irqfd signal directly. Thanks,

Thanks for the clarification. I must admit I was somewhat confused about
the irqfd mechanism until I realized that all it does is consume an
eventfd from kernel context (like you pointed out earlier...).

So from userspace, I guess this means that the same eventfd is going to
be assigned to both VFIO and KVM, right?

Going back to the original discussion, I think that device assignment
over VFIO is a great way to support PCIe AER for the assigned devices.
I'm going to spend some time in that direction for sure.

In the meantime, I'll send some patches (shortly) that address the
problem without any major surgery to the current implementation.

thanks,
-Etienne

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM devices assignment; PCIe AER?
On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
> On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> > BTW, how do we handle sharing the interrupt in guest?
>
> I'm currently using flags to track whether we've asserted the interrupt
> in qemu, and only act on the eoi when the flag is set. In my current
> setup, the guest puts the pass through device and USB on the same
> interrupt and using this filtering seems to be sufficient. I think this
> should act just like bare metal, the device will reassert the interrupt
> if it still needs service, but we can avoid obviously gratuitous eois
> being passed down to vfio.
>
> This will complicate having vfio intercept the eoi eventfd directly
> since it will then need to track the state too. Another thing I've got
> working is letting vfio support older non-PCI-2.3 compliant devices so
> long as they can claim an exclusive interrupt (just like current code).
> We need to track whether the irq is enabled or disabled for this anyway
> so that we don't get unbalanced enable/disables.
>
> Alex

Tracking state is also good for saving an extra config read on each
access.

--
MST
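One way to read the remark about saving a config read is sketched below. It is
only an illustration, not code from vfio or qemu: the ConfigSpace and
CachedIntxMask classes are hypothetical, and the sketch assumes that the
INTx disable bit is bit 10 (0x400) of the PCI command register and that
nothing else modifies that register behind our back.

```python
PCI_COMMAND_INTX_DISABLE = 0x400  # assumed: bit 10 of the PCI command register


class ConfigSpace:
    """Hypothetical stand-in for a device's config space; counts accesses."""

    def __init__(self):
        self.command = 0
        self.reads = 0
        self.writes = 0

    def read_command(self):
        self.reads += 1
        return self.command

    def write_command(self, value):
        self.writes += 1
        self.command = value & 0xFFFF


class CachedIntxMask:
    """Keeps a shadow copy of the command register so that masking and
    unmasking INTx needs no read-modify-write against the device."""

    def __init__(self, cfg):
        self.cfg = cfg
        self.shadow = cfg.read_command()  # single read at setup time

    def set_masked(self, masked):
        if masked:
            new = self.shadow | PCI_COMMAND_INTX_DISABLE
        else:
            new = self.shadow & ~PCI_COMMAND_INTX_DISABLE
        if new != self.shadow:  # skip redundant writes as well
            self.cfg.write_command(new)
            self.shadow = new
```

With the shadow copy, each mask or unmask costs at most one config write and
no config read; repeated requests for the state already in effect cost
nothing.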
Re: KVM devices assignment; PCIe AER?
On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > KVM already has an internal IRQ ACK notifier (which is what current
> > device assignment uses to do the same thing), it's just a matter of
> > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > off the eventfd signal. I've got this working and will probably send
> > out the KVM patch this week. For now the eventfd goes to userspace, but
> > this is where I imagine we could steal some of the irqfd code to make
> > VFIO consume the irqfd signal directly. Thanks,
> >
> > Alex
>
> BTW, how do we handle sharing the interrupt in guest?

I'm currently using flags to track whether we've asserted the interrupt
in qemu, and only act on the eoi when the flag is set. In my current
setup, the guest puts the pass through device and USB on the same
interrupt and using this filtering seems to be sufficient. I think this
should act just like bare metal, the device will reassert the interrupt
if it still needs service, but we can avoid obviously gratuitous eois
being passed down to vfio.

This will complicate having vfio intercept the eoi eventfd directly
since it will then need to track the state too. Another thing I've got
working is letting vfio support older non-PCI-2.3 compliant devices so
long as they can claim an exclusive interrupt (just like current code).
We need to track whether the irq is enabled or disabled for this anyway
so that we don't get unbalanced enable/disables.

Alex
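The flag-based filtering and the enable/disable balancing described in this
message might look roughly like the sketch below. It is a toy model under
stated assumptions, not the actual qemu code: the class and method names are
hypothetical, and the "irq" is just a boolean rather than a real host
interrupt.

```python
class AssignedIntx:
    """Hypothetical per-device state for a guest interrupt line that is
    shared with other devices (for example USB)."""

    def __init__(self):
        self.asserted = False     # we raised the line in the guest
        self.irq_enabled = True   # host-side interrupt currently enabled
        self.eois_forwarded = 0   # EOIs actually passed down to vfio

    def on_device_interrupt(self):
        # Disable only if currently enabled, so disables never stack up.
        if not self.irq_enabled:
            return
        self.irq_enabled = False
        self.asserted = True

    def on_guest_eoi(self):
        # EOIs caused by other devices on the shared line are ignored.
        if not self.asserted:
            return False
        self.asserted = False
        if not self.irq_enabled:  # re-enable exactly once per disable
            self.irq_enabled = True
        self.eois_forwarded += 1
        return True
```

An EOI that arrives while the flag is clear is treated as belonging to another
device on the line and is dropped. If the assigned device still needs service
after a forwarded EOI, it simply reasserts, as it would on bare metal.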
Re: KVM devices assignment; PCIe AER?
On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > In the case of KVM, where are you going to catch the EOI? For some
> > reason I'm under the impression that this is part of KVM. If so then how are
> > you going to 'signal' to VFIO? Cannot use eventfd here right?
>
> KVM already has an internal IRQ ACK notifier (which is what current
> device assignment uses to do the same thing), it's just a matter of
> adding a callback that does a kvm_register_irq_ack_notifier that sends
> off the eventfd signal. I've got this working and will probably send
> out the KVM patch this week. For now the eventfd goes to userspace, but
> this is where I imagine we could steal some of the irqfd code to make
> VFIO consume the irqfd signal directly. Thanks,
>
> Alex

BTW, how do we handle sharing the interrupt in guest?

--
MST
Re: KVM devices assignment; PCIe AER?
On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> On Wed, 27 Oct 2010, Alex Williamson wrote:
> > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > irqfds are currently only used by vhost afaik, since it's being
> > interrupted externally, much like pass through devices are.
>
> Fair enough. Thanks for the clarification.
>
> > Sort of. When the VFIO device triggers an interrupt, we get notified
> > via the eventfd we've registered for that interrupt. We can then call
> > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > That much works today.
>
> Understood but performance wise this is no good for KVM right?

Right, bouncing interrupts and EOIs through qemu via eventfds is going
to add latency. On the interrupt path we already have irqfds, which
will avoid the bounce through userspace, we just need to use them.
Doing something similar with EOIs could avoid that path, giving us
something comparable to current device assignment.

> > The irqfd mechanism is simply a way for KVM to
> > directly consume the eventfd and raise an interrupt via a pre-setup
> > vector. That's yet to be implemented for INTx on VFIO, but should
> > mostly be a matter of connecting existing pieces together. It's working
> > for MSI-X.
>
> OK, I was under the impression you already had irqfd 'connected' to KVM from
> VFIO... This is why I was asking about the nature of the changes in VFIO.
>
> > When VFIO sends an interrupt, it disables the physical device from
> > generating more interrupts (this is where VFIO requires PCI 2.3
> > compliant devices for the INTx disable bit in the status register).
> > When the guest services the interrupt, we can detect this by catching
> > the EOI of the IOAPIC. At that point, we can re-enable interrupts on
> > the device. Wash, rinse, repeat.
> >
> > To do this in qemu, I created a callback on the ioapic where drivers can
> > register for the interrupt they care about. Since KVM moves the ioapic
> > into the kernel, we need to extend this into KVM and have yet another
> > eventfd mechanism. It's possible that we could have the VFIO kernel
> > module also receive this eventfd, re-enabling interrupts on the device,
> > in much the same way as above.
>
> In the case of KVM, where are you going to catch the EOI? For some
> reason I'm under the impression that this is part of KVM. If so then how are
> you going to 'signal' to VFIO? Cannot use eventfd here right?

KVM already has an internal IRQ ACK notifier (which is what current
device assignment uses to do the same thing), it's just a matter of
adding a callback that does a kvm_register_irq_ack_notifier that sends
off the eventfd signal. I've got this working and will probably send
out the KVM patch this week. For now the eventfd goes to userspace, but
this is where I imagine we could steal some of the irqfd code to make
VFIO consume the irqfd signal directly. Thanks,

Alex
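The idea of an ACK notifier whose callback signals an eventfd can be modelled
in a few lines. The sketch below is a userspace toy, not KVM code:
AckNotifiers and EventCounter are hypothetical names, the interrupt number 11
is arbitrary, and EventCounter only imitates the counter semantics of a real
eventfd in its default mode (signals accumulate; a read returns the count and
resets it).

```python
class EventCounter:
    """Imitates eventfd counter behaviour: signals add up until consumed.
    A real eventfd read would block or fail with EAGAIN at zero; this
    model simply returns 0."""

    def __init__(self):
        self.count = 0

    def signal(self):
        self.count += 1

    def consume(self):
        value, self.count = self.count, 0
        return value


class AckNotifiers:
    """Hypothetical registry of callbacks run when the guest acks (EOIs)
    a given interrupt line."""

    def __init__(self):
        self._by_irq = {}

    def register(self, irq, callback):
        self._by_irq.setdefault(irq, []).append(callback)

    def irq_acked(self, irq):
        for callback in self._by_irq.get(irq, []):
            callback()


eoi_event = EventCounter()
notifiers = AckNotifiers()
notifiers.register(11, eoi_event.signal)  # callback just signals the event
```

Whoever holds the other end of the event (userspace today, possibly the VFIO
module later, per the message above) consumes the count and re-enables the
device interrupt.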
Re: KVM devices assignment; PCIe AER?
On Wed, 27 Oct 2010, Alex Williamson wrote:

> No, emulated devices trigger interrupts directly with qemu_set_irq.
> irqfds are currently only used by vhost afaik, since it's being
> interrupted externally, much like pass through devices are.

Fair enough. Thanks for the clarification.

> Sort of. When the VFIO device triggers an interrupt, we get notified
> via the eventfd we've registered for that interrupt. We can then call
> qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> That much works today.

Understood but performance wise this is no good for KVM right?

> The irqfd mechanism is simply a way for KVM to
> directly consume the eventfd and raise an interrupt via a pre-setup
> vector. That's yet to be implemented for INTx on VFIO, but should
> mostly be a matter of connecting existing pieces together. It's working
> for MSI-X.

OK, I was under the impression you already had irqfd 'connected' to KVM
from VFIO... This is why I was asking about the nature of the changes in
VFIO.

> When VFIO sends an interrupt, it disables the physical device from
> generating more interrupts (this is where VFIO requires PCI 2.3
> compliant devices for the INTx disable bit in the status register).
> When the guest services the interrupt, we can detect this by catching
> the EOI of the IOAPIC. At that point, we can re-enable interrupts on
> the device. Wash, rinse, repeat.
>
> To do this in qemu, I created a callback on the ioapic where drivers can
> register for the interrupt they care about. Since KVM moves the ioapic
> into the kernel, we need to extend this into KVM and have yet another
> eventfd mechanism. It's possible that we could have the VFIO kernel
> module also receive this eventfd, re-enabling interrupts on the device,
> in much the same way as above.

In the case of KVM, where are you going to catch the EOI? For some
reason I'm under the impression that this is part of KVM. If so then how
are you going to 'signal' to VFIO? Cannot use eventfd here right?

> Yes, none of this requires KVM specific modifications to VFIO. VFIO is
> still just triggering eventfds, and hopefully receiving one via an
> irqfd-like mechanism for EOI.

Thanks for your reply.

-Etienne
Re: KVM devices assignment; PCIe AER?
On Wed, 2010-10-27 at 11:23 -0700, Etienne Martineau wrote:
> On Wed, 27 Oct 2010, Alex Williamson wrote:
> > I've actually been working on porting the qemu vfio driver over to
> > qemu-kvm recently, and nearly have it working. For MSI-X interrupts
> > I've ported to the common msix.c code, which abstracts how the interrupt
> > is actually sent to the guest. I've also added irqfd support in the
> > msix mask notifier so MSI-X interrupts avoid bouncing through qemu. MSI
> > support should work the same once Michael's msi.c is upstream.
> >
> > For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
> > all of the emulated drivers would break). The problem is getting an EOI
> > back from the KVM kernel apic. I'm currently working on code that adds
> > a new KVM ioctl to register an eventfd for the EOI, which then triggers
> > qemu-kvm to re-enable the interrupt. My hope is that we can add irqfd
> > support to both of these paths, so INTx is injected directly from VFIO
> > into KVM, and VFIO can directly consume the KVM EOI.
>
> OK let me try to understand what you've done (please correct me if I'm
> wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.

No, emulated devices trigger interrupts directly with qemu_set_irq.
irqfds are currently only used by vhost afaik, since it's being
interrupted externally, much like pass through devices are.

> Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever the
> assigned devices receive an IRQ it passes it directly to kvm without
> bouncing to userspace?

Sort of. When the VFIO device triggers an interrupt, we get notified
via the eventfd we've registered for that interrupt. We can then call
qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
That much works today. The irqfd mechanism is simply a way for KVM to
directly consume the eventfd and raise an interrupt via a pre-setup
vector. That's yet to be implemented for INTx on VFIO, but should
mostly be a matter of connecting existing pieces together. It's working
for MSI-X.

> I'm not sure I understand the part where VFIO signals back the EOI to KVM?

When VFIO sends an interrupt, it disables the physical device from
generating more interrupts (this is where VFIO requires PCI 2.3
compliant devices for the INTx disable bit in the status register).
When the guest services the interrupt, we can detect this by catching
the EOI of the IOAPIC. At that point, we can re-enable interrupts on
the device. Wash, rinse, repeat.

To do this in qemu, I created a callback on the ioapic where drivers can
register for the interrupt they care about. Since KVM moves the ioapic
into the kernel, we need to extend this into KVM and have yet another
eventfd mechanism. It's possible that we could have the VFIO kernel
module also receive this eventfd, re-enabling interrupts on the device,
in much the same way as above. I haven't tried this yet, but it should
just be a matter of creating another VFIO ioctl and stealing code from
the KVM irqfd setup.

> Also, with your change do you think that VFIO can be kept generic?
> Reason I'm asking is because we are planning to use VFIO for some userspace
> drivers...

Yes, none of this requires KVM specific modifications to VFIO. VFIO is
still just triggering eventfds, and hopefully receiving one via an
irqfd-like mechanism for EOI.

> >> Because qemu device assignment is working on VFIO I'm making the
> >> assumption that kvm iommu code can be entirely deprecated. Maybe I'm
> >> totally wrong here?
> >
> > Yes, VFIO makes no use of it.
>
> Yes I'm wrong?

VFIO does not make any use of the KVM iommu code, or any of the KVM
device assignment ioctls for that matter. Thanks,

Alex
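The disable-on-interrupt, re-enable-on-EOI cycle described in this message can
be walked through with a small model. This is a sketch under assumptions, not
VFIO's implementation: IntxDevice and IntxForwarder are hypothetical, and a
level-triggered INTx line is reduced to two booleans.

```python
class IntxDevice:
    """Toy level-triggered INTx source."""

    def __init__(self):
        self.needs_service = False  # device has unserviced work
        self.intx_disabled = False  # models the INTx disable bit

    def line_asserted(self):
        return self.needs_service and not self.intx_disabled


class IntxForwarder:
    """Host side: mask the device on interrupt, unmask on guest EOI."""

    def __init__(self, device):
        self.device = device
        self.injected = 0  # interrupts forwarded towards the guest

    def host_irq(self):
        if self.device.line_asserted():
            self.device.intx_disabled = True  # stop further interrupts
            self.injected += 1                # stands in for the eventfd signal

    def guest_eoi(self):
        # Guest finished servicing: unmask. If the device still needs
        # service the line is asserted again and the cycle repeats.
        self.device.intx_disabled = False
        self.host_irq()


dev = IntxDevice()
fwd = IntxForwarder(dev)
dev.needs_service = True
fwd.host_irq()             # forwarded once, device now masked
dev.needs_service = False  # guest driver services the device
fwd.guest_eoi()            # unmasked, nothing pending, no new interrupt
```

Masking keeps a level-triggered line from storming the host while the guest is
still handling the interrupt, and the EOI is the only point at which the host
learns that unmasking is safe.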
Re: KVM devices assignment; PCIe AER?
On Wed, 27 Oct 2010, Alex Williamson wrote:

> I've actually been working on porting the qemu vfio driver over to
> qemu-kvm recently, and nearly have it working. For MSI-X interrupts
> I've ported to the common msix.c code, which abstracts how the interrupt
> is actually sent to the guest. I've also added irqfd support in the
> msix mask notifier so MSI-X interrupts avoid bouncing through qemu. MSI
> support should work the same once Michael's msi.c is upstream.
>
> For INTx interrupts, qemu_set_irq will also work with KVM (it has to or
> all of the emulated drivers would break). The problem is getting an EOI
> back from the KVM kernel apic. I'm currently working on code that adds
> a new KVM ioctl to register an eventfd for the EOI, which then triggers
> qemu-kvm to re-enable the interrupt. My hope is that we can add irqfd
> support to both of these paths, so INTx is injected directly from VFIO
> into KVM, and VFIO can directly consume the KVM EOI.

OK let me try to understand what you've done (please correct me if I'm
wrong). Emulated devices rely on 'kvm_irqfd' for interrupt delivery.
Somehow you've modified VFIO to understand 'kvm_irqfd' so that whenever
the assigned devices receive an IRQ it passes it directly to kvm without
bouncing to userspace?

I'm not sure I understand the part where VFIO signals back the EOI to
KVM?

Also, with your change do you think that VFIO can be kept generic?
Reason I'm asking is because we are planning to use VFIO for some
userspace drivers...

>> Because qemu device assignment is working on VFIO I'm making the
>> assumption that kvm iommu code can be entirely deprecated. Maybe I'm
>> totally wrong here?
>
> Yes, VFIO makes no use of it.

Yes I'm wrong?

Thanks,
-Etienne
Re: KVM devices assignment; PCIe AER?
On Tue, 26 Oct 2010, Chris Wright wrote:

>> I would totally agree with you if the alternative implementation to
>> this legacy mode would be available in a relatively short time
>> frame. I'm not sure about that?
>
> Depends on how quickly you can help whip it into shape ;)

Hmm, this is not an easy question; IMHO the biggest problem of going
with VFIO for device assignment is interrupt handling. VFIO knows how
to signal an eventfd but doesn't know about 'kvm_set_irq' for example.
I think that VFIO will require some adaptation for that matter.

Because qemu device assignment is working on VFIO I'm making the
assumption that kvm iommu code can be entirely deprecated. Maybe I'm
totally wrong here?

> That's why I asked about how you were implementing, for example, the AER
> extended capability exposure. Capabilities are a problem for the
> current code (forget about extended capabilities, just regular
> capabilities).

We depend on the Q35 PCIe chipset for that matter. Does this answer your
question?

thanks,
Etienne
Re: KVM devices assignment; PCIe AER?
* Etienne Martineau (etmartin...@gmail.com) wrote:
> On Tue, 26 Oct 2010, Chris Wright wrote:
> >Right, and adding more to the existing KVM code which we are hoping to
> >push to legacy support mode doesn't sound like a great idea.
>
> I would totally agree with you if the alternative implementation to
> this legacy mode would be available in a relatively short time
> frame. I'm not sure about that?

Depends on how quickly you can help whip it into shape ;)

That's why I asked about how you were implementing, for example, the AER
extended capability exposure. Capabilities are a problem for the
current code (forget about extended capabilities, just regular
capabilities).

> >>In that context, do you think it's acceptable for KVM to be the
> >>driver of the assigned devices? OR should we simply add the AER
> >>logic into existing pci-stub and relay the information to user-space
> >>through eventfd...
> >
> >I'm reluctant to add logic to pci-stub, but VFIO should be able to
> >handle this directly.
>
> I agree that VFIO should be able to do the job.

It would be great to see some effort on this.

thanks,
-chris
Re: KVM devices assignment; PCIe AER?
On Tue, 26 Oct 2010, Chris Wright wrote:

> Right, and adding more to the existing KVM code which we are hoping to
> push to legacy support mode doesn't sound like a great idea.

I would totally agree with you if the alternative implementation to
this legacy mode would be available in a relatively short time
frame. I'm not sure about that?

>> In that context, do you think it's acceptable for KVM to be the
>> driver of the assigned devices? OR should we simply add the AER
>> logic into existing pci-stub and relay the information to user-space
>> through eventfd...
>
> I'm reluctant to add logic to pci-stub, but VFIO should be able to
> handle this directly.

I agree that VFIO should be able to do the job.

thanks,
-Etienne
Re: KVM devices assignment; PCIe AER?
On Tue, Oct 26, 2010 at 03:15:58PM -0700, Chris Wright wrote:
> > I think we need to register with PCI and provide
> > 'pci_error_handlers' callback if we want to receive AER
> > notification.
>
> Right, and adding more to the existing KVM code which we are hoping to
> push to legacy support mode doesn't sound like a great idea.

If it's a small patch, I think Avi mentioned that he might consider
features too.

--
MST
Re: KVM devices assignment; PCIe AER?
* Etienne Martineau (etmartin...@gmail.com) wrote:
> On Tue, 26 Oct 2010, Chris Wright wrote:
>
> >>***One aspect I'm not clear on is the strategy for
> >>device-assignment under KVM?
> >>A) Move to VFIO; [/dev/iommu, /dev/vfio]
> >
> >Long term, hopefully VFIO
> >
> >>B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
> >
> >Short term (i.e. current qemu-kvm tree...this is what we have now and
> >will continue to until plan A) gets more mature).
>
> Strictly speaking, I don't really agree with 'B' being the current
> implementation. Correct me if I'm wrong but for assigned devices,
> kvm does a look up for the device and eventually obtains a handle to
> it (struct pci_dev*) without doing a proper 'pci_register_driver'.

OK, not a PCI driver per se (that's pci-stub), but KVM owns the process
of registering the interrupt handler, programming the iommu, etc.

> I think we need to register with PCI and provide
> 'pci_error_handlers' callback if we want to receive AER
> notification.

Right, and adding more to the existing KVM code which we are hoping to
push to legacy support mode doesn't sound like a great idea.

> In that context, do you think it's acceptable for KVM to be the
> driver of the assigned devices? OR should we simply add the AER
> logic into existing pci-stub and relay the information to user-space
> through eventfd...

I'm reluctant to add logic to pci-stub, but VFIO should be able to
handle this directly.

thanks,
-chris
Re: KVM devices assignment; PCIe AER?
On Tue, 26 Oct 2010, Chris Wright wrote:

>> ***One aspect I'm not clear on is the strategy for
>> device-assignment under KVM?
>> A) Move to VFIO; [/dev/iommu, /dev/vfio]
>
> Long term, hopefully VFIO
>
>> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
>
> Short term (i.e. current qemu-kvm tree...this is what we have now and
> will continue to until plan A) gets more mature).

Strictly speaking, I don't really agree with 'B' being the current
implementation. Correct me if I'm wrong but for assigned devices,
kvm does a look up for the device and eventually obtains a handle to
it (struct pci_dev*) without doing a proper 'pci_register_driver'.

I think we need to register with PCI and provide
'pci_error_handlers' callback if we want to receive AER
notification.

In that context, do you think it's acceptable for KVM to be the
driver of the assigned devices? OR should we simply add the AER
logic into existing pci-stub and relay the information to user-space
through eventfd...

Thanks,
Etienne
Re: KVM devices assignment; PCIe AER?
On Tue, Oct 26, 2010 at 01:24:00PM -0700, Etienne Martineau wrote:
> ***One aspect I'm not clear on is the strategy for
> device-assignment under KVM?
> A) Move to VFIO; [/dev/iommu, /dev/vfio]
> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
> C) No plan for short term
>
> Alex, Chris, Michael thanks for your reply.
> -Etienne

We will have to keep supporting B for a long while.

--
MST
Re: KVM devices assignment; PCIe AER?
* Etienne Martineau (etmartin...@gmail.com) wrote:
> ***Background context:
> We are using kvm on an x86 based platform. For performance reasons, we
> use device assignment. The type of devices we have to deal with is
> mostly custom ASICs but we also have standard stuff (ex 82599EB
> 10-Gigabit SR-IOV).
>
> AER handling in guest VM is very important to us. Most of our
> drivers assume timely notification in cases of PCIe AER. Failure to
> provide such support will result in very undesirable behavior at
> the other end :)
>
> In our case, AER is not strictly a function of the hardware 'quality'
> but is also tied to the way the user can interact with the system (by
> pulling a card on the fly for example).
>
> ***Q35, VFIO:
> I'm aware of the PCIe Q35 chipset work that Isaku is working on. I
> have a colleague who is actively working on merging the change into
> our qemu-kvm branch.
>
> I'm also familiar with VFIO. I've seen the work from Alex on the
> qemu VFIO front. Not too sure how this is going to work on qemu-kvm?
> Or maybe it's because I'm confused about the long term plan for the
> user space [qemu vs qemu-kvm] branch?

The Plan (TM) is for qemu-kvm to be just a staging branch for qemu.
There are a few bits left in qemu-kvm (I'm sure you're aware since
you're working with them ;) that aren't yet upstream. The device
assignment code is one of those bits. Ideally, we could get KVM out
of the business of being a device driver for assigned host devices and
move all of that functionality to VFIO. And when looking at adding
significant new functionality like AER, scoping VFIO would be very
useful.

> ***One aspect I'm not clear on is the strategy for
> device-assignment under KVM?
> A) Move to VFIO; [/dev/iommu, /dev/vfio]

Long term, hopefully VFIO

> B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]

Short term (i.e. current qemu-kvm tree...this is what we have now and
will continue to until plan A) gets more mature).
thanks,
-chris
Re: KVM devices assignment; PCIe AER?
***Background context:
We are using kvm on an x86 based platform. For performance reasons, we
use device assignment. The type of devices we have to deal with is
mostly custom ASICs but we also have standard stuff (ex 82599EB
10-Gigabit SR-IOV).

AER handling in guest VM is very important to us. Most of our
drivers assume timely notification in cases of PCIe AER. Failure to
provide such support will result in very undesirable behavior at
the other end :)

In our case, AER is not strictly a function of the hardware 'quality'
but is also tied to the way the user can interact with the system (by
pulling a card on the fly for example).

***Q35, VFIO:
I'm aware of the PCIe Q35 chipset work that Isaku is working on. I
have a colleague who is actively working on merging the change into
our qemu-kvm branch.

I'm also familiar with VFIO. I've seen the work from Alex on the
qemu VFIO front. Not too sure how this is going to work on qemu-kvm?
Or maybe it's because I'm confused about the long term plan for the
user space [qemu vs qemu-kvm] branch?

***One aspect I'm not clear on is the strategy for
device-assignment under KVM?
A) Move to VFIO; [/dev/iommu, /dev/vfio]
B) KVM as a driver for the assigned devices; [sysfs/ ioctls..]
C) No plan for short term

Alex, Chris, Michael thanks for your reply.
-Etienne

On Tue, 26 Oct 2010, Michael S. Tsirkin wrote:
> On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
> > Chris, Michael et al.
> >
> > As part of the project I'm working on, we are looking at extending
> > the device assignment capabilities to provide support for PCIe AER.
> > Ideally, the host would register for AER (for every assigned
> > device) and pass the events up to Qemu.
> >
> > As of now, one of the problems is that KVM is not a driver for the
> > assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> > haven't seen any patches or discussion on that topic...
> >
> > On another front, I've seen the work from Michael around
> > 'uio_pci_generic' and some of his comments:
> > " It's expected that more features of interest to virtualization
> > will be added to this driver in the future. Possibilities are: mmap
> > for device resources, MSI/MSI-X, eventfd (to interface with kvm),
> > iommu."
> >
> > I think that extending 'uio_pci_generic' to support AER is
> > relatively straightforward (assuming eventfd support from UIO).
> >
> > Any thoughts?
> >
> > thanks,
> > -Etienne
>
> On the qemu side, patches for aer support are being worked on, pls take
> a look, likely there's some synergy there.
>
> On the kernel side, AER for uio sounds fine, but the features
> listed above are not yet supported, so you'd have to code them up too.
> There's a uiommu driver from Tom Lyon that does the iommu bit.
>
> Alex and Tom have been working on a VFIO driver which is not based on
> the uio framework. You can try joining forces.
Re: KVM devices assignment; PCIe AER?
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
> Chris, Michael et al.
>
> As part of the project I'm working on, we are looking at extending
> the device assignment capabilities to provide support for PCIe AER.

BTW, what's the motivation to support this feature?

--
MST
Re: KVM devices assignment; PCIe AER?
On Tue, Oct 26, 2010 at 09:41:12AM -0700, etmartin101 wrote:
> Chris, Michael et al.
>
> As part of the project I'm working on, we are looking at extending
> the device assignment capabilities to provide support for PCIe AER.
> Ideally, the host would register for AER (for every assigned
> device) and pass the events up to Qemu.
>
> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
>
> On another front, I've seen the work from Michael around
> 'uio_pci_generic' and some of his comments:
> " It's expected that more features of interest to virtualization will
> be added to this driver in the future. Possibilities are: mmap for
> device resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
>
> I think that extending 'uio_pci_generic' to support AER is
> relatively straightforward (assuming eventfd support from UIO).
>
> Any thoughts?
>
> thanks,
> -Etienne

On the qemu side, patches for aer support are being worked on, pls take
a look, likely there's some synergy there.

On the kernel side, AER for uio sounds fine, but the features
listed above are not yet supported, so you'd have to code them up too.
There's a uiommu driver from Tom Lyon that does the iommu bit.

Alex and Tom have been working on a VFIO driver which is not based on
the uio framework. You can try joining forces.
Re: KVM devices assignment; PCIe AER?
* etmartin101 (etmartin...@gmail.com) wrote:
> As part of the project I'm working on, we are looking at extending
> the device assignment capabilities to provide support for PCIe AER.

How did you plan to do this? Right now we only provide PCI capabilities
(not Extended Capabilities).

> Ideally, the host would register for AER (for every assigned
> device) and pass the events up to Qemu.

This should be so for devices iff they support AER to begin with.
Having proper chipset support in QEMU (PCIe + AER) is also a step.

> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
>
> On another front, I've seen the work from Michael around
> 'uio_pci_generic' and some of his comments:
> " It's expected that more features of interest to virtualization will
> be added to this driver in the future. Possibilities are: mmap for
> device resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
>
> I think that extending 'uio_pci_generic' to support AER is
> relatively straightforward (assuming eventfd support from UIO).

Did you look at VFIO?

thanks,
-chris
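The distinction drawn above between PCI capabilities and Extended Capabilities matters because AER is an extended capability: it lives in the PCIe extended configuration space (offsets 0x100 to 0xFFF) and has capability ID 0x0001. As a rough illustration, the sketch below walks the extended capability list in a raw copy of config space. It is a userspace model under stated assumptions, not code from qemu-kvm or VFIO: the function names and the idea of operating on a byte buffer are hypothetical, while the header layout (ID in bits 15:0, version in bits 19:16, next offset in bits 31:20) follows the PCIe specification.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_CFG_SPACE_EXP_SIZE 4096    /* PCIe extended config space size */
#define PCI_EXT_CAP_START      0x100   /* first extended capability header */
#define PCI_EXT_CAP_ID_AER     0x0001  /* Advanced Error Reporting */

/* Config space is little-endian; read one dword from the byte buffer. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned off)
{
    return (uint32_t)cfg[off] |
           (uint32_t)cfg[off + 1] << 8 |
           (uint32_t)cfg[off + 2] << 16 |
           (uint32_t)cfg[off + 3] << 24;
}

/* Hypothetical helper: return the offset of extended capability cap_id
 * in the config-space copy cfg of length len, or 0 if it is absent. */
unsigned find_ext_cap(const uint8_t *cfg, size_t len, uint16_t cap_id)
{
    unsigned off = PCI_EXT_CAP_START;
    /* Bound the walk so a malformed (looping) list cannot hang us. */
    int ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_EXT_CAP_START) / 8;

    /* A 256-byte legacy config space has no extended capabilities. */
    if (len < PCI_CFG_SPACE_EXP_SIZE)
        return 0;

    while (ttl-- > 0 && off >= PCI_EXT_CAP_START &&
           off <= PCI_CFG_SPACE_EXP_SIZE - 4) {
        uint32_t hdr = cfg_read32(cfg, off);

        if (hdr == 0 || hdr == 0xffffffff)
            break;                      /* empty list or no device */
        if ((hdr & 0xffff) == cap_id)
            return off;
        off = (hdr >> 20) & 0xffc;      /* next pointer, dword aligned */
    }
    return 0;
}
```

Calling find_ext_cap(cfg, sizeof(cfg), PCI_EXT_CAP_ID_AER) on a buffer holding the device's 4096-byte config space would then say whether the device advertises AER at all, which is the "iff they support AER to begin with" precondition mentioned in the reply.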
Re: KVM devices assignment; PCIe AER?
On Tue, Oct 26, 2010 at 10:41 AM, etmartin101 wrote:
> Chris, Michael et al.
>
> As part of the project I'm working on, we are looking at extending
> the device assignment capabilities to provide support for PCIe AER.
> Ideally, the host would register for AER (for every assigned device)
> and pass the events up to Qemu.
>
> As of now, one of the problems is that KVM is not a driver for the
> assigned devices. I've seen Chris's slides from KVM conf 2010 but I
> haven't seen any patches or discussion on that topic...
>
> On another front, I've seen the work from Michael around
> 'uio_pci_generic' and some of his comments:
> " It's expected that more features of interest to virtualization will
> be added to this driver in the future. Possibilities are: mmap for
> device resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."
>
> I think that extending 'uio_pci_generic' to support AER is relatively
> straightforward (assuming eventfd support from UIO).

A lot of work has been going into a new "VFIO" driver that has taken
over in the virtualization direction that uio_pci_generic was starting
to go. Search the mailing list for it. That's probably where we'd
want to focus AER efforts.

I posted a qemu VFIO driver RFC a few months ago and I'm still actively
working on it. We currently pass the device capabilities to the guest
via an emulation layer in the VFIO kernel driver. This layer might be
going away as we try to simplify the kernel side, which would mean more
capability work in qemu. We also need the update to the Q35 chipset
that Isaku is working on to be able to have an express root port to
signal AER events.

Alex
KVM devices assignment; PCIe AER?
Chris, Michael et al.

As part of the project I'm working on, we are looking at extending
the device assignment capabilities to provide support for PCIe AER.
Ideally, the host would register for AER (for every assigned
device) and pass the events up to Qemu.

As of now, one of the problems is that KVM is not a driver for the
assigned devices. I've seen Chris's slides from KVM conf 2010 but I
haven't seen any patches or discussion on that topic...

On another front, I've seen the work from Michael around
'uio_pci_generic' and some of his comments:
" It's expected that more features of interest to virtualization will
be added to this driver in the future. Possibilities are: mmap for
device resources, MSI/MSI-X, eventfd (to interface with kvm), iommu."

I think that extending 'uio_pci_generic' to support AER is
relatively straightforward (assuming eventfd support from UIO).

Any thoughts?

thanks,
-Etienne