Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Michael S. Tsirkin
On Tue, Oct 16, 2012 at 10:49:38AM -0600, Alex Williamson wrote:
> On Tue, 2012-10-16 at 17:23 +0200, Michael S. Tsirkin wrote:
> > On Tue, Oct 16, 2012 at 09:13:15AM -0600, Alex Williamson wrote:
> > > > There's no chance we ship e.g. q35 by mistake without this API, since
> > > > there is no way this specific assert can be missed in even basic
> > > > testing.
> > > > 
> > > > So I see it differently:
> > > > 
> > > > As coded here:
> > > > chipset authors get lazy and do not implement API.
> > > > bad performance for all users.
> > > > 
> > > > With assert:
> > > > chipset authors implement necessary API.
> > > > good performance for all users.
> > > 
> > > I prefer a carrot, not a whip.  Thanks,
> > > 
> > > Alex
> > > 
> > 
> > It's not just that.
> > Problem is performance testing/fixing is hard.
> 
> Getting an error_report from the driver saying it's using a slow path
> and why makes that significantly easier.
> 
> > Catching and fixing asserts is easy.
> 
> Easy for who?  The user trying to test a feature?  Probably not.  Me,
> who may not have access to the chipset documentation or understand the
> platform?  Maybe, maybe not.
> 
> > So working around buggy qemu code really backfires
> > as it reverses the motivation for writing well-performing
> > code. History proves me right: for each API change where
> > we implemented a fallback, old code stayed around for years.
> 
> Does that necessarily mean it was wrong?  How many of those API changes
> added new features that may have been abandoned if the developer was
> required to make sweeping changes to get their code accepted?  If not
> abandoned, how much delayed?  How many land mines might we have in the
> code for changes that were done incorrectly or missed?  I don't
> understand why adding robustness to the API is such a contentious point,
> but it's your prerogative, just as it's mine to avoid using that API
> arbitrarily.  Thanks,
> 
> Alex

Yeah. All I'm saying is, I intend to fix things so you don't need
to probe for this API. I don't think it makes sense to
add a temporary API as a stopgap for this since it would solve no
actual problem. If e.g. Jason finds it hard to add this to
q35, we could add a stopgap solution for vfio.

-- 
MST


Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Alex Williamson
On Tue, 2012-10-16 at 17:23 +0200, Michael S. Tsirkin wrote:
> On Tue, Oct 16, 2012 at 09:13:15AM -0600, Alex Williamson wrote:
> > > There's no chance we ship e.g. q35 by mistake without this API, since
> > > there is no way this specific assert can be missed in even basic
> > > testing.
> > > 
> > > So I see it differently:
> > > 
> > > As coded here:
> > >   chipset authors get lazy and do not implement API.
> > >   bad performance for all users.
> > > 
> > > With assert:
> > >   chipset authors implement necessary API.
> > >   good performance for all users.
> > 
> > I prefer a carrot, not a whip.  Thanks,
> > 
> > Alex
> > 
> 
> It's not just that.
> Problem is performance testing/fixing is hard.

Getting an error_report from the driver saying it's using a slow path
and why makes that significantly easier.

> Catching and fixing asserts is easy.

Easy for who?  The user trying to test a feature?  Probably not.  Me,
who may not have access to the chipset documentation or understand the
platform?  Maybe, maybe not.

> So working around buggy qemu code really backfires
> as it reverses the motivation for writing well-performing
> code. History proves me right: for each API change where
> we implemented a fallback, old code stayed around for years.

Does that necessarily mean it was wrong?  How many of those API changes
added new features that may have been abandoned if the developer was
required to make sweeping changes to get their code accepted?  If not
abandoned, how much delayed?  How many land mines might we have in the
code for changes that were done incorrectly or missed?  I don't
understand why adding robustness to the API is such a contentious point,
but it's your prerogative, just as it's mine to avoid using that API
arbitrarily.  Thanks,

Alex



Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Michael S. Tsirkin
On Tue, Oct 16, 2012 at 09:13:15AM -0600, Alex Williamson wrote:
> > There's no chance we ship e.g. q35 by mistake without this API, since
> > there is no way this specific assert can be missed in even basic
> > testing.
> > 
> > So I see it differently:
> > 
> > As coded here:
> > chipset authors get lazy and do not implement API.
> > bad performance for all users.
> > 
> > With assert:
> > chipset authors implement necessary API.
> > good performance for all users.
> 
> I prefer a carrot, not a whip.  Thanks,
> 
> Alex
> 

It's not just that.
Problem is performance testing/fixing is hard.
Catching and fixing asserts is easy.
So working around buggy qemu code really backfires
as it reverses the motivation for writing well-performing
code. History proves me right: for each API change where
we implemented a fallback, old code stayed around for years.

-- 
MST


Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Alex Williamson
On Tue, 2012-10-16 at 17:08 +0200, Michael S. Tsirkin wrote:
> On Tue, Oct 16, 2012 at 08:48:04AM -0600, Alex Williamson wrote:
> > On Tue, 2012-10-16 at 16:14 +0200, Michael S. Tsirkin wrote:
> > > On Tue, Oct 16, 2012 at 07:51:43AM -0600, Alex Williamson wrote:
> > > > On Tue, 2012-10-16 at 08:39 +0200, Michael S. Tsirkin wrote:
> > > > > On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> > > > > > This makes use of the new level irqfd support enabling bypass of
> > > > > > qemu userspace both on INTx injection and unmask.  This significantly
> > > > > > boosts the performance of devices making use of legacy interrupts.
> > > > > > 
> > > > > > Signed-off-by: Alex Williamson 
> > > > > > ---
> > > > > > 
> > > > > > My INTx routing workaround below will probably raise some eyebrows,
> > > > > > but I don't feel it's worth subjecting users to core dumps if they
> > > > > > want to try vfio-pci on new platforms.  INTx routing is part of some
> > > > > > larger plan, but until that plan materializes we have to try to avoid
> > > > > > the API unless we think there's a good chance it might be there.
> > > > > > I'll accept the maintenance of updating a whitelist in the interim.
> > > > > > Thanks,
> > > > > > 
> > > > > > Alex
> > > > > > 
> > > > > >  hw/vfio_pci.c |  224 +
> > > > > >  1 file changed, 224 insertions(+)
> > > > > > 
> > > > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > > > index 639371e..777a5f8 100644
> > > > > > --- a/hw/vfio_pci.c
> > > > > > +++ b/hw/vfio_pci.c
> > > > > > @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > > > > >  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
> > > > > >  
> > > > > >  /*
> > > > > > + * PCI code refuses to make it possible to probe whether the chipset
> > > > > > + * supports pci_device_route_intx_to_irq() and booby traps the call
> > > > > > + * to assert if it doesn't.  For us, this is just an optimization, so
> > > > > > + * only enable it when we know it's present.  Unfortunately PCIBus is
> > > > > > + * private, so we can't just look at the function pointer.
> > > > > > + */
> > > > > > +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> > > > > > +{
> > > > > > +#ifdef CONFIG_KVM
> > > > > > +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> > > > > > +DeviceState *dev;
> > > > > > +
> > > > > > +if (!kvm_irqchip_in_kernel() ||
> > > > > > +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> > > > > > +   return false;
> > > > > > +}
> > > > > 
> > > > > 
> > > > > Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
> > > > > Also for KVM_IRQFD_FLAG_RESAMPLE.
> > > > 
> > > > I posted the patch for that separately yesterday.  I'll only request a
> > > > pull once that's in.
> > > 
> > > OK missed that. In the future, might be a good idea to note dependencies
> > > in the patchset or repost them as part of patchset with appropriate
> > > tagging.
> > > 
> > > > > > +
> > > > > > +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> > > > > > +
> > > > > > +dev = bus->parent;
> > > > > > +
> > > > > > +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> > > > > > +return true;
> > > > > > +}
> > > > > > +}
> > > > > > +
> > > > > > +error_report("vfio-pci: VM chipset does not support INTx 
> > > > > > routing, "
> > > > > > + "using slow INTx mode\n");
> > > > > 
> > > > > When does this code trigger? It seems irqchip implies piix ATM -
> > > > > is this just dead code?
> > > > 
> > > > Unused, but not unnecessary.  Another chipset is under development,
> > > > which means very quickly irqchip will not imply piix.
> > > 
> > > So this is for the purposes of out-of-tree stuff; let's
> > > keep these hacks out of tree too. My guess is
> > > q35 can just be added with pci_device_route_intx straight away.
> > > 
> > > >  Likewise irqfd
> > > > support is being added to other architectures, so I don't know how long
> > > > the kvm specific tests will hold up.  Testing for a specific chipset
> > > > could of course be avoided if we were willing to support:
> > > > 
> > > > bool pci_device_intx_route_supported(PCIDevice *pdev)
> > > > 
> > > > or the NOROUTE option I posted previously.
> > > 
> > > This is just moving the pain to users who will
> > > get bad performance and will have to debug it. Injecting
> > > through userspace is slow, new architectures should
> > > simply add irqfd straight away instead of adding
> > > workarounds in userspace.
> > > 
> > > This is exactly why we have the assert in pci core.
> > 
> > Let's compare user experience:
> > 
> > As coded here:
> > 
> > - Error message, only runs slower if the driver actually uses INTx.
> > Result: file bug, continue using vfio-pci, maybe do something useful,
> > maybe find new bugs to file.

Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Michael S. Tsirkin
On Tue, Oct 16, 2012 at 08:48:04AM -0600, Alex Williamson wrote:
> On Tue, 2012-10-16 at 16:14 +0200, Michael S. Tsirkin wrote:
> > On Tue, Oct 16, 2012 at 07:51:43AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-10-16 at 08:39 +0200, Michael S. Tsirkin wrote:
> > > > On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> > > > > This makes use of the new level irqfd support enabling bypass of
> > > > > qemu userspace both on INTx injection and unmask.  This significantly
> > > > > boosts the performance of devices making use of legacy interrupts.
> > > > > 
> > > > > Signed-off-by: Alex Williamson 
> > > > > ---
> > > > > 
> > > > > My INTx routing workaround below will probably raise some eyebrows,
> > > > > but I don't feel it's worth subjecting users to core dumps if they
> > > > > want to try vfio-pci on new platforms.  INTx routing is part of some
> > > > > larger plan, but until that plan materializes we have to try to avoid
> > > > > the API unless we think there's a good chance it might be there.
> > > > > I'll accept the maintenance of updating a whitelist in the interim.
> > > > > Thanks,
> > > > > 
> > > > > Alex
> > > > > 
> > > > >  hw/vfio_pci.c |  224 +
> > > > >  1 file changed, 224 insertions(+)
> > > > > 
> > > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > > index 639371e..777a5f8 100644
> > > > > --- a/hw/vfio_pci.c
> > > > > +++ b/hw/vfio_pci.c
> > > > > @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > > > >  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
> > > > >  
> > > > >  /*
> > > > > + * PCI code refuses to make it possible to probe whether the chipset
> > > > > + * supports pci_device_route_intx_to_irq() and booby traps the call
> > > > > + * to assert if it doesn't.  For us, this is just an optimization, so
> > > > > + * only enable it when we know it's present.  Unfortunately PCIBus is
> > > > > + * private, so we can't just look at the function pointer.
> > > > > + */
> > > > > +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> > > > > +{
> > > > > +#ifdef CONFIG_KVM
> > > > > +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> > > > > +DeviceState *dev;
> > > > > +
> > > > > +if (!kvm_irqchip_in_kernel() ||
> > > > > +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> > > > > + return false;
> > > > > +}
> > > > 
> > > > 
> > > > Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
> > > > Also for KVM_IRQFD_FLAG_RESAMPLE.
> > > 
> > > I posted the patch for that separately yesterday.  I'll only request a
> > > pull once that's in.
> > 
> > OK missed that. In the future, might be a good idea to note dependencies
> > in the patchset or repost them as part of patchset with appropriate
> > tagging.
> > 
> > > > > +
> > > > > +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> > > > > +
> > > > > +dev = bus->parent;
> > > > > +
> > > > > +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> > > > > +return true;
> > > > > +}
> > > > > +}
> > > > > +
> > > > > +error_report("vfio-pci: VM chipset does not support INTx 
> > > > > routing, "
> > > > > + "using slow INTx mode\n");
> > > > 
> > > > When does this code trigger? It seems irqchip implies piix ATM -
> > > > is this just dead code?
> > > 
> > > Unused, but not unnecessary.  Another chipset is under development,
> > > which means very quickly irqchip will not imply piix.
> > 
> > So this is for the purposes of out-of-tree stuff; let's
> > keep these hacks out of tree too. My guess is
> > q35 can just be added with pci_device_route_intx straight away.
> > 
> > >  Likewise irqfd
> > > support is being added to other architectures, so I don't know how long
> > > the kvm specific tests will hold up.  Testing for a specific chipset
> > > could of course be avoided if we were willing to support:
> > > 
> > > bool pci_device_intx_route_supported(PCIDevice *pdev)
> > > 
> > > or the NOROUTE option I posted previously.
> > 
> > This is just moving the pain to users who will
> > get bad performance and will have to debug it. Injecting
> > through userspace is slow, new architectures should
> > simply add irqfd straight away instead of adding
> > workarounds in userspace.
> > 
> > This is exactly why we have the assert in pci core.
> 
> Let's compare user experience:
> 
> As coded here:
> 
> - Error message, only runs slower if the driver actually uses INTx.
> Result: file bug, continue using vfio-pci, maybe do something useful,
> maybe find new bugs to file.
> 
> Blindly calling PCI code w/ assert:
> 
> - Assert.  Result: file bug, full stop.
> 
> Maybe I do too much kernel programming, but I don't take asserts lightly
> and prefer they be saved for "something is really broken and I can't
> safely continue".  This is not such a case.

Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Alex Williamson
On Tue, 2012-10-16 at 16:14 +0200, Michael S. Tsirkin wrote:
> On Tue, Oct 16, 2012 at 07:51:43AM -0600, Alex Williamson wrote:
> > On Tue, 2012-10-16 at 08:39 +0200, Michael S. Tsirkin wrote:
> > > On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> > > > This makes use of the new level irqfd support enabling bypass of
> > > > qemu userspace both on INTx injection and unmask.  This significantly
> > > > boosts the performance of devices making use of legacy interrupts.
> > > > 
> > > > Signed-off-by: Alex Williamson 
> > > > ---
> > > > 
> > > > My INTx routing workaround below will probably raise some eyebrows,
> > > > but I don't feel it's worth subjecting users to core dumps if they
> > > > want to try vfio-pci on new platforms.  INTx routing is part of some
> > > > larger plan, but until that plan materializes we have to try to avoid
> > > > the API unless we think there's a good chance it might be there.
> > > > I'll accept the maintenance of updating a whitelist in the interim.
> > > > Thanks,
> > > > 
> > > > Alex
> > > > 
> > > >  hw/vfio_pci.c |  224 +
> > > >  1 file changed, 224 insertions(+)
> > > > 
> > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > index 639371e..777a5f8 100644
> > > > --- a/hw/vfio_pci.c
> > > > +++ b/hw/vfio_pci.c
> > > > @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > > >  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
> > > >  
> > > >  /*
> > > > + * PCI code refuses to make it possible to probe whether the chipset
> > > > + * supports pci_device_route_intx_to_irq() and booby traps the call
> > > > + * to assert if it doesn't.  For us, this is just an optimization, so
> > > > + * only enable it when we know it's present.  Unfortunately PCIBus is
> > > > + * private, so we can't just look at the function pointer.
> > > > + */
> > > > +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> > > > +{
> > > > +#ifdef CONFIG_KVM
> > > > +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> > > > +DeviceState *dev;
> > > > +
> > > > +if (!kvm_irqchip_in_kernel() ||
> > > > +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> > > > +   return false;
> > > > +}
> > > 
> > > 
> > > Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
> > > Also for KVM_IRQFD_FLAG_RESAMPLE.
> > 
> > I posted the patch for that separately yesterday.  I'll only request a
> > pull once that's in.
> 
> OK missed that. In the future, might be a good idea to note dependencies
> in the patchset or repost them as part of patchset with appropriate
> tagging.
> 
> > > > +
> > > > +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> > > > +
> > > > +dev = bus->parent;
> > > > +
> > > > +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> > > > +return true;
> > > > +}
> > > > +}
> > > > +
> > > > +error_report("vfio-pci: VM chipset does not support INTx routing, "
> > > > + "using slow INTx mode\n");
> > > 
> > > When does this code trigger? It seems irqchip implies piix ATM -
> > > is this just dead code?
> > 
> > Unused, but not unnecessary.  Another chipset is under development,
> > which means very quickly irqchip will not imply piix.
> 
> So this is for the purposes of out-of-tree stuff; let's
> keep these hacks out of tree too. My guess is
> q35 can just be added with pci_device_route_intx straight away.
> 
> >  Likewise irqfd
> > support is being added to other architectures, so I don't know how long
> > the kvm specific tests will hold up.  Testing for a specific chipset
> > could of course be avoided if we were willing to support:
> > 
> > bool pci_device_intx_route_supported(PCIDevice *pdev)
> > 
> > or the NOROUTE option I posted previously.
> 
> This is just moving the pain to users who will
> get bad performance and will have to debug it. Injecting
> through userspace is slow, new architectures should
> simply add irqfd straight away instead of adding
> workarounds in userspace.
> 
> This is exactly why we have the assert in pci core.

Let's compare user experience:

As coded here:

- Error message, only runs slower if the driver actually uses INTx.
Result: file bug, continue using vfio-pci, maybe do something useful,
maybe find new bugs to file.

Blindly calling PCI code w/ assert:

- Assert.  Result: file bug, full stop.

Maybe I do too much kernel programming, but I don't take asserts lightly
and prefer they be saved for "something is really broken and I can't
safely continue".  This is not such a case.

> > > > +#endif
> > > > +return false;
> > > > +}
> > > > +
> > > > +static PCIINTxRoute vfio_pci_device_route_intx_to_irq(PCIDevice *pdev, int pin)
> > > > +{
> > > > +if (!vfio_pci_bus_has_intx_route(pdev)) {
> > > > +return (PCIINTxRoute) { .mode = PCI_INTX_DISABLED, .irq = -1 };

Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Michael S. Tsirkin
On Tue, Oct 16, 2012 at 07:51:43AM -0600, Alex Williamson wrote:
> On Tue, 2012-10-16 at 08:39 +0200, Michael S. Tsirkin wrote:
> > On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> > > This makes use of the new level irqfd support enabling bypass of
> > > qemu userspace both on INTx injection and unmask.  This significantly
> > > boosts the performance of devices making use of legacy interrupts.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > ---
> > > 
> > > My INTx routing workaround below will probably raise some eyebrows,
> > > but I don't feel it's worth subjecting users to core dumps if they
> > > want to try vfio-pci on new platforms.  INTx routing is part of some
> > > larger plan, but until that plan materializes we have to try to avoid
> > > the API unless we think there's a good chance it might be there.
> > > I'll accept the maintenance of updating a whitelist in the interim.
> > > Thanks,
> > > 
> > > Alex
> > > 
> > >  hw/vfio_pci.c |  224 +
> > >  1 file changed, 224 insertions(+)
> > > 
> > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > index 639371e..777a5f8 100644
> > > --- a/hw/vfio_pci.c
> > > +++ b/hw/vfio_pci.c
> > > @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > >  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
> > >  
> > >  /*
> > > + * PCI code refuses to make it possible to probe whether the chipset
> > > + * supports pci_device_route_intx_to_irq() and booby traps the call
> > > + * to assert if it doesn't.  For us, this is just an optimization, so
> > > + * only enable it when we know it's present.  Unfortunately PCIBus is
> > > + * private, so we can't just look at the function pointer.
> > > + */
> > > +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> > > +{
> > > +#ifdef CONFIG_KVM
> > > +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> > > +DeviceState *dev;
> > > +
> > > +if (!kvm_irqchip_in_kernel() ||
> > > +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> > > + return false;
> > > +}
> > 
> > 
> > Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
> > Also for KVM_IRQFD_FLAG_RESAMPLE.
> 
> I posted the patch for that separately yesterday.  I'll only request a
> pull once that's in.

OK missed that. In the future, might be a good idea to note dependencies
in the patchset or repost them as part of patchset with appropriate
tagging.

> > > +
> > > +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> > > +
> > > +dev = bus->parent;
> > > +
> > > +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> > > +return true;
> > > +}
> > > +}
> > > +
> > > +error_report("vfio-pci: VM chipset does not support INTx routing, "
> > > + "using slow INTx mode\n");
> > 
> > When does this code trigger? It seems irqchip implies piix ATM -
> > is this just dead code?
> 
> Unused, but not unnecessary.  Another chipset is under development,
> which means very quickly irqchip will not imply piix.

So this is for the purposes of out-of-tree stuff; let's
keep these hacks out of tree too. My guess is
q35 can just be added with pci_device_route_intx straight away.

>  Likewise irqfd
> support is being added to other architectures, so I don't know how long
> the kvm specific tests will hold up.  Testing for a specific chipset
> could of course be avoided if we were willing to support:
> 
> bool pci_device_intx_route_supported(PCIDevice *pdev)
> 
> or the NOROUTE option I posted previously.

This is just moving the pain to users who will
get bad performance and will have to debug it. Injecting
through userspace is slow, new architectures should
simply add irqfd straight away instead of adding
workarounds in userspace.

This is exactly why we have the assert in pci core.
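
To make the trade-off concrete, here is a stand-alone sketch with
made-up types -- not the actual qemu source, just the two shapes of
the argument:

#include <assert.h>
#include <stdio.h>

typedef struct {
    /* NULL when the host bridge never implemented INTx routing */
    int (*route_intx_to_irq)(void *opaque, int pin);
    void *irq_opaque;
} PCIBusSketch;

/* The pci core style: assert, so a missing hook is caught by the most
 * basic testing -- at the cost of a core dump for the user. */
static int route_with_assert(PCIBusSketch *bus, int pin)
{
    assert(bus->route_intx_to_irq);
    return bus->route_intx_to_irq(bus->irq_opaque, pin);
}

/* The vfio patch style: probe, warn, and stay on the slow userspace
 * injection path instead of aborting. */
static int route_with_fallback(PCIBusSketch *bus, int pin)
{
    if (!bus->route_intx_to_irq) {
        fprintf(stderr, "chipset lacks INTx routing, using slow INTx mode\n");
        return -1;  /* caller keeps the slow path */
    }
    return bus->route_intx_to_irq(bus->irq_opaque, pin);
}

static int dummy_route(void *opaque, int pin)
{
    (void)opaque;
    return 10 + pin;  /* pretend PIRQ mapping */
}

int main(void)
{
    PCIBusSketch piix = { .route_intx_to_irq = dummy_route };
    PCIBusSketch newchip = { 0 };

    printf("piix routes pin 2 to irq %d\n", route_with_assert(&piix, 2));
    printf("newchip: %d\n", route_with_fallback(&newchip, 2));
    return 0;
}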

> > > +#endif
> > > +return false;
> > > +}
> > > +
> > > +static PCIINTxRoute vfio_pci_device_route_intx_to_irq(PCIDevice *pdev, int pin)
> > > +{
> > > +if (!vfio_pci_bus_has_intx_route(pdev)) {
> > > +return (PCIINTxRoute) { .mode = PCI_INTX_DISABLED, .irq = -1 };
> > > +}
> > > +
> > > +return pci_device_route_intx_to_irq(pdev, pin);
> > > +}
> > > +
> > > +static bool vfio_pci_intx_route_changed(PCIINTxRoute *old, PCIINTxRoute *new)
> > > +{
> > > +return old->mode != new->mode || old->irq != new->irq;
> > > +}
> > > +
> > 
> > Didn't you add an API for this? It's on pci branch but I can drop
> > it if not needed.
> 
> I did and I'll switch to it when available, but I have no idea when that
> will be, so I've hedged my bets by re-implementing it here.  2 week+
> turnover for a patch makes it difficult to coordinate dependent changes
> on short qemu release cycles.

It's available on pci branch, please base on that instead of master.
Yes, I merge at about 2-week intervals.

Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-16 Thread Alex Williamson
On Tue, 2012-10-16 at 08:39 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> > This makes use of the new level irqfd support enabling bypass of
> > qemu userspace both on INTx injection and unmask.  This significantly
> > boosts the performance of devices making use of legacy interrupts.
> > 
> > Signed-off-by: Alex Williamson 
> > ---
> > 
> > My INTx routing workaround below will probably raise some eyebrows,
> > but I don't feel it's worth subjecting users to core dumps if they
> > want to try vfio-pci on new platforms.  INTx routing is part of some
> > larger plan, but until that plan materializes we have to try to avoid
> > the API unless we think there's a good chance it might be there.
> > I'll accept the maintenance of updating a whitelist in the interim.
> > Thanks,
> > 
> > Alex
> > 
> >  hw/vfio_pci.c |  224 +
> >  1 file changed, 224 insertions(+)
> > 
> > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > index 639371e..777a5f8 100644
> > --- a/hw/vfio_pci.c
> > +++ b/hw/vfio_pci.c
> > @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> >  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
> >  
> >  /*
> > + * PCI code refuses to make it possible to probe whether the chipset
> > + * supports pci_device_route_intx_to_irq() and booby traps the call
> > + * to assert if it doesn't.  For us, this is just an optimization, so
> > + * only enable it when we know it's present.  Unfortunately PCIBus is
> > + * private, so we can't just look at the function pointer.
> > + */
> > +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> > +{
> > +#ifdef CONFIG_KVM
> > +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> > +DeviceState *dev;
> > +
> > +if (!kvm_irqchip_in_kernel() ||
> > +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> > +   return false;
> > +}
> 
> 
> Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
> Also for KVM_IRQFD_FLAG_RESAMPLE.

I posted the patch for that separately yesterday.  I'll only request a
pull once that's in.

> > +
> > +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> > +
> > +dev = bus->parent;
> > +
> > +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> > +return true;
> > +}
> > +}
> > +
> > +error_report("vfio-pci: VM chipset does not support INTx routing, "
> > + "using slow INTx mode\n");
> 
> When does this code trigger? It seems irqchip implies piix ATM -
> is this just dead code?

Unused, but not unnecessary.  Another chipset is under development,
which means very quickly irqchip will not imply piix.  Likewise irqfd
support is being added to other architectures, so I don't know how long
the kvm specific tests will hold up.  Testing for a specific chipset
could of course be avoided if we were willing to support:

bool pci_device_intx_route_supported(PCIDevice *pdev)

or the NOROUTE option I posted previously.
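
Untested sketch of the probe I have in mind; both accessors below are
hypothetical, since PCIBus is private today (which is exactly the
problem):

bool pci_device_intx_route_supported(PCIDevice *pdev)
{
    /* pci_root_bus_of() and pci_bus_has_intx_route_fn() are made-up
     * names -- pci.c would have to export something like them. */
    PCIBus *bus = pci_root_bus_of(pdev);

    return bus && pci_bus_has_intx_route_fn(bus);
}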

> > +#endif
> > +return false;
> > +}
> > +
> > +static PCIINTxRoute vfio_pci_device_route_intx_to_irq(PCIDevice *pdev, int pin)
> > +{
> > +if (!vfio_pci_bus_has_intx_route(pdev)) {
> > +return (PCIINTxRoute) { .mode = PCI_INTX_DISABLED, .irq = -1 };
> > +}
> > +
> > +return pci_device_route_intx_to_irq(pdev, pin);
> > +}
> > +
> > +static bool vfio_pci_intx_route_changed(PCIINTxRoute *old, PCIINTxRoute *new)
> > +{
> > +return old->mode != new->mode || old->irq != new->irq;
> > +}
> > +
> 
> Didn't you add an API for this? It's on pci branch but I can drop
> it if not needed.

I did and I'll switch to it when available, but I have no idea when that
will be, so I've hedged my bets by re-implementing it here.  2 week+
turnover for a patch makes it difficult to coordinate dependent changes
on short qemu release cycles.

> > +/*
> >   * Common VFIO interrupt disable
> >   */
> >  static void vfio_disable_irqindex(VFIODevice *vdev, int index)
> > @@ -185,6 +232,21 @@ static void vfio_unmask_intx(VFIODevice *vdev)
> >  ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
> >  }
> >  
> > +#ifdef CONFIG_KVM
> > +static void vfio_mask_intx(VFIODevice *vdev)
> > +{
> > +struct vfio_irq_set irq_set = {
> > +.argsz = sizeof(irq_set),
> > +.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
> > +.index = VFIO_PCI_INTX_IRQ_INDEX,
> > +.start = 0,
> > +.count = 1,
> > +};
> > +
> > +ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
> > +}
> > +#endif
> > +
> >  /*
> >   * Disabling BAR mmapping can be slow, but toggling it around INTx can
> >   * also be a huge overhead.  We try to get the best of both worlds by
> > @@ -248,6 +310,161 @@ static void vfio_eoi(VFIODevice *vdev)
> >  vfio_unmask_intx(vdev);
> >  }
> >  
> > +static void vfio_enable_intx_kvm(VFIODevice *vdev)

Re: [PATCH] vfio-pci: Add KVM INTx acceleration

2012-10-15 Thread Michael S. Tsirkin
On Mon, Oct 15, 2012 at 02:28:15PM -0600, Alex Williamson wrote:
> This makes use of the new level irqfd support enabling bypass of
> qemu userspace both on INTx injection and unmask.  This significantly
> boosts the performance of devices making use of legacy interrupts.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
> My INTx routing workaround below will probably raise some eyebrows,
> but I don't feel it's worth subjecting users to core dumps if they
> want to try vfio-pci on new platforms.  INTx routing is part of some
> larger plan, but until that plan materializes we have to try to avoid
> the API unless we think there's a good chance it might be there.
> I'll accept the maintenance of updating a whitelist in the interim.
> Thanks,
> 
> Alex
> 
>  hw/vfio_pci.c |  224 +
>  1 file changed, 224 insertions(+)
> 
> diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> index 639371e..777a5f8 100644
> --- a/hw/vfio_pci.c
> +++ b/hw/vfio_pci.c
> @@ -154,6 +154,53 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
>  static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
>  
>  /*
> + * PCI code refuses to make it possible to probe whether the chipset
> + * supports pci_device_route_intx_to_irq() and booby traps the call
> + * to assert if it doesn't.  For us, this is just an optimization, so
> + * only enable it when we know it's present.  Unfortunately PCIBus is
> + * private, so we can't just look at the function pointer.
> + */
> +static bool vfio_pci_bus_has_intx_route(PCIDevice *pdev)
> +{
> +#ifdef CONFIG_KVM
> +BusState *bus = qdev_get_parent_bus(&pdev->qdev);
> +DeviceState *dev;
> +
> +if (!kvm_irqchip_in_kernel() ||
> +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> + return false;
> +}


Shouldn't we update linux-headers/ to get KVM_CAP_IRQFD_RESAMPLE?
Also for KVM_IRQFD_FLAG_RESAMPLE.

> +
> +for (; bus->parent; bus = qdev_get_parent_bus(dev)) {
> +
> +dev = bus->parent;
> +
> +if (!strncmp("i440FX-pcihost", object_get_typename(OBJECT(dev)), 14)) {
> +return true;
> +}
> +}
> +
> +error_report("vfio-pci: VM chipset does not support INTx routing, "
> + "using slow INTx mode\n");

When does this code trigger? It seems irqchip implies piix ATM -
is this just dead code?


> +#endif
> +return false;
> +}
> +
> +static PCIINTxRoute vfio_pci_device_route_intx_to_irq(PCIDevice *pdev, int pin)
> +{
> +if (!vfio_pci_bus_has_intx_route(pdev)) {
> +return (PCIINTxRoute) { .mode = PCI_INTX_DISABLED, .irq = -1 };
> +}
> +
> +return pci_device_route_intx_to_irq(pdev, pin);
> +}
> +
> +static bool vfio_pci_intx_route_changed(PCIINTxRoute *old, PCIINTxRoute *new)
> +{
> +return old->mode != new->mode || old->irq != new->irq;
> +}
> +

Didn't you add an API for this? It's on pci branch but I can drop
it if not needed.

> +/*
>   * Common VFIO interrupt disable
>   */
>  static void vfio_disable_irqindex(VFIODevice *vdev, int index)
> @@ -185,6 +232,21 @@ static void vfio_unmask_intx(VFIODevice *vdev)
>  ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
>  }
>  
> +#ifdef CONFIG_KVM
> +static void vfio_mask_intx(VFIODevice *vdev)
> +{
> +struct vfio_irq_set irq_set = {
> +.argsz = sizeof(irq_set),
> +.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
> +.index = VFIO_PCI_INTX_IRQ_INDEX,
> +.start = 0,
> +.count = 1,
> +};
> +
> +ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
> +}
> +#endif
> +
>  /*
>   * Disabling BAR mmapping can be slow, but toggling it around INTx can
>   * also be a huge overhead.  We try to get the best of both worlds by
> @@ -248,6 +310,161 @@ static void vfio_eoi(VFIODevice *vdev)
>  vfio_unmask_intx(vdev);
>  }
>  
> +static void vfio_enable_intx_kvm(VFIODevice *vdev)
> +{
> +#ifdef CONFIG_KVM
> +struct kvm_irqfd irqfd = {
> +.fd = event_notifier_get_fd(&vdev->intx.interrupt),
> +.gsi = vdev->intx.route.irq,
> +.flags = KVM_IRQFD_FLAG_RESAMPLE,


Should not kvm ioctl handling be localized in kvm-all.c?
E.g. extend kvm_irqchip_add_irqfd_notifier in
some way? Same question for KVM_CAP_IRQFD_RESAMPLE use above ...
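
E.g. something along these lines -- an untested sketch, the helper name
is made up, and it assumes the updated headers with
KVM_IRQFD_FLAG_RESAMPLE and the resamplefd field:

/* Made-up kvm-all.c helper: register an irqfd with a resample eventfd,
 * so device code never touches struct kvm_irqfd directly. */
int kvm_irqchip_add_irqfd_resample_notifier(KVMState *s, EventNotifier *n,
                                            EventNotifier *resample, int virq)
{
    struct kvm_irqfd irqfd = {
        .fd = event_notifier_get_fd(n),
        .resamplefd = event_notifier_get_fd(resample),
        .gsi = virq,
        .flags = KVM_IRQFD_FLAG_RESAMPLE,
    };

    if (!kvm_check_extension(s, KVM_CAP_IRQFD_RESAMPLE)) {
        return -ENOSYS;
    }

    return kvm_vm_ioctl(s, KVM_IRQFD, &irqfd);
}

Then vfio_enable_intx_kvm would just pass its two notifiers and the
routed irq.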


> +};
> +struct vfio_irq_set *irq_set;
> +int ret, argsz;
> +int32_t *pfd;
> +
> +if (!kvm_irqchip_in_kernel() ||
> +vdev->intx.route.mode != PCI_INTX_ENABLED ||
> +!kvm_check_extension(kvm_state, KVM_CAP_IRQFD_RESAMPLE)) {
> +return;
> +}
> +
> +/* Get to a known interrupt state */
> +qemu_set_fd_handler(irqfd.fd, NULL, NULL, vdev);
> +vfio_mask_intx(vdev);
> +vdev->intx.pending = false;
> +qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
> +
> +/* Get an eventfd for resample/unmask */
> +if (event_notifier_init(&vdev->intx.unmask, 0)) {
> +error_report("vfio: Error: event_