Re: [RFC PATCH 1/6] kvm: add device control API

Gleb Natapov Wed, 20 Feb 2013 05:53:36 -0800

On Tue, Feb 19, 2013 at 03:16:37PM -0600, Scott Wood wrote:
> On 02/19/2013 06:24:18 AM, Gleb Natapov wrote:
> >On Mon, Feb 18, 2013 at 05:01:40PM -0600, Scott Wood wrote:
> >> The ability to set/get attributes is needed.  Sorry, but "get or set
> >> one blob of data, up to 512 bytes, for the entire irqchip" is just
> >> not good enough -- assuming you don't want us to start sticking
> >> pointers and commands in *that* data. :-)
> >>
> >Proposed interface sticks pointers into ioctl data, so why doing
> >the same
> >for KVM_SET_IRQCHIP/KVM_GET_IRQCHIP makes you smile.
> 
> There's a difference between putting a pointer in an ioctl control
> structure that is specifically documented as being that way (as in
> ONE_REG), versus taking an ioctl that claims to be setting/getting a
> blob of state and embedding pointers in it.  It would be like
> sticking a pointer in the attribute payload of this API, which I
> think is something to be discouraged.
If documentation is what differentiate for you between silly and smart
then write documentation instead of new interfaces.


KVM_SET_IRQCHIP/KVM_GET_IRQCHIP is defined to operate on blob of data on
x86, nothing prevent you from adding MPIC specifics to the interface,
Add mpic state into kvm_irqchip structure and if 512 bytes is not enough
for you to transfer the state put pointers there and _document_ them.
But with 512 bytes you can transfer properties inline, so you probably
do not need pointer there anyway. I see you have three properties 2 of
them 32bit and one 64bit.

>                                        It'd also be using
> KVM_SET_IRQCHIP to read data, which is the sort of thing you object
> to later on regarding KVM_IRQ_LINE_STATUS.
> 
Do not see why.

> Then there's the silliness of transporting 512 bytes just to read a
> descriptor for transporting something else.
> 
Yes, agree. But is this enough of a reason to introduce entirely new
interface? Is it on performance critical path? Doubt it, unless you
abuse the interface to send interrupts, but then isn't it silty to
do copy_from_user() twice to inject an interrupt like proposed interface
does?

> >For signaling irqs (I think this is what you mean by "commands")
> >we have KVM_IRQ_LINE.
> 
> It's one type of command.  Another is setting the address.  Another
> is writing to registers that have side effects (this is how MSI
> injection is done on MPIC, just as in real hardware).
> 
Setting the address is setting an attribute. Sending MSI is a command.
Things you set/get during init/migration are attributes. Things you do
to cause side-effects are commands.

> What is the benefit of KVM_IRQ_LINE over what MPIC does?  What real
> (non-glue/wrapper) code can become common?
> 
No new ioctl with exactly same result (well actually even faster since
less copying is done). You need to show us the benefits of the new interface
vs existing one, not vice versa.

> And I really hope you don't want us to do MSIs the x86 way.
> 
What is wrong with KVM_SIGNAL_MSI? Except that you'll need code to glue it
to mpic.

> In the XICS thread, Paul brought up the possibliity of cascaded
> MPICs.  It's not relevant to the systems we're trying to model, but
> if one did want to use the in-kernel irqchip interface for that, it
> would be really nice to be able to operate on a specific MPIC for
> injection rather than have to come up with some sort of global
> identifier (above and beyond the minor flattening we'd need to do to
> represent a single MPIC's interrupts in a flat numberspace).
> 
ARM encodes information in irq field of KVM_IRQ_LINE like that:
  bits:  | 31 ... 24 | 23  ... 16 | 15    ...     0 |
  field: | irq_type  | vcpu_index |   irq_number    |
Why will similar approach not work?

> >> If you mean the way to inject interrupts, it's simpler this way.
> >> Why go out of our way to inject common glue code into a
> >> communication path between hw/kvm/mpic.c in QEMU and
> >> arch/powerpc/kvm/mpic.c in KVM?  Or rather, why make that common
> >> glue be specific to this one function when we could reuse the same
> >> communication glue used for other things, such as device state?
> >You will need glue anyway and I do no see how amount of it is much
> >different one way or the other.
> 
> It uses glue that we need to be present for other things anyway.  If
> it weren't for XICS we wouldn't need a KVM_IRQ_LINE implementation
> at all on PPC.  It may not be a major difference, but it doesn't
> affect anything but MPIC and it seems more straightforward this way.
> 
We are talking about something like 4 lines of userspace code including
bracket. I do not think this is strong point in favor of different
interface.

> >Gluing qemu_set_irq() to
> >ioctl(KVM_IRQ_LINE) or ioctl(KVM_SET_DEVICE_ATTR) is not much
> >different.
> 
> qemu_set_irq() is not glued to either of those.  It's glued to
> kvm_openpic_set_irq(), kvm_ioapic_set_irq(), etc.  It's already not
> generic code.
> 
OK, this does not invalidates my argument though.

> >Of course, since the interface you propose is not irq chip specific
> 
> This part of it is.
> 
> >we need non irq chip specific way to talk to it. But how do you
> >propose
> >to model things like KVM_IRQ_LINE_STATUS with KVM_SET_DEVICE_ATTR?
> 
> That one's not even in api.txt, so could you explain what exactly
> it's supposed to return, and why it's needed?
True. We need to add it. Your guess below is correct.

> 
> AFAICT, the only thing it gets used for in QEMU is coalescing
> mc146818rtc interrupts.
> 
At present yes. Still need to be supportable.

> Could an error return be used for cases where the IRQ was not
> delivered, in the very unlikely event that we want to implement
> something similar on MPIC?
We can, but I do not think it will be good API. This condition is not an
error.

>                            Note again that MPIC's decision to use
> or not use KVM_IRQ_LINE is only about what MPIC does; it is not
> inherent in the device control API.
That's the crux of the problem though. MPIC tries to be different just
for the sake to be different. Why? The only explanation you provide is
because current API is "silly", not that you cannot implement MPIC with
it or it will be unnecessary slow, just "silly".

> 
> >KVM_SET_DEVICE_ATTR needs to return data back and getting data
> >back from
> >"set" ioctl is strange.
> 
> If we really need a single atomic operation to both read and write,
> beyond returning error values, then yes, that would be a new ioctl.
> It could be added in the future if needed.
> 
> >Other devices may get other commands that need
> >response, so if we design generic interface we should take it into
> >account. I think using KVM_SET_DEVICE_ATTR to inject interrupts is a
> >misnomer, you do not set internal device attribute, you toggle
> >external
> >input. My be another ioctl KVM_SEND_DEVICE_COMMAND is needed.
> 
> I see no need for a separate ioctl in terms of the underlying
> infrastructure for distinguishing "attribute" from "write-only
> command".  I'm open to improvements on what the ioctl is called.
> It's basically like setting a register on a device, except I was
> concerned that if we actually called it a "register" that people
> would take it too literally and think it's only for the architected
> register state of the emulated device.
I agree "attribute" is better name than "register", but injecting
interrupt is not setting an attribute.

> 
> >> >ARM vGIC code, that is ready to go upstream, uses old way too. So
> >> >it will
> >> >be 2 archs against one.
> >>
> >> I wasn't aware that that's how it worked. :-P
> >>
> >What worked? That vGIC uses existing interface or that non generic
> >interface used by many arches wins generic one used by only one arch?
> 
> The latter.  Two wrongs don't make a right, and adding another
> inextensible, device-specific API is not the answer to the existing
> APIs being too inextensible and device/arch-specific.  Some portion
> will always need to be device-specific because we're controlling the
> creation and of a specific device, but the glue does not need to be.
>
This is not "adding another inextensible, device-specific API" vs "adding
cool generic extensible API" though. It is "using existing inextensible,
device-specific API" vs "adding cool generic extensible API".

> >APIs are easy to add and impossible to remove.
> 
> That's why I want to get it right this time.
> 
And what if you'll fail? What if next architecture will bring new
developer that will proclaim your new interface "silly" since it does not
allow for device destruction and do not return file descriptor for newly
created device that userspace can do select on to wait for a device's
events or mmap memory for fast userspace/device communication?

--
                        Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH 1/6] kvm: add device control API

Reply via email to