Peter Xu <pet...@redhat.com> writes:

> On Mon, Sep 03, 2018 at 03:41:16PM +0100, Daniel P. Berrangé wrote:
>> On Mon, Sep 03, 2018 at 09:30:52AM -0500, Eric Blake wrote:
>> > On 09/03/2018 08:31 AM, Markus Armbruster wrote:
>> > 
>> > > Example:
>> > > 
>> > >      client sends in-band command #1
>> > >      QEMU reads and queues
>> > >      QEMU dequeues in-band command #1
>> > >      in-band command #1 starts executing, but it's slooow
>> > >      client sends in-band command #2
>> > >      QEMU reads and queues
>> > >      ...
>> > >      client sends in-band command #8
>> > >      QEMU reads, queues and suspends the monitor
>> > >      client sends out-of-band command
>> > > --> time passes...
>> > >      in-band command #1 completes, QEMU sends reply
>> > >      QEMU dequeues in-band command #2, resumes the monitor
>> > >      in-band command #2 starts executing
>> > >      QEMU reads and executes out-of-band command
>> > >      out-of-band command completes, QEMU sends reply
>> > >      in-band command #2 completes, QEMU sends reply
>> > >      ... same for remaining in-band commands ...
>> > > 
>> > > The out-of-band command gets stuck behind the in-band commands.
>
> (It's a shame of me to have just noticed that the out-of-band command
>  will be stuck after we dropped the COMMAND_DROP event... so now I
>  agree it's not that ideal any more to drop the event but maybe still
>  preferable)

We can queue without limit, we can drop commands, or we can suspend
reading.  Each of these has drawbacks:

* Queuing without limit is simple for the client, but unsafe for QEMU.

* Dropping commands requires the client to cope with dropped commands.
  As currently designed, it's just as unsafe for QEMU: instead of
  queuing commands without limit, we get to queue their COMMAND_DROPPED
  events without limit.  A better design could avoid this flaw.

* Suspending reading requires the client to manage the flow of commands
  if it wants to keep the monitor available for out-of-band commands.

We decided that clients having to manage the flow of commands is no
worse than clients having to cope with dropped commands.  There's still
time to challenge this decision.

This series acts upon the decision: it switches from dropping commands
to suspending reading.  Makes the input direction safe.  The output
direction remains as unsafe as it's always been.  Fixing that is left
for later.

>> > > 
>> > > The client can avoid this by managing the flow of in-band commands: have
>> > > no more than 7 in flight, so the monitor never gets suspended.
>> > > 
>> > > This is a potentially useful thing to do for clients, isn't it?
>> > > 
>> > > Eric, Daniel, is it something libvirt would do?
>> > 
>> > Right now, libvirt serializes commands - it never sends a QMP command until
>> > the previous command's response has been processed. But that may not help
>> > much, since libvirt does not send OOB commands.
>> 
>> Note that is not merely due to the QMP monitor restriction either.
>> 
>> Libvirt serializes all its public APIs that can change state of a running
>> domain.  It usually aims to allow read-only APIs to be run in parallel with
>> APIs that change state.
>> 
>> The exception to the rule right now are some of the migration APIs which
>> we allow to be invoked to manage the migration process. 
>> 
>> > I guess when we are designing what libvirt should do, and deciding WHEN it
>> > should send OOB commands, we have the luxury of designing libvirt to 
>> > enforce
>> > how many in-flight in-band commands it will ever have pending at once
>> > (whether the current 'at most 1', or even if we make it more parallel to 
>> > 'at
>> > most 7'), so that we can still be ensured that the OOB command will be
>> > processed without being stuck in the queue of suspended in-band commands.
>> > If we never send more than one in-band at a time, then it's not a concern
>> > how deep the qemu queue is; but if we do want libvirt to start parallel
>> > in-band commands, then you are right that having a way to learn the qemu
>> > queue depth is programmatically more precise than just guessing the maximum
>> > depth.  But it's also hard to argue we need that complexity if we don't 
>> > have
>> > an immediate use envisioned for it.
>> 
>> In terms of what libvirt would want to parallelize, I think it is reasonable
>> to consider any of the query-XXXX commands desirable. Other stuff is likely
>> to remain serialized from libvirt's side.
>
> IMHO concurrency won't help much now even for query commands, since
> our current concurrency is still "partly" - the executions of query
> commands (which is the most time consuming part) will still be done
> sequentially, so even if we send multiple query commands in parallel
> (without waiting for a response of any sent commands), the total time
> used for the list of commands would be mostly the same.

Yes.  We execute all in-band commands (regardless of their monitor) in
the main thread.  Out-of-band commands can execute in @mon_iothread,
which provides a modest degree of concurrency.

> My understanding for why we have such a queue length now is that it
> came from a security concern: after we have a queue, we need that
> queue length to limit the memory usages for the QMP server.  Though
> that might not help much for real users like Libvirt, it's majorly
> serving as a way to protect QEMU QMP from being attacked or from being
> turned down by a buggy QMP client.

Yes.

QEMU has to trust its QMP clients, so malice is not a concern, but
accidents are.  Robust software does not buffer without bounds.

> But I agree now that the queue length information might still be
> helpful some day.  Maybe, we can hide that until we support executing
> commands in parallel for some of them.

Queue length can become interesting long before we get general
concurrency.

If you use QMP only synchronously (send command #1; receive reply #1;
send command #2; ...), then out-of-band does exactly nothing for you.
To make use of it, you have to send an out-of-band command *before* you
receive the previous command's reply.  That's a form of pipelining.

Note there's still no general concurrency.  There's a bit of pipelining,
and there's a bit of concurrency between one in-band command (executing
in main thread) and out-of-band command (executing in @mon_iothread).

Since we need to support a bit of pipelining anyway, why not support it
more generally?  All it takes it raising the queue length limit above
the minimum required for the use of OOB I just sketched.

Note that "since we need to support a bit of concurrency anyway, why not
support it more generally?" would be ludicrously naive :)

Reply via email to