On Wed, Jul 18, 2018 at 05:08:21PM +0200, Markus Armbruster wrote: > Marc-André, one question for you inline. Search for your name. > > Peter Xu <pet...@redhat.com> writes: > > > On Thu, Jun 28, 2018 at 03:20:41PM +0200, Markus Armbruster wrote: > >> Peter Xu <pet...@redhat.com> writes: > >> > >> > On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote: > >> >> Monitor behavior changes even when the client rejects capability "oob". > >> >> > >> >> Traditionally, the monitor reads, executes and responds to one command > >> >> after the other. If the client sends in-band commands faster than the > >> >> server can execute them, the kernel will eventually refuse to buffer > >> >> more, and sending blocks or fails with EAGAIN. > >> >> > >> >> To make OOB possible, we need to read and queue commands as we receive > >> >> them. If the client sends in-band commands faster than the server can > >> >> execute them, the server will eventually drop commands to limit the > >> >> queue length. The sever sends event COMMAND_DROPPED then. > >> >> > >> >> However, we get the new behavior even when the client rejects capability > >> >> "oob". We get the traditional behavior only when the server doesn't > >> >> offer "oob". > >> >> > >> >> Is this what we want? > >> > > >> > Not really. But I thought we kept that old behavior, haven't we? > >> > > >> > In handle_qmp_command() we have this: > >> > > >> > /* > >> > * If OOB is not enabled on the current monitor, we'll emulate the > >> > * old behavior that we won't process the current monitor any more > >> > * until it has responded. This helps make sure that as long as > >> > * OOB is not enabled, the server will never drop any command. > >> > */ > >> > if (!qmp_oob_enabled(mon)) { > >> > monitor_suspend(mon); > >> > req_obj->need_resume = true; > >> > } else { > >> > /* Drop the request if queue is full. */ > >> > if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) { > >> > qemu_mutex_unlock(&mon->qmp.qmp_queue_lock); > >> > qapi_event_send_command_dropped(id, > >> > > >> > COMMAND_DROP_REASON_QUEUE_FULL, > >> > &error_abort); > >> > qmp_request_free(req_obj); > >> > return; > >> > } > >> > } > >> > > >> > Here assuming that we are not enabling OOB, then since we will suspend > >> > the monitor when we receive one command, then monitor_can_read() later > >> > will check with a result that "we should not read the chardev", then > >> > it'll leave all the data in the input buffer. Then AFAIU the QMP > >> > client that didn't enable OOB will have similar behavior as before. > >> > > >> > Also, we will never receive a COMMAND_DROPPED event in that legacy > >> > mode as well since it's in the "else" block. Am I right? > >> > >> Hmm. Did I get confused? Let me look again. > >> > >> Some places check qmp_oob_enabled(), which is true when the server > >> offered capability "oob" and the client accepted. In terms of my reply > >> to Daniel, it distinguishes the two run time cases "OOB off" and "OOB > >> on". > > > > Exactly. > > > >> > >> Other places check ->use_io_thr, which is true when > >> > >> (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr) > >> > >> in monitor_init(). > > ->use_io_thr is now spelled ->use_io_thread. > > >> One of these places is get_qmp_greeting(). It makes the server offer > >> "oob" when ->use_io_thr. Makes sense. > >> > >> In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB > >> not offered" and ("OOB offered, but rejected by client" or "OOB offered > >> and accepted"). > > > > Exactly. > > > > To be more clear, let's name the three cases (as you mentioned): > > > > (A) OOB not offered > > (B) OOB offered, but rejected by client > > (C) OOB offered, and accepted > > > > Then: > > > > (1) qmp_oob_enabled() will only be return true if (C). > > (2) ->use_io_thr will only be true if (B) or (C). > > Notes on (A) to be kept in mind for the remainder of the discussion: > > * I don't expect (A) to occur in production > > (A) means the monitor has a chardev-mux backend. chardev-mux > multiplexes multiple character backends, e.g. multiplex a monitor and > a console on stdio. C-a c switches focus. While such multiplexing > may be convenient for humans, it's an unnecessary complication for > machines, which could just as well use two UNIX domain sockets and not > have to worry about focus and escape sequences. > > * I do expect (A) to go away eventually > > (A) exists only because "not all the chardev frontends can run without > main thread, or can run in multiple threads" [PATCH 6]. Hopefully, > that'll be fixed eventually, so (A) can go away. > > Marc-André, your "[PATCH 04/12] Revert "qmp: isolate responses into io > thread" states the "chardev write path is thread safe". What's > missing to make chardev-mux capable of supporting OOB? > > >> Uses could to violate 'may use "server offered OOB" only for > >> configuration purposes', so let's check them: > >> > >> * monitor_json_emitter() > > This is now qmp_queue_response() > > >> If mon->use_io_thr, push to response queue. Else emit directly. > >> > >> I'm afraid run time behavior differs for "OOB not offered" (emit > >> directly) and "OOB offered by rejected" (queue). > > > > Yeah it's different, but logically it's the same. IMHO it's only a > > fast path for main thread. In other word, I think it can still work > > correctly if we remove that else clause. In that sense, it's not > > really a "true" behavior change. > > Remember that (A) should not occur in production, and should go away > eventually. Maintaining a fast path just for that feels unjustified. > > >> * qmp_caps_check() > >> > >> If !mon->use_io_thr, reject client's acceptance of "oob". Implicitly > >> recomputing the set of offered capabilities here is less than elegant, > >> but it's not wrong. > > This has been replaced by monitor_qmp_caps_reset(). > > If mon->use_io_thread, offer OOB. Makes sense. > > >> * monitor_resume() > >> > >> If mon->use_io_thr(), kick the monitor I/O thread. > >> > >> Again, different behavior for the two variations of "OOB off". > > > > This is another example like above - IMHO we can just kick it no > > matter what, but we just don't need to do that for main thread (since > > we are in main thread so we are sure it is not asleep). It's just > > another trivial enhancement on top of the main logic. > > Again, maintaining a special case just for (A) feels unjustified. > > >> * get_qmp_greeting() > >> > >> Covered above, looks good to me. > > Also replaced by monitor_qmp_caps_reset(). > > >> * monitor_qmp_setup_handlers_bh() > >> > >> If mon->use_io_thr(), pass the monitor I/O thread's AIOContext to > >> qemu_chr_fe_set_handlers(), else pass NULL. > >> > >> Again, different behavior for the two variations of "OOB off". > > > > This is different indeed, but IMHO it's not really "behavior > > difference", it's just the context (thread to run the code) that is > > different. The major code path for case (A) and (B) (which are the > > two "off" cases) should be the same (when I say "major", I excluded > > those trivial enhancements that I mentioned). > > As far as I can tell, monitor_qmp_setup_handlers_bh() can run only if > scheduled by monitor_init() when mon->use_io_thread. If that's true, > then the !mon->use_io_thread case is unreachable. Ah, I see you've also > noticed that, and you're proposing a patch below.
Yeah, and... > > >> * monitor_init() > >> > >> If mon->use_io_thr, set up bottom half > >> monitor_qmp_setup_handlers_bh(), else call qemu_chr_fe_set_handlers() > >> directly. > > > > This is indeed a bit tricky (to avoid a potential race that Stefan has > > pointed out), however after things are setup it'll be exactly the same > > code path for both OFF cases. > > > > And actually when I read the code I noticed that we can actually > > simplify the code with this: > > > > diff --git a/monitor.c b/monitor.c > > index 0730a27172..5d6bff8d51 100644 > > --- a/monitor.c > > +++ b/monitor.c > > @@ -4635,18 +4635,14 @@ static void monitor_qmp_setup_handlers_bh(void > > *opaque) > > Monitor *mon = opaque; > > GMainContext *context; > > > > - if (mon->use_io_thr) { > > - /* > > - * When use_io_thr is set, we use the global shared dedicated > > - * IO thread for this monitor to handle input/output. > > - */ > > - context = monitor_get_io_context(); > > - /* We should have inited globals before reaching here. */ > > - assert(context); > > - } else { > > - /* The default main loop, which is the main thread */ > > - context = NULL; > > - } > > + assert(mon->use_io_thr); > > + /* > > + * When use_io_thr is set, we use the global shared dedicated > > + * IO thread for this monitor to handle input/output. > > + */ > > + context = monitor_get_io_context(); > > + /* We should have inited globals before reaching here. */ > > + assert(context); > > > > qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_qmp_read, > > monitor_qmp_event, NULL, mon, context, true); > > > > Since monitor_qmp_setup_handlers_bh() will only be used for IOThread > > case. I can post a patch for that. > > Yes, please. ... actually I have had that patch in my "turn-oob-on-default" series, and even with your Reviewed-by. :) https://patchwork.kernel.org/patch/10506173/ I suppose I'll just repost it and the series after the release. > > >> > >> Again, different behavior for the two variations of "OOB off". > > The difference is > > * timing: right away for (A) vs. bottom half for (B) and (C) > > * context argument for qemu_chr_fe_set_handlers(): NULL, i.e. main loop > thread for (A), monitor_get_io_context(), i.e. the I/O thread for (B) > and (C) > > Okay, I think. > > >> Either I am confused now, or the two variations of "OOB off" execute > >> different code at run time. Which is it? > > > > As mentioned, they should be running the same code at run time. > > Though with some trivial differences, but if you see above most of > > them should only be small enhancements which can actually be removed. > > Please do. > > >> If it's different code, are the differences observable for the client? > > > > AFAIU since the code path should mostly be the same for the two OOF > > cases, IMHO there should have no difference observable from the > > client, otherwise it's buggy and I'd be willing to fix it. > > > >> > >> Observable or not, I suspect the differences should not exist. > > > > We can remove those "trivial enhancements" if we want to make sure the > > code paths are exactly the same, but I'm not sure whether that's > > really what we need (or we just need to make sure it's unobservable). > > As long as we're running separate code for (A), "unobservable > difference" is a proposition we need to prove. > > Reducing separate code should simplify the proof. > > Eliminiating it will simplify it maximally :) Ok, this I can try to do too after the release. > > > Thanks! > > Thank *you* for persevering :) I should thank you for your continuous support on out-of-band (or even before it was proposed and named by you :). (I hope I didn't miss anything that I should comment on; let me know if I have) Regards, -- Peter Xu