On Fri, Aug 24, 2018 at 10:51:18AM +0200, Marc-André Lureau wrote:
> Hi
>
> On Fri, Aug 24, 2018 at 10:45 AM Peter Xu <pet...@redhat.com> wrote:
> >
> > On Fri, Aug 24, 2018 at 10:32:58AM +0200, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <pet...@redhat.com> wrote:
> > > > On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> > > >> Hi,
> > > >>
> > > >> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > > >> (and its follow up 99f2f54174a59), Peter moved chardev socket
> > > >> connection to machine_done event. However, chardev created later will
> > > >> no longer attempt to connect, and chardev created in tests do not have
> > > >> machine_done event (breaking some of vhost-user-test).
> > > >>
> > > >> The goal was to move the "connect" source to the chardev frontend
> > > >> context (the monitor thread context in his case). chr->gcontext is set
> > > >> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > > >> function will be called in general, so we can't delay connection until
> > > >> then: the chardev should still attempt to connect during open(), using
> > > >> the main context.
> > > >>
> > > >> An alternative would be to specify the iothread during chardev
> > > >> creation. Setting up monitor OOB would be quite different too, it
> > > >> would take the same iothread as argument.
> > > >>
> > > >> 99f2f54174a595e is also a bit problematic, since it will behave
> > > >> differently before and after machine_done (the first case gives a
> > > >> chance to use a different context reliably, the second looks racy)
> > > >>
> > > >> In the end, I am not sure this is all necessary, as chardev callbacks
> > > >> are called after qemu_chr_fe_set_handlers(), at which point the
> > > >> context of sources are updated. In "char-socket: update all ioc
> > > >> handlers when changing context", I moved also the hup handler to the
> > > >> updated context. So unless the main thread is already stuck, we can
> > > >> setup a different context for the chardev at that time. Or not?
> > > >>
> > > >> v2:
> > > >> - fix a random socket chardev test failure
> > > >
> > > > I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> > > > the context switch of chardev? Or did I misunderstood?
> > >
> > > Switching context is broken/racy regardless. My current plan is to
> > > suspend the context to switch to, and update all sources before
> > > thawing the target context.
> > >
> > > If you prefer we can delay the revert, but I would rather have them
> > > because the regression seems much worse to me.
> >
> > Have you started working on above? I started to code up some RFC
> > patch to let the chardev maintain the iothreads (now it'll only have a
> > monitor iothread) then we don't need to switch context for chardev any
> > more. I plan to post it today for a quick look. With that, this
> > series will not break anything then.
>
> No, I haven't really started the code, but spent some time thinking...
>
> Ok, if you manage to put together that proposal, that sounds interesting.

Not really your iothread proposal; I tried to keep the interface
unchanged, but move the iothread ownership from the monitor code to
chardev code. Please have a look at:

  [RFC 0/3] chardev: introduce chardev contexts

>
> Nevertheless, I can work on improving/fixing the race when switching
> contexts (already used by colo)

Ouch, is colo changing context too? I hope that series won't break
it. (OK, if so, I believe it'll be very possible to break it...)

Regards,

--
Peter Xu