Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression

Peter Xu Fri, 24 Aug 2018 02:13:16 -0700

On Fri, Aug 24, 2018 at 10:51:18AM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Aug 24, 2018 at 10:45 AM Peter Xu <pet...@redhat.com> wrote:
> >
> > On Fri, Aug 24, 2018 at 10:32:58AM +0200, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Fri, Aug 24, 2018 at 5:45 AM, Peter Xu <pet...@redhat.com> wrote:
> > > > On Thu, Aug 23, 2018 at 04:31:20PM +0200, Marc-André Lureau wrote:
> > > >> Hi,
> > > >>
> > > >> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > > >> (and its follow up 99f2f54174a59), Peter moved chardev socket
> > > >> connection to machine_done event. However, chardev created later will
> > > >> no longer attempt to connect, and chardev created in tests do not have
> > > >> machine_done event (breaking some of vhost-user-test).
> > > >>
> > > >> The goal was to move the "connect" source to the chardev frontend
> > > >> context (the monitor thread context in his case). chr->gcontext is set
> > > >> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > > >> function will be called in general, so we can't delay connection until
> > > >> then: the chardev should still attempt to connect during open(), using
> > > >> the main context.
> > > >>
> > > >> An alternative would be to specify the iothread during chardev
> > > >> creation. Setting up monitor OOB would be quite different too, it
> > > >> would take the same iothread as argument.
> > > >>
> > > >> 99f2f54174a595e is also a bit problematic, since it will behave
> > > >> differently before and after machine_done (the first case gives a
> > > >> chance to use a different context reliably, the second looks racy)
> > > >>
> > > >> In the end, I am not sure this is all necessary, as chardev callbacks
> > > >> are called after qemu_chr_fe_set_handlers(), at which point the
> > > >> context of sources are updated. In "char-socket: update all ioc
> > > >> handlers when changing context", I moved also the hup handler to the
> > > >> updated context. So unless the main thread is already stuck, we can
> > > >> setup a different context for the chardev at that time. Or not?
> > > >>
> > > >> v2:
> > > >> - fix a random socket chardev test failure
> > > >
> > > > I have no problem on patch 3-5, but aren't patch 1-2 still breaking
> > > > the context switch of chardev?  Or did I misunderstood?
> > >
> > > Switching context is broken/racy regardless. My current plan is to
> > > suspend the context to switch to, and update all sources before
> > > thawing the target context.
> > >
> > > If you prefer we can delay the revert, but I would rather have them
> > > because the regression seems much worse to me.
> >
> > Have you started working on above?  I started to code up some RFC
> > patch to let the chardev maintain the iothreads (now it'll only have a
> > monitor iothread) then we don't need to switch context for chardev any
> > more.  I plan to post it today for a quick look.  With that, this
> > series will not break anything then.
> 
> No, I haven't really started the code, but spent some time thinking...
> 
> Ok, if you manage to put together that proposal, that sounds interesting.


Not really your iothread proposal; I tried to keep the interface
unchanged, but move the iothread ownership from the monitor code to
chardev code.  Please have a look at:

  [RFC 0/3] chardev: introduce chardev contexts

> 
> Nevertheless, I can work on improving/fixing the race when switching
> contexts (already used by colo)

Ouch, is colo changing context too?  I hope that series won't break it.

(Ok if so I believe it'll very possible to break it...)

Regards,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v2 0/5] Fix socket chardev regression

Reply via email to