Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-22 Thread Peter Xu
On Wed, Aug 22, 2018 at 11:46:32AM +0800, Peter Xu wrote:
> On Tue, Aug 21, 2018 at 04:16:27PM +0200, Paolo Bonzini wrote:
> > On 21/08/2018 16:04, Marc-André Lureau wrote:
> > >> If you don't like the way I proposed, another thing I am
> > >> thinking is that whether we can assign the gcontext for the chardev
> > >> backend before initialization of it (or by parsing the backend &
> > >> frontend relationships before init of backends), then we assure that
> > >> we never change the gcontext of any chardev backends.  Though that
> > > Yes, I think that's a cleaner solution. I suggested to use an iothread
> > > argument in the cover letter.
> > 
> > That would be nice, but isn't it already too late for the monitor chardev?
> 
> I think, yes, if we want to do this automatically.

Sorry I forgot to mention some more details here.  If to do it
automatically, IMHO we can just split the mon_init_func() into parsing
and init, then we do:

(1) parse monitors, setup gcontext/... for chardev
(2) init chardevs with the gcontext/... setup
(3) init monitors

Benefit is that we don't need to introduce new interface then.

> Though if as
> Marc-André suggested (which I didn't really notice first when reading
> the cover letter), then maybe that's not a problem since user need to
> manually specify the iothread for a chardev backend, then chardev
> context will not depend on monitor code any more.
> 
> Marc-André, do you want to propose your iothread interface?  That
> should be the easy way AFAIU, though that'll make the command line for
> monitor out-of-band much longer (but it seems fine at least to me).
> 
> Adding Markus too.
> 
> > 
> > In any case, I don't see a reason to dislike this patch, especially
> > since it comes with a testcase.
> 
> AFAIU the test case didn't really test the non-NULL gcontext case, so
> it fixed A (vhost-user reconnect) however it might break B (non-NULL
> gcontext with a potential race).
> 
> Regards,
> 
> -- 
> Peter Xu

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-21 Thread Peter Xu
On Tue, Aug 21, 2018 at 04:16:27PM +0200, Paolo Bonzini wrote:
> On 21/08/2018 16:04, Marc-André Lureau wrote:
> >> If you don't like the way I proposed, another thing I am
> >> thinking is that whether we can assign the gcontext for the chardev
> >> backend before initialization of it (or by parsing the backend &
> >> frontend relationships before init of backends), then we assure that
> >> we never change the gcontext of any chardev backends.  Though that
> > Yes, I think that's a cleaner solution. I suggested to use an iothread
> > argument in the cover letter.
> 
> That would be nice, but isn't it already too late for the monitor chardev?

I think, yes, if we want to do this automatically. Though if as
Marc-André suggested (which I didn't really notice first when reading
the cover letter), then maybe that's not a problem since user need to
manually specify the iothread for a chardev backend, then chardev
context will not depend on monitor code any more.

Marc-André, do you want to propose your iothread interface?  That
should be the easy way AFAIU, though that'll make the command line for
monitor out-of-band much longer (but it seems fine at least to me).

Adding Markus too.

> 
> In any case, I don't see a reason to dislike this patch, especially
> since it comes with a testcase.

AFAIU the test case didn't really test the non-NULL gcontext case, so
it fixed A (vhost-user reconnect) however it might break B (non-NULL
gcontext with a potential race).

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-21 Thread Daniel P . Berrangé
On Tue, Aug 21, 2018 at 04:04:45PM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Aug 21, 2018 at 8:29 AM Peter Xu  wrote:
> > I fully agree that current way is not ideal since basically the
> > backend should not depend on the frontend, but now we have the
> > gcontext as an exception then the backend will somehow depend on the
> > frontend.  If you don't like the way I proposed, another thing I am
> > thinking is that whether we can assign the gcontext for the chardev
> > backend before initialization of it (or by parsing the backend &
> > frontend relationships before init of backends), then we assure that
> > we never change the gcontext of any chardev backends.  Though that
> 
> Yes, I think that's a cleaner solution. I suggested to use an iothread
> argument in the cover letter.
> 
> Paolo, Daniel, any opinion?

Never changing the GContext once initialized is much nicer. I think I
had suggested that before when this code was first proposed.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-21 Thread Paolo Bonzini
On 21/08/2018 16:04, Marc-André Lureau wrote:
>> If you don't like the way I proposed, another thing I am
>> thinking is that whether we can assign the gcontext for the chardev
>> backend before initialization of it (or by parsing the backend &
>> frontend relationships before init of backends), then we assure that
>> we never change the gcontext of any chardev backends.  Though that
> Yes, I think that's a cleaner solution. I suggested to use an iothread
> argument in the cover letter.

That would be nice, but isn't it already too late for the monitor chardev?

In any case, I don't see a reason to dislike this patch, especially
since it comes with a testcase.

Paolo



Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-21 Thread Marc-André Lureau
Hi

On Tue, Aug 21, 2018 at 8:29 AM Peter Xu  wrote:
>
> On Mon, Aug 20, 2018 at 05:37:55PM +0200, Marc-André Lureau wrote:
> > Hi
> >
> > On Mon, Aug 20, 2018 at 4:48 AM Peter Xu  wrote:
> > >
> > > On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote:
> > > > Hi,
> > > >
> > > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > > > (and its follow up 99f2f54174a59), Peter moved chardev socket
> > > > connection to machine_done event. However, chardev created later will
> > > > no longer attempt to connect, and chardev created in tests do not have
> > > > machine_done event (breaking some of vhost-user-test).
> > > >
> > > > The goal was to move the "connect" source to the chardev frontend
> > > > context (the monitor thread context in his case). chr->gcontext is set
> > > > with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > > > function will be called in general,
> > >
> > > Could you hint a case where we didn't use qemu_chr_fe_set_handlers()
> > > upon a chardev backend?  I thought it was always used in chardev
> > > frontends, and what the backend could do if without a frontend?
> >
> > Well, you don't have to have a front-end to have side effects. Connect
> > will be attempted even without frontend. We may have users expecting
> > that behaviour, that might be considered a break if we change it.
> >
> > (and unlikely, there might be frontends that are write only)
>
> My understanding is that qemu_chr_fe_set_handlers() is not only for
> port read, but also for the rest.  For example, we need to pass in the
> correct IOEventHandler* to handle chardev backend events even if the
> frontend only writes.
>
> >
> > >
> > > [1]
> > >
> > > > so we can't delay connection until
> > > > then: the chardev should still attempt to connect during open(), using
> > > > the main context.
> > > >
> > > > An alternative would be to specify the iothread during chardev
> > > > creation. Setting up monitor OOB would be quite different too, it
> > > > would take the same iothread as argument.
> > > >
> > > > 99f2f54174a595e is also a bit problematic, since it will behave
> > > > differently before and after machine_done (the first case gives a
> > > > chance to use a different context reliably, the second looks racy)
> > > >
> > > > In the end, I am not sure this is all necessary, as chardev callbacks
> > > > are called after qemu_chr_fe_set_handlers(), at which point the
> > > > context of sources are updated. In "char-socket: update all ioc
> > > > handlers when changing context", I moved also the hup handler to the
> > > > updated context. So unless the main thread is already stuck, we can
> > > > setup a different context for the chardev at that time. Or not?
> > >
> > > IMHO the two patches that you reverted are special-cases for reasons.
> > >
> > > The TLS handshake is carried out with an TLS internal GSource which is
> > > not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't
> > > update that GSource (please refer to qio_channel_tls_handshake_task).
> >
> > What can go wrong by using the default context for initial connection
> > and TLS handshake?
> >
> > Presumably, you have a case where the mainloop is no longer processed
> > and that will hang the chardev?
>
> Yeah I don't see a big problem now, but I'm not sure. Actually it
> should not be very hard to even migrate this one just like other
> GSources, however the async one below should be a bit harder.
>
> >
> > > The async connection is carried out in a standalone thread that calls
> > > connect().  IMHO we'd better not update the gcontext bound to the
> > > async task since otherwise there'll be a race (IIRC I proposed
> > > something before using a mutex to update the gcontext, but Dan would
> > > prefer not to, and I followed with the suggestion which makes sense to
> > > me).
> > >
> > > Could we just postpone these machine done tasks into
> > > qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook,
> > > just like what I mentioned in the other thread)?  Though we'll be sure
> > > qemu_chr_fe_set_handlers() will be called for all chardev backends
> > > hence I asked question [1] above.
> >
> > I would rather not to, if possible. unless we take the risk of
> > breaking current behaviour and review chardev usage in qemu.
>
> Yeah, I'd be glad to know any of the behavior breakage if there is,
> but I can't figure any out.  AFAIU there should be none since we
> should always be pairing a backend with a frontend.

Even if we check all usage of chardev in internal qemu, there might be
external users that expect that creating a chardev will attempt the
connection immediately.

>
> I fully agree that current way is not ideal since basically the
> backend should not depend on the frontend, but now we have the
> gcontext as an exception then the backend will somehow depend on the
> frontend.  If you don't like the way I proposed, another thing I am
> thinking is that whether we 

Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-21 Thread Peter Xu
On Mon, Aug 20, 2018 at 05:37:55PM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Aug 20, 2018 at 4:48 AM Peter Xu  wrote:
> >
> > On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote:
> > > Hi,
> > >
> > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > > (and its follow up 99f2f54174a59), Peter moved chardev socket
> > > connection to machine_done event. However, chardev created later will
> > > no longer attempt to connect, and chardev created in tests do not have
> > > machine_done event (breaking some of vhost-user-test).
> > >
> > > The goal was to move the "connect" source to the chardev frontend
> > > context (the monitor thread context in his case). chr->gcontext is set
> > > with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > > function will be called in general,
> >
> > Could you hint a case where we didn't use qemu_chr_fe_set_handlers()
> > upon a chardev backend?  I thought it was always used in chardev
> > frontends, and what the backend could do if without a frontend?
> 
> Well, you don't have to have a front-end to have side effects. Connect
> will be attempted even without frontend. We may have users expecting
> that behaviour, that might be considered a break if we change it.
> 
> (and unlikely, there might be frontends that are write only)

My understanding is that qemu_chr_fe_set_handlers() is not only for
port read, but also for the rest.  For example, we need to pass in the
correct IOEventHandler* to handle chardev backend events even if the
frontend only writes.

> 
> >
> > [1]
> >
> > > so we can't delay connection until
> > > then: the chardev should still attempt to connect during open(), using
> > > the main context.
> > >
> > > An alternative would be to specify the iothread during chardev
> > > creation. Setting up monitor OOB would be quite different too, it
> > > would take the same iothread as argument.
> > >
> > > 99f2f54174a595e is also a bit problematic, since it will behave
> > > differently before and after machine_done (the first case gives a
> > > chance to use a different context reliably, the second looks racy)
> > >
> > > In the end, I am not sure this is all necessary, as chardev callbacks
> > > are called after qemu_chr_fe_set_handlers(), at which point the
> > > context of sources are updated. In "char-socket: update all ioc
> > > handlers when changing context", I moved also the hup handler to the
> > > updated context. So unless the main thread is already stuck, we can
> > > setup a different context for the chardev at that time. Or not?
> >
> > IMHO the two patches that you reverted are special-cases for reasons.
> >
> > The TLS handshake is carried out with an TLS internal GSource which is
> > not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't
> > update that GSource (please refer to qio_channel_tls_handshake_task).
> 
> What can go wrong by using the default context for initial connection
> and TLS handshake?
> 
> Presumably, you have a case where the mainloop is no longer processed
> and that will hang the chardev?

Yeah I don't see a big problem now, but I'm not sure. Actually it
should not be very hard to even migrate this one just like other
GSources, however the async one below should be a bit harder.

> 
> > The async connection is carried out in a standalone thread that calls
> > connect().  IMHO we'd better not update the gcontext bound to the
> > async task since otherwise there'll be a race (IIRC I proposed
> > something before using a mutex to update the gcontext, but Dan would
> > prefer not to, and I followed with the suggestion which makes sense to
> > me).
> >
> > Could we just postpone these machine done tasks into
> > qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook,
> > just like what I mentioned in the other thread)?  Though we'll be sure
> > qemu_chr_fe_set_handlers() will be called for all chardev backends
> > hence I asked question [1] above.
> 
> I would rather not to, if possible. unless we take the risk of
> breaking current behaviour and review chardev usage in qemu.

Yeah, I'd be glad to know any of the behavior breakage if there is,
but I can't figure any out.  AFAIU there should be none since we
should always be pairing a backend with a frontend.

I fully agree that current way is not ideal since basically the
backend should not depend on the frontend, but now we have the
gcontext as an exception then the backend will somehow depend on the
frontend.  If you don't like the way I proposed, another thing I am
thinking is that whether we can assign the gcontext for the chardev
backend before initialization of it (or by parsing the backend &
frontend relationships before init of backends), then we assure that
we never change the gcontext of any chardev backends.  Though that
will require that we need to setup all possible gcontexts before hand
(e.g., the monitor gcontext).  Then we can drop all these dynamic
binding magics (but just 

Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-20 Thread Marc-André Lureau
Hi

On Mon, Aug 20, 2018 at 4:48 AM Peter Xu  wrote:
>
> On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote:
> > Hi,
> >
> > In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> > (and its follow up 99f2f54174a59), Peter moved chardev socket
> > connection to machine_done event. However, chardev created later will
> > no longer attempt to connect, and chardev created in tests do not have
> > machine_done event (breaking some of vhost-user-test).
> >
> > The goal was to move the "connect" source to the chardev frontend
> > context (the monitor thread context in his case). chr->gcontext is set
> > with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> > function will be called in general,
>
> Could you hint a case where we didn't use qemu_chr_fe_set_handlers()
> upon a chardev backend?  I thought it was always used in chardev
> frontends, and what the backend could do if without a frontend?

Well, you don't have to have a front-end to have side effects. Connect
will be attempted even without frontend. We may have users expecting
that behaviour, that might be considered a break if we change it.

(and unlikely, there might be frontends that are write only)

>
> [1]
>
> > so we can't delay connection until
> > then: the chardev should still attempt to connect during open(), using
> > the main context.
> >
> > An alternative would be to specify the iothread during chardev
> > creation. Setting up monitor OOB would be quite different too, it
> > would take the same iothread as argument.
> >
> > 99f2f54174a595e is also a bit problematic, since it will behave
> > differently before and after machine_done (the first case gives a
> > chance to use a different context reliably, the second looks racy)
> >
> > In the end, I am not sure this is all necessary, as chardev callbacks
> > are called after qemu_chr_fe_set_handlers(), at which point the
> > context of sources are updated. In "char-socket: update all ioc
> > handlers when changing context", I moved also the hup handler to the
> > updated context. So unless the main thread is already stuck, we can
> > setup a different context for the chardev at that time. Or not?
>
> IMHO the two patches that you reverted are special-cases for reasons.
>
> The TLS handshake is carried out with an TLS internal GSource which is
> not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't
> update that GSource (please refer to qio_channel_tls_handshake_task).

What can go wrong by using the default context for initial connection
and TLS handshake?

Presumably, you have a case where the mainloop is no longer processed
and that will hang the chardev?

> The async connection is carried out in a standalone thread that calls
> connect().  IMHO we'd better not update the gcontext bound to the
> async task since otherwise there'll be a race (IIRC I proposed
> something before using a mutex to update the gcontext, but Dan would
> prefer not to, and I followed with the suggestion which makes sense to
> me).
>
> Could we just postpone these machine done tasks into
> qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook,
> just like what I mentioned in the other thread)?  Though we'll be sure
> qemu_chr_fe_set_handlers() will be called for all chardev backends
> hence I asked question [1] above.

I would rather not to, if possible. unless we take the risk of
breaking current behaviour and review chardev usage in qemu.


-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-19 Thread Peter Xu
On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote:
> Hi,
> 
> In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
> (and its follow up 99f2f54174a59), Peter moved chardev socket
> connection to machine_done event. However, chardev created later will
> no longer attempt to connect, and chardev created in tests do not have
> machine_done event (breaking some of vhost-user-test).
> 
> The goal was to move the "connect" source to the chardev frontend
> context (the monitor thread context in his case). chr->gcontext is set
> with qemu_chr_fe_set_handlers(). But there is no guarantee that the
> function will be called in general,

Could you hint a case where we didn't use qemu_chr_fe_set_handlers()
upon a chardev backend?  I thought it was always used in chardev
frontends, and what the backend could do if without a frontend?

[1]

> so we can't delay connection until
> then: the chardev should still attempt to connect during open(), using
> the main context.
> 
> An alternative would be to specify the iothread during chardev
> creation. Setting up monitor OOB would be quite different too, it
> would take the same iothread as argument.
> 
> 99f2f54174a595e is also a bit problematic, since it will behave
> differently before and after machine_done (the first case gives a
> chance to use a different context reliably, the second looks racy)
> 
> In the end, I am not sure this is all necessary, as chardev callbacks
> are called after qemu_chr_fe_set_handlers(), at which point the
> context of sources are updated. In "char-socket: update all ioc
> handlers when changing context", I moved also the hup handler to the
> updated context. So unless the main thread is already stuck, we can
> setup a different context for the chardev at that time. Or not?

IMHO the two patches that you reverted are special-cases for reasons.

The TLS handshake is carried out with an TLS internal GSource which is
not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't
update that GSource (please refer to qio_channel_tls_handshake_task).

The async connection is carried out in a standalone thread that calls
connect().  IMHO we'd better not update the gcontext bound to the
async task since otherwise there'll be a race (IIRC I proposed
something before using a mutex to update the gcontext, but Dan would
prefer not to, and I followed with the suggestion which makes sense to
me).

Could we just postpone these machine done tasks into
qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook,
just like what I mentioned in the other thread)?  Though we'll be sure
qemu_chr_fe_set_handlers() will be called for all chardev backends
hence I asked question [1] above.

Regards,

-- 
Peter Xu



[Qemu-devel] [PATCH 0/4] Fix socket chardev regression

2018-08-17 Thread Marc-André Lureau
Hi,

In commit 25679e5d58e "chardev: tcp: postpone async connection setup"
(and its follow up 99f2f54174a59), Peter moved chardev socket
connection to machine_done event. However, chardev created later will
no longer attempt to connect, and chardev created in tests do not have
machine_done event (breaking some of vhost-user-test).

The goal was to move the "connect" source to the chardev frontend
context (the monitor thread context in his case). chr->gcontext is set
with qemu_chr_fe_set_handlers(). But there is no guarantee that the
function will be called in general, so we can't delay connection until
then: the chardev should still attempt to connect during open(), using
the main context.

An alternative would be to specify the iothread during chardev
creation. Setting up monitor OOB would be quite different too, it
would take the same iothread as argument.

99f2f54174a595e is also a bit problematic, since it will behave
differently before and after machine_done (the first case gives a
chance to use a different context reliably, the second looks racy)

In the end, I am not sure this is all necessary, as chardev callbacks
are called after qemu_chr_fe_set_handlers(), at which point the
context of sources are updated. In "char-socket: update all ioc
handlers when changing context", I moved also the hup handler to the
updated context. So unless the main thread is already stuck, we can
setup a different context for the chardev at that time. Or not?

Marc-André Lureau (4):
  Revert "chardev: tcp: postpone TLS work until machine done"
  Revert "chardev: tcp: postpone async connection setup"
  char-socket: update all ioc handlers when changing context
  test-char: add socket reconnect test

 chardev/char-socket.c | 86 ++-
 tests/test-char.c | 18 +++--
 2 files changed, 50 insertions(+), 54 deletions(-)

-- 
2.18.0.547.g1d89318c48