Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On Wed, Aug 22, 2018 at 11:46:32AM +0800, Peter Xu wrote: > On Tue, Aug 21, 2018 at 04:16:27PM +0200, Paolo Bonzini wrote: > > On 21/08/2018 16:04, Marc-André Lureau wrote: > > >> If you don't like the way I proposed, another thing I am > > >> thinking is that whether we can assign the gcontext for the chardev > > >> backend before initialization of it (or by parsing the backend & > > >> frontend relationships before init of backends), then we assure that > > >> we never change the gcontext of any chardev backends. Though that > > > Yes, I think that's a cleaner solution. I suggested to use an iothread > > > argument in the cover letter. > > > > That would be nice, but isn't it already too late for the monitor chardev? > > I think, yes, if we want to do this automatically. Sorry I forgot to mention some more details here. If to do it automatically, IMHO we can just split the mon_init_func() into parsing and init, then we do: (1) parse monitors, setup gcontext/... for chardev (2) init chardevs with the gcontext/... setup (3) init monitors Benefit is that we don't need to introduce new interface then. > Though if as > Marc-André suggested (which I didn't really notice first when reading > the cover letter), then maybe that's not a problem since user need to > manually specify the iothread for a chardev backend, then chardev > context will not depend on monitor code any more. > > Marc-André, do you want to propose your iothread interface? That > should be the easy way AFAIU, though that'll make the command line for > monitor out-of-band much longer (but it seems fine at least to me). > > Adding Markus too. > > > > > In any case, I don't see a reason to dislike this patch, especially > > since it comes with a testcase. > > AFAIU the test case didn't really test the non-NULL gcontext case, so > it fixed A (vhost-user reconnect) however it might break B (non-NULL > gcontext with a potential race). > > Regards, > > -- > Peter Xu Regards, -- Peter Xu
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On Tue, Aug 21, 2018 at 04:16:27PM +0200, Paolo Bonzini wrote: > On 21/08/2018 16:04, Marc-André Lureau wrote: > >> If you don't like the way I proposed, another thing I am > >> thinking is that whether we can assign the gcontext for the chardev > >> backend before initialization of it (or by parsing the backend & > >> frontend relationships before init of backends), then we assure that > >> we never change the gcontext of any chardev backends. Though that > > Yes, I think that's a cleaner solution. I suggested to use an iothread > > argument in the cover letter. > > That would be nice, but isn't it already too late for the monitor chardev? I think, yes, if we want to do this automatically. Though if as Marc-André suggested (which I didn't really notice first when reading the cover letter), then maybe that's not a problem since user need to manually specify the iothread for a chardev backend, then chardev context will not depend on monitor code any more. Marc-André, do you want to propose your iothread interface? That should be the easy way AFAIU, though that'll make the command line for monitor out-of-band much longer (but it seems fine at least to me). Adding Markus too. > > In any case, I don't see a reason to dislike this patch, especially > since it comes with a testcase. AFAIU the test case didn't really test the non-NULL gcontext case, so it fixed A (vhost-user reconnect) however it might break B (non-NULL gcontext with a potential race). Regards, -- Peter Xu
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On Tue, Aug 21, 2018 at 04:04:45PM +0200, Marc-André Lureau wrote: > Hi > > On Tue, Aug 21, 2018 at 8:29 AM Peter Xu wrote: > > I fully agree that current way is not ideal since basically the > > backend should not depend on the frontend, but now we have the > > gcontext as an exception then the backend will somehow depend on the > > frontend. If you don't like the way I proposed, another thing I am > > thinking is that whether we can assign the gcontext for the chardev > > backend before initialization of it (or by parsing the backend & > > frontend relationships before init of backends), then we assure that > > we never change the gcontext of any chardev backends. Though that > > Yes, I think that's a cleaner solution. I suggested to use an iothread > argument in the cover letter. > > Paolo, Daniel, any opinion? Never changing the GContext once initialized is much nicer. I think I had suggested that before when this code was first proposed. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On 21/08/2018 16:04, Marc-André Lureau wrote: >> If you don't like the way I proposed, another thing I am >> thinking is that whether we can assign the gcontext for the chardev >> backend before initialization of it (or by parsing the backend & >> frontend relationships before init of backends), then we assure that >> we never change the gcontext of any chardev backends. Though that > Yes, I think that's a cleaner solution. I suggested to use an iothread > argument in the cover letter. That would be nice, but isn't it already too late for the monitor chardev? In any case, I don't see a reason to dislike this patch, especially since it comes with a testcase. Paolo
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
Hi On Tue, Aug 21, 2018 at 8:29 AM Peter Xu wrote: > > On Mon, Aug 20, 2018 at 05:37:55PM +0200, Marc-André Lureau wrote: > > Hi > > > > On Mon, Aug 20, 2018 at 4:48 AM Peter Xu wrote: > > > > > > On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote: > > > > Hi, > > > > > > > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup" > > > > (and its follow up 99f2f54174a59), Peter moved chardev socket > > > > connection to machine_done event. However, chardev created later will > > > > no longer attempt to connect, and chardev created in tests do not have > > > > machine_done event (breaking some of vhost-user-test). > > > > > > > > The goal was to move the "connect" source to the chardev frontend > > > > context (the monitor thread context in his case). chr->gcontext is set > > > > with qemu_chr_fe_set_handlers(). But there is no guarantee that the > > > > function will be called in general, > > > > > > Could you hint a case where we didn't use qemu_chr_fe_set_handlers() > > > upon a chardev backend? I thought it was always used in chardev > > > frontends, and what the backend could do if without a frontend? > > > > Well, you don't have to have a front-end to have side effects. Connect > > will be attempted even without frontend. We may have users expecting > > that behaviour, that might be considered a break if we change it. > > > > (and unlikely, there might be frontends that are write only) > > My understanding is that qemu_chr_fe_set_handlers() is not only for > port read, but also for the rest. For example, we need to pass in the > correct IOEventHandler* to handle chardev backend events even if the > frontend only writes. > > > > > > > > > [1] > > > > > > > so we can't delay connection until > > > > then: the chardev should still attempt to connect during open(), using > > > > the main context. > > > > > > > > An alternative would be to specify the iothread during chardev > > > > creation. Setting up monitor OOB would be quite different too, it > > > > would take the same iothread as argument. > > > > > > > > 99f2f54174a595e is also a bit problematic, since it will behave > > > > differently before and after machine_done (the first case gives a > > > > chance to use a different context reliably, the second looks racy) > > > > > > > > In the end, I am not sure this is all necessary, as chardev callbacks > > > > are called after qemu_chr_fe_set_handlers(), at which point the > > > > context of sources are updated. In "char-socket: update all ioc > > > > handlers when changing context", I moved also the hup handler to the > > > > updated context. So unless the main thread is already stuck, we can > > > > setup a different context for the chardev at that time. Or not? > > > > > > IMHO the two patches that you reverted are special-cases for reasons. > > > > > > The TLS handshake is carried out with an TLS internal GSource which is > > > not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't > > > update that GSource (please refer to qio_channel_tls_handshake_task). > > > > What can go wrong by using the default context for initial connection > > and TLS handshake? > > > > Presumably, you have a case where the mainloop is no longer processed > > and that will hang the chardev? > > Yeah I don't see a big problem now, but I'm not sure. Actually it > should not be very hard to even migrate this one just like other > GSources, however the async one below should be a bit harder. > > > > > > The async connection is carried out in a standalone thread that calls > > > connect(). IMHO we'd better not update the gcontext bound to the > > > async task since otherwise there'll be a race (IIRC I proposed > > > something before using a mutex to update the gcontext, but Dan would > > > prefer not to, and I followed with the suggestion which makes sense to > > > me). > > > > > > Could we just postpone these machine done tasks into > > > qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook, > > > just like what I mentioned in the other thread)? Though we'll be sure > > > qemu_chr_fe_set_handlers() will be called for all chardev backends > > > hence I asked question [1] above. > > > > I would rather not to, if possible. unless we take the risk of > > breaking current behaviour and review chardev usage in qemu. > > Yeah, I'd be glad to know any of the behavior breakage if there is, > but I can't figure any out. AFAIU there should be none since we > should always be pairing a backend with a frontend. Even if we check all usage of chardev in internal qemu, there might be external users that expect that creating a chardev will attempt the connection immediately. > > I fully agree that current way is not ideal since basically the > backend should not depend on the frontend, but now we have the > gcontext as an exception then the backend will somehow depend on the > frontend. If you don't like the way I proposed, another thing I am > thinking is that whether we
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On Mon, Aug 20, 2018 at 05:37:55PM +0200, Marc-André Lureau wrote: > Hi > > On Mon, Aug 20, 2018 at 4:48 AM Peter Xu wrote: > > > > On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote: > > > Hi, > > > > > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup" > > > (and its follow up 99f2f54174a59), Peter moved chardev socket > > > connection to machine_done event. However, chardev created later will > > > no longer attempt to connect, and chardev created in tests do not have > > > machine_done event (breaking some of vhost-user-test). > > > > > > The goal was to move the "connect" source to the chardev frontend > > > context (the monitor thread context in his case). chr->gcontext is set > > > with qemu_chr_fe_set_handlers(). But there is no guarantee that the > > > function will be called in general, > > > > Could you hint a case where we didn't use qemu_chr_fe_set_handlers() > > upon a chardev backend? I thought it was always used in chardev > > frontends, and what the backend could do if without a frontend? > > Well, you don't have to have a front-end to have side effects. Connect > will be attempted even without frontend. We may have users expecting > that behaviour, that might be considered a break if we change it. > > (and unlikely, there might be frontends that are write only) My understanding is that qemu_chr_fe_set_handlers() is not only for port read, but also for the rest. For example, we need to pass in the correct IOEventHandler* to handle chardev backend events even if the frontend only writes. > > > > > [1] > > > > > so we can't delay connection until > > > then: the chardev should still attempt to connect during open(), using > > > the main context. > > > > > > An alternative would be to specify the iothread during chardev > > > creation. Setting up monitor OOB would be quite different too, it > > > would take the same iothread as argument. > > > > > > 99f2f54174a595e is also a bit problematic, since it will behave > > > differently before and after machine_done (the first case gives a > > > chance to use a different context reliably, the second looks racy) > > > > > > In the end, I am not sure this is all necessary, as chardev callbacks > > > are called after qemu_chr_fe_set_handlers(), at which point the > > > context of sources are updated. In "char-socket: update all ioc > > > handlers when changing context", I moved also the hup handler to the > > > updated context. So unless the main thread is already stuck, we can > > > setup a different context for the chardev at that time. Or not? > > > > IMHO the two patches that you reverted are special-cases for reasons. > > > > The TLS handshake is carried out with an TLS internal GSource which is > > not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't > > update that GSource (please refer to qio_channel_tls_handshake_task). > > What can go wrong by using the default context for initial connection > and TLS handshake? > > Presumably, you have a case where the mainloop is no longer processed > and that will hang the chardev? Yeah I don't see a big problem now, but I'm not sure. Actually it should not be very hard to even migrate this one just like other GSources, however the async one below should be a bit harder. > > > The async connection is carried out in a standalone thread that calls > > connect(). IMHO we'd better not update the gcontext bound to the > > async task since otherwise there'll be a race (IIRC I proposed > > something before using a mutex to update the gcontext, but Dan would > > prefer not to, and I followed with the suggestion which makes sense to > > me). > > > > Could we just postpone these machine done tasks into > > qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook, > > just like what I mentioned in the other thread)? Though we'll be sure > > qemu_chr_fe_set_handlers() will be called for all chardev backends > > hence I asked question [1] above. > > I would rather not to, if possible. unless we take the risk of > breaking current behaviour and review chardev usage in qemu. Yeah, I'd be glad to know any of the behavior breakage if there is, but I can't figure any out. AFAIU there should be none since we should always be pairing a backend with a frontend. I fully agree that current way is not ideal since basically the backend should not depend on the frontend, but now we have the gcontext as an exception then the backend will somehow depend on the frontend. If you don't like the way I proposed, another thing I am thinking is that whether we can assign the gcontext for the chardev backend before initialization of it (or by parsing the backend & frontend relationships before init of backends), then we assure that we never change the gcontext of any chardev backends. Though that will require that we need to setup all possible gcontexts before hand (e.g., the monitor gcontext). Then we can drop all these dynamic binding magics (but just
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
Hi On Mon, Aug 20, 2018 at 4:48 AM Peter Xu wrote: > > On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote: > > Hi, > > > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup" > > (and its follow up 99f2f54174a59), Peter moved chardev socket > > connection to machine_done event. However, chardev created later will > > no longer attempt to connect, and chardev created in tests do not have > > machine_done event (breaking some of vhost-user-test). > > > > The goal was to move the "connect" source to the chardev frontend > > context (the monitor thread context in his case). chr->gcontext is set > > with qemu_chr_fe_set_handlers(). But there is no guarantee that the > > function will be called in general, > > Could you hint a case where we didn't use qemu_chr_fe_set_handlers() > upon a chardev backend? I thought it was always used in chardev > frontends, and what the backend could do if without a frontend? Well, you don't have to have a front-end to have side effects. Connect will be attempted even without frontend. We may have users expecting that behaviour, that might be considered a break if we change it. (and unlikely, there might be frontends that are write only) > > [1] > > > so we can't delay connection until > > then: the chardev should still attempt to connect during open(), using > > the main context. > > > > An alternative would be to specify the iothread during chardev > > creation. Setting up monitor OOB would be quite different too, it > > would take the same iothread as argument. > > > > 99f2f54174a595e is also a bit problematic, since it will behave > > differently before and after machine_done (the first case gives a > > chance to use a different context reliably, the second looks racy) > > > > In the end, I am not sure this is all necessary, as chardev callbacks > > are called after qemu_chr_fe_set_handlers(), at which point the > > context of sources are updated. In "char-socket: update all ioc > > handlers when changing context", I moved also the hup handler to the > > updated context. So unless the main thread is already stuck, we can > > setup a different context for the chardev at that time. Or not? > > IMHO the two patches that you reverted are special-cases for reasons. > > The TLS handshake is carried out with an TLS internal GSource which is > not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't > update that GSource (please refer to qio_channel_tls_handshake_task). What can go wrong by using the default context for initial connection and TLS handshake? Presumably, you have a case where the mainloop is no longer processed and that will hang the chardev? > The async connection is carried out in a standalone thread that calls > connect(). IMHO we'd better not update the gcontext bound to the > async task since otherwise there'll be a race (IIRC I proposed > something before using a mutex to update the gcontext, but Dan would > prefer not to, and I followed with the suggestion which makes sense to > me). > > Could we just postpone these machine done tasks into > qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook, > just like what I mentioned in the other thread)? Though we'll be sure > qemu_chr_fe_set_handlers() will be called for all chardev backends > hence I asked question [1] above. I would rather not to, if possible. unless we take the risk of breaking current behaviour and review chardev usage in qemu. -- Marc-André Lureau
Re: [Qemu-devel] [PATCH 0/4] Fix socket chardev regression
On Fri, Aug 17, 2018 at 03:52:20PM +0200, Marc-André Lureau wrote: > Hi, > > In commit 25679e5d58e "chardev: tcp: postpone async connection setup" > (and its follow up 99f2f54174a59), Peter moved chardev socket > connection to machine_done event. However, chardev created later will > no longer attempt to connect, and chardev created in tests do not have > machine_done event (breaking some of vhost-user-test). > > The goal was to move the "connect" source to the chardev frontend > context (the monitor thread context in his case). chr->gcontext is set > with qemu_chr_fe_set_handlers(). But there is no guarantee that the > function will be called in general, Could you hint a case where we didn't use qemu_chr_fe_set_handlers() upon a chardev backend? I thought it was always used in chardev frontends, and what the backend could do if without a frontend? [1] > so we can't delay connection until > then: the chardev should still attempt to connect during open(), using > the main context. > > An alternative would be to specify the iothread during chardev > creation. Setting up monitor OOB would be quite different too, it > would take the same iothread as argument. > > 99f2f54174a595e is also a bit problematic, since it will behave > differently before and after machine_done (the first case gives a > chance to use a different context reliably, the second looks racy) > > In the end, I am not sure this is all necessary, as chardev callbacks > are called after qemu_chr_fe_set_handlers(), at which point the > context of sources are updated. In "char-socket: update all ioc > handlers when changing context", I moved also the hup handler to the > updated context. So unless the main thread is already stuck, we can > setup a different context for the chardev at that time. Or not? IMHO the two patches that you reverted are special-cases for reasons. The TLS handshake is carried out with an TLS internal GSource which is not owned by the chardev code, so the qemu_chr_fe_set_handlers() won't update that GSource (please refer to qio_channel_tls_handshake_task). The async connection is carried out in a standalone thread that calls connect(). IMHO we'd better not update the gcontext bound to the async task since otherwise there'll be a race (IIRC I proposed something before using a mutex to update the gcontext, but Dan would prefer not to, and I followed with the suggestion which makes sense to me). Could we just postpone these machine done tasks into qemu_chr_fe_set_handlers() (or say, chr_update_read_handler() hook, just like what I mentioned in the other thread)? Though we'll be sure qemu_chr_fe_set_handlers() will be called for all chardev backends hence I asked question [1] above. Regards, -- Peter Xu
[Qemu-devel] [PATCH 0/4] Fix socket chardev regression
Hi, In commit 25679e5d58e "chardev: tcp: postpone async connection setup" (and its follow up 99f2f54174a59), Peter moved chardev socket connection to machine_done event. However, chardev created later will no longer attempt to connect, and chardev created in tests do not have machine_done event (breaking some of vhost-user-test). The goal was to move the "connect" source to the chardev frontend context (the monitor thread context in his case). chr->gcontext is set with qemu_chr_fe_set_handlers(). But there is no guarantee that the function will be called in general, so we can't delay connection until then: the chardev should still attempt to connect during open(), using the main context. An alternative would be to specify the iothread during chardev creation. Setting up monitor OOB would be quite different too, it would take the same iothread as argument. 99f2f54174a595e is also a bit problematic, since it will behave differently before and after machine_done (the first case gives a chance to use a different context reliably, the second looks racy) In the end, I am not sure this is all necessary, as chardev callbacks are called after qemu_chr_fe_set_handlers(), at which point the context of sources are updated. In "char-socket: update all ioc handlers when changing context", I moved also the hup handler to the updated context. So unless the main thread is already stuck, we can setup a different context for the chardev at that time. Or not? Marc-André Lureau (4): Revert "chardev: tcp: postpone TLS work until machine done" Revert "chardev: tcp: postpone async connection setup" char-socket: update all ioc handlers when changing context test-char: add socket reconnect test chardev/char-socket.c | 86 ++- tests/test-char.c | 18 +++-- 2 files changed, 50 insertions(+), 54 deletions(-) -- 2.18.0.547.g1d89318c48