On Mon, 4 Jun 2018 16:15:23 +0100
Daniel P. Berrangé <berra...@redhat.com> wrote:

> On Mon, Jun 04, 2018 at 05:01:11PM +0200, Igor Mammedov wrote:
> > On Mon, 4 Jun 2018 16:21:48 +0200
> > Michal Privoznik <mpriv...@redhat.com> wrote:
> >   
> > > On 06/04/2018 02:03 PM, Daniel P. Berrangé wrote:  
> > > > When using --daemonize, the initial lead process will fork a child and
> > > > then wait to be notified that setup is complete via a pipe, before it
> > > > exits.  When using --preconfig there is an extra call to main_loop()
> > > > before the notification is done from os_setup_post(). Thus the parent
> > > > process won't exit until the mgmt application connects to the monitor
> > > > and tells QEMU to leave the RUN_STATE_PRECONFIG. The mgmt application
> > > > won't connect to the monitor until daemonizing has completed though.
> > > > 
> > > > This is a chicken and egg problem, leading to deadlock at startup.  
> > Not calling the first main_loop() solves the issue when --preconfig
> > is not specified, as Michal has suggested. So moving os_setup_post()
> > earlier isn't the only option.
> > 
> > With Michal's patch it should work fine with old libvirt versions,
> > however it would mean more changes to libvirt when adding
> > --preconfig option handling as it would need to connect to qmp
> > socket earlier if the option is used.  
> 
> This patch doesn't cause problems with old libvirt either - the opposite
> in fact, it fixes the problems that broke with old libvirt.
> 
> >   
> > > > The only viable way to fix this is to call os_setup_post() before
> > > > the early main_loop() call when in RUN_STATE_PRECONFIG. This has the
> > > > downside that any errors from this point onwards won't be handled
> > > > well by the mgmt application, because it will think QEMU has started
> > > > successfully, so not be expecting an abrupt exit. The only way to
> > > > deal with that is to move as much user input validation as possible
> > > > to before the main_loop() call. This is left as an exercise for
> > > > future interested developers.  
> > Moving post-board input validation might be problematic, as
> > it might require an existing board to create a device so that it could
> > verify user-provided parameters.
> > 
> > Does the mgmt application start QEMU with a log file where QEMU would
> > write any errors that occur after os_setup_post(), and would the mgmt
> > app look into it and report from it to the user if QEMU disappears?  
> 
> Sure any app can redirect stdout/err to a file if it wishes.
> 
> --daemonize was actually also a synchronization point - once the parent
> returns, the mgmt app knows it can successfully connect to the QEMU
> monitor as the listener socket is guaranteed to be created by then.
> So moving the os_setup_post earlier as I did still gives that sync
> point, which is good. It just means the mgmt app has to be aware
> that more errors might appear on stderr in the window until it exits
> the preconfig phase.
> 
> Ideally there would be a way to feed them back via the monitor, but
> that's non-trivial, so I doubt it'll happen any time in the foreseeable
> future.
The question is whether libvirt would notify the user (in its logs) about
an error that happens after the sync point.
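
To make the question concrete, here is a rough sketch (Python, not libvirt
code) of what I would expect the mgmt side to do: start the process with
stderr redirected to a log file and, if it exits unexpectedly, report what
it wrote. The helper name run_and_report() and the log path handling are
made up for illustration, and any command line can stand in for QEMU.

```python
import os
import subprocess


def run_and_report(argv, log_path):
    """Run argv with stderr appended to log_path.

    Returns (exit_code, stderr_text), where stderr_text is only what this
    run added to the log. Hypothetical helper, for illustration only.
    """
    # Remember where the log ended so we only report output from this run.
    start = os.path.getsize(log_path) if os.path.exists(log_path) else 0

    with open(log_path, "ab") as log:
        proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL, stderr=log)
        exit_code = proc.wait()

    with open(log_path, "rb") as log:
        log.seek(start)
        stderr_text = log.read().decode(errors="replace")

    return exit_code, stderr_text
```

If exit_code is non-zero after the sync point, stderr_text is what the mgmt
app would have to surface to the user.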

> > > > Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
> > > > ---
> > > >  vl.c | 7 ++++++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)    
> > > 
> > > Yup, this fixes the problem I've raised in my patch. Thanks!  
> > I'd prefer your patch, as it doesn't break error handling,  
> 
> Michal's patch didn't actually fix the real problem. It simply avoided
> triggering the bug when --preconfig was not present - QEMU still hangs
> when --preconfig and --daemonize are used together.
It's not exactly a bug; it's new behavior that mgmt could adapt to,
i.e. the lack of a sync point (which certainly complicates the mgmt part).

So this patch is also fine if libvirt is able to provide a meaningful
error to users in case of a problem after the os_setup_post() stage.
In both cases it would require some work on the mgmt side when adding
support for --preconfig.
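
For reference, the ordering issue discussed in this thread can be modelled
with two pipes and a fork. This is only a toy sketch in Python, not QEMU
code: launch() is a made-up name, the "ready" pipe stands in for the
daemonize notification, and the "mgmt" pipe stands in for the monitor
command that leaves the preconfig state. A timeout is used so the demo
reports the deadlock instead of hanging.

```python
import os
import select


def launch(notify_before_preconfig, timeout=0.5):
    """Return True if the parent received the 'setup complete' byte in time."""
    ready_r, ready_w = os.pipe()   # child -> parent: setup is complete
    mgmt_r, mgmt_w = os.pipe()     # stand-in for the monitor connection

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the daemonized process.
        os.close(ready_r)
        os.close(mgmt_w)
        if notify_before_preconfig:
            os.write(ready_w, b"\0")   # patched order: notify first
        os.read(mgmt_r, 1)             # stand-in for the preconfig main_loop()
        if not notify_before_preconfig:
            os.write(ready_w, b"\0")   # original order: notify afterwards
        os._exit(0)

    # Parent: plays the role of the lead process plus the mgmt application.
    os.close(ready_w)
    os.close(mgmt_r)
    readable, _, _ = select.select([ready_r], [], [], timeout)
    notified = bool(readable) and os.read(ready_r, 1) == b"\0"

    # A real mgmt app would only connect after being notified. Here the child
    # is always released at the end so the demo cleans up rather than hangs.
    os.write(mgmt_w, b"\0")
    os.waitpid(pid, 0)
    os.close(ready_r)
    os.close(mgmt_w)
    return notified
```

With notify_before_preconfig=False the parent times out, which models the
hang with --preconfig --daemonize; with True the parent is notified right
away and the sync point is preserved.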

