> ---
> Is it expected that this non-blocking condition implies lockup of the
> iothread?

No.  The idea was to make the loop cheaper when you had a qemu_notify_event()
or bottom half, basically something that causes main_loop_wait() to wake up
immediately multiple times.  When that happens, it is cheaper to avoid
releasing and taking the mutex.
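
To make the trade-off concrete, here is a small standalone model (not QEMU code; every name in it is hypothetical). It assumes the loop polls non-blockingly whenever the previous pass reported activity, and that the lock is only dropped around a blocking poll. Under those assumptions, a loop whose wait keeps returning non-zero releases the lock once and then never again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the iothread mutex: it only counts releases. */
struct model_lock {
    int releases;
};

/*
 * Model of one poll step. A non-blocking poll (timeout == 0) keeps the
 * lock; a blocking poll drops it around the wait.
 */
static void model_poll(struct model_lock *lock, int timeout)
{
    if (timeout > 0) {
        lock->releases++;   /* unlock before blocking, relock afterwards */
    }
}

/*
 * Run `iterations` loop passes. ready[i] is the assumed return value of
 * pass i; a non-zero value makes the next pass non-blocking, mirroring
 * "nonblocking = last_io > 0". Returns how often the lock was released.
 */
static int model_main_loop(const int *ready, int iterations)
{
    struct model_lock lock = { 0 };
    int last_io = 0;

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        bool nonblocking = last_io > 0;

        model_poll(&lock, nonblocking ? 0 : 1000);
        last_io = ready[i];
    }
    return lock.releases;
}
```

With an occasional wakeup this saves a few lock round trips; with a wait that reports activity on every pass, the lock is effectively held forever, which matches the symptom described in the report.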

Can you check why main_loop_wait() is returning a non-zero value without
making any progress?

Paolo

> Diving deeper again, I notice that this non-blocking feature isn't
> even enabled at all for KVM. Which would probably mean that this bug
> is not replicable by anyone testing with KVM. We could just make all
> the CPU backends consistent with KVM by removing the nonblocking
> altogether. Any comments from TCG people :) ?
> 
> --- a/vl.c
> +++ b/vl.c
> @@ -2030,17 +2030,15 @@ static bool main_loop_should_exit(void)
> 
>  static void main_loop(void)
>  {
> -    bool nonblocking;
>      int last_io = 0;
>  #ifdef CONFIG_PROFILER
>      int64_t ti;
>  #endif
>      do {
> -        nonblocking = !kvm_enabled() && last_io > 0;
>  #ifdef CONFIG_PROFILER
>          ti = profile_getclock();
>  #endif
> -        last_io = main_loop_wait(nonblocking);
> +        last_io = main_loop_wait(0);
>  #ifdef CONFIG_PROFILER
>          dev_time += profile_getclock() - ti;
>  #endif
> 
> > Not the mux's: if mux_chr_can_read()
> > returns zero, the prepare function returns FALSE without touching the
> > timeout at all...
> >
> > static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
> > {
> >     IOWatchPoll *iwp = io_watch_poll_from_source(source);
> >
> >     iwp->max_size = iwp->fd_can_read(iwp->opaque);
> >     if (iwp->max_size == 0) {
> >         return FALSE;
> >     }
> >
> >     return g_io_watch_funcs.prepare(source, timeout_);
> > }
> >
> >> - Timeout means no unlock of IOthread. Device land never sees any more
> >>       cycles so the serial port never progresses - no flushing of
> >>       buffer
> >
> > Still, this is plausible, so the patch looks correct.
> >
> 
> Ok,
> 
> I'll give it some more time and fix the commit message if we don't figure
> out a better patch.
> 
> Regards,
> Peter
> 
> > Paolo
> >
> >> - Deadlock
> >>
> >> Tested on petalogix_ml605 microblazeel machine model, which was faulty
> >> due to 1154328.
> >>
> >> Fix by removing the conditions on unlocking the iothread. Don't know
> >> what else this will break but the timeout is certainly the wrong
> >> condition for the unlock. Probably the real solution is to have a more
> >> selective unlock policy.
> >>
> >> I'm happy for someone to take this patch off my hands, or educate me on
> >> the correct implementation. For the peeps doing automated testing on
> >> nographic platforms this will get your build working again.
> >>
> >> Signed-off-by: Peter Crosthwaite <peter.crosthwa...@xilinx.com>
> >> ---
> >>  main-loop.c |    8 ++------
> >>  1 files changed, 2 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/main-loop.c b/main-loop.c
> >> index eb80ff3..a376898 100644
> >> --- a/main-loop.c
> >> +++ b/main-loop.c
> >> @@ -194,15 +194,11 @@ static int os_host_main_loop_wait(uint32_t timeout)
> >>
> >>      glib_pollfds_fill(&timeout);
> >>
> >> -    if (timeout > 0) {
> >> -        qemu_mutex_unlock_iothread();
> >> -    }
> >> +    qemu_mutex_unlock_iothread();
> >>
> >>      ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
> >>
> >> -    if (timeout > 0) {
> >> -        qemu_mutex_lock_iothread();
> >> -    }
> >> +    qemu_mutex_lock_iothread();
> >>
> >>      glib_pollfds_poll();
> >>      return ret;
> >>
> >
> >
> 
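
The effect of the quoted main-loop.c change can be sketched with a toy model. This is only an illustration under stated assumptions: the names are hypothetical, the "other thread" is simulated inline rather than being a real thread, and the time spent in g_poll() is reduced to a single call. It compares dropping the lock only when timeout > 0 with dropping it unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in QEMU. */
struct model_state {
    bool lock_held;     /* true while the "iothread" owns the lock */
    int device_steps;   /* work done by the other thread while unlocked */
};

/* The other thread can only make progress while the lock is free. */
static void other_thread_try_run(struct model_state *s)
{
    if (!s->lock_held) {
        s->device_steps++;
    }
}

/*
 * One poll step. With unconditional == false the lock is dropped only for
 * timeout > 0 (the pre-patch condition); with unconditional == true it is
 * always dropped around the poll (the behaviour after the patch).
 */
static void model_wait(struct model_state *s, unsigned timeout,
                       bool unconditional)
{
    bool drop = unconditional || timeout > 0;

    if (drop) {
        s->lock_held = false;
    }
    other_thread_try_run(s);    /* stands in for the time spent polling */
    if (drop) {
        s->lock_held = true;
    }
}

/* Run `passes` poll steps and report how much work the other thread did. */
static int model_run(int passes, unsigned timeout, bool unconditional)
{
    struct model_state s = { .lock_held = true, .device_steps = 0 };

    assert(passes >= 0);
    for (int i = 0; i < passes; i++) {
        model_wait(&s, timeout, unconditional);
    }
    return s.device_steps;
}
```

In this model, model_run(5, 0, false) leaves the other thread with no progress at all, while model_run(5, 0, true) lets it run on every pass. The model says nothing about the cost of the extra unlock/lock pairs, which is the reason the condition was there in the first place.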
