> ---
> Is it expected that this non-blocking condition implies lockup of the
> iothread?
No. The idea was to make the loop cheaper when you had a
qemu_notify_event() or bottom half, basically something that causes
main_loop_wait() to wake up immediately multiple times. When that
happens, it is cheaper to avoid releasing and taking the mutex.

Can you check why main_loop_wait() is returning a non-zero value
without making any progress?

Paolo

> Diving deeper again, I notice that this non-blocking feature isn't
> even enabled at all for KVM, which probably means that this bug is
> not reproducible by anyone testing with KVM. We could just make all
> the CPU backends consistent with KVM by removing the nonblocking
> mode altogether. Any comments from TCG people :) ?
>
> --- a/vl.c
> +++ b/vl.c
> @@ -2030,17 +2030,15 @@ static bool main_loop_should_exit(void)
>
>  static void main_loop(void)
>  {
> -    bool nonblocking;
>      int last_io = 0;
>  #ifdef CONFIG_PROFILER
>      int64_t ti;
>  #endif
>      do {
> -        nonblocking = !kvm_enabled() && last_io > 0;
>  #ifdef CONFIG_PROFILER
>          ti = profile_getclock();
>  #endif
> -        last_io = main_loop_wait(nonblocking);
> +        last_io = main_loop_wait(0);
>  #ifdef CONFIG_PROFILER
>          dev_time += profile_getclock() - ti;
>  #endif
>
> > Not the mux's: if mux_chr_can_read() returns zero, the prepare
> > function returns FALSE without touching the timeout at all...
> >
> > static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
> > {
> >     IOWatchPoll *iwp = io_watch_poll_from_source(source);
> >
> >     iwp->max_size = iwp->fd_can_read(iwp->opaque);
> >     if (iwp->max_size == 0) {
> >         return FALSE;
> >     }
> >
> >     return g_io_watch_funcs.prepare(source, timeout_);
> > }
> >
> >> - Timeout means no unlock of the iothread. Device land never sees
> >>   any more cycles, so the serial port never progresses - no
> >>   flushing of the buffer.
> >
> > Still, this is plausible, so the patch looks correct.
>
> Ok, I'll give it some more time and fix the commit message if we
> don't figure out a better patch.
>
> Regards,
> Peter
>
> > Paolo
> >
> >> - Deadlock
> >>
> >> Tested on the petalogix_ml605 microblazeel machine model, which was
> >> faulty due to bug 1154328.
> >>
> >> Fix by removing the conditions on unlocking the iothread. I don't
> >> know what else this will break, but the timeout is certainly the
> >> wrong condition for the unlock. The real solution is probably a
> >> more selective unlock policy.
> >>
> >> I'm happy for someone to take this patch off my hands, or to
> >> educate me on the correct implementation. For the people doing
> >> automated testing on -nographic platforms, this will get your
> >> build working again.
> >>
> >> Signed-off-by: Peter Crosthwaite <peter.crosthwa...@xilinx.com>
> >> ---
> >>  main-loop.c |    8 ++------
> >>  1 files changed, 2 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/main-loop.c b/main-loop.c
> >> index eb80ff3..a376898 100644
> >> --- a/main-loop.c
> >> +++ b/main-loop.c
> >> @@ -194,15 +194,11 @@ static int os_host_main_loop_wait(uint32_t timeout)
> >>
> >>      glib_pollfds_fill(&timeout);
> >>
> >> -    if (timeout > 0) {
> >> -        qemu_mutex_unlock_iothread();
> >> -    }
> >> +    qemu_mutex_unlock_iothread();
> >>
> >>      ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
> >>
> >> -    if (timeout > 0) {
> >> -        qemu_mutex_lock_iothread();
> >> -    }
> >> +    qemu_mutex_lock_iothread();
> >>
> >>      glib_pollfds_poll();
> >>      return ret;
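To make the locking pattern concrete, here is a minimal standalone
sketch. It is not QEMU code: the names iothread_mutex, device_thread
and main_loop_wait_once are hypothetical, and pthreads plus poll(2)
stand in for QEMU's own primitives. One thread owns a global lock and
sleeps in poll(); a second thread must take the same lock before it
can make the polled fd readable. Dropping the lock unconditionally
around the wait, as the main-loop.c hunk above does, lets the second
thread run while the first one sleeps.

/* Build with: gcc -pthread sketch.c -o sketch */
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t iothread_mutex = PTHREAD_MUTEX_INITIALIZER;
static int pipefds[2];  /* [0] polled by the main loop, [1] written by the device */

static void *device_thread(void *arg)
{
    (void)arg;
    /* Blocks until the main loop drops the lock around its poll(). */
    pthread_mutex_lock(&iothread_mutex);
    write(pipefds[1], "x", 1);          /* makes the polled fd readable */
    pthread_mutex_unlock(&iothread_mutex);
    return NULL;
}

/* One iteration of a toy main loop, shaped like the fixed os_host_main_loop_wait(). */
static int main_loop_wait_once(int timeout_ms)
{
    struct pollfd pfd = { .fd = pipefds[0], .events = POLLIN };
    int ret;

    pthread_mutex_unlock(&iothread_mutex);  /* unconditional, as in the patch */
    ret = poll(&pfd, 1, timeout_ms);
    pthread_mutex_lock(&iothread_mutex);

    return ret;                             /* > 0 once the device has written */
}

int main(void)
{
    pthread_t dev;
    char c;

    pipe(pipefds);
    pthread_mutex_lock(&iothread_mutex);    /* the main loop owns the big lock */
    pthread_create(&dev, NULL, device_thread, NULL);

    while (main_loop_wait_once(1000) <= 0) {
        /* The device thread can only get in while we sleep in poll(). */
    }

    read(pipefds[0], &c, 1);
    printf("progress: read '%c' from the device\n", c);

    pthread_mutex_unlock(&iothread_mutex);
    pthread_join(dev, NULL);
    return 0;
}

Conversely, if the unlock/lock pair is guarded by "if (timeout_ms > 0)"
and the loop spins calling main_loop_wait_once(0), device_thread never
acquires the mutex and the fd never becomes readable - the starvation
described in the commit message above.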