On 2012-03-05 09:34, Paolo Bonzini wrote:
> This is quite ugly.  Two threads, one running main_loop_wait and
> one running qemu_aio_wait, can race with each other on running the
> same iohandler.  The result is that an iohandler could run while the
> underlying socket is not readable or writable, with possibly ill effects.

Hmm, isn't it a problem already that a socket is polled by two threads
at the same time? Can't that be avoided?

Long-term, I'd like to cut out certain file descriptors from the main
loop and process them completely in separate threads (for separate
locking, prioritization etc.). Dunno how NBD works, but maybe it should
be reworked like this already.

Jan

>
> This shows as a failure to boot an IDE disk using the NBD device.
> We can consider it a bug in NBD or in the main loop.  The patch fixes
> this in main_loop_wait, which is always going to lose the race because
> qemu_aio_wait runs select with the global lock held.
>
> Reported-by: Laurent Vivier <laur...@vivier.eu>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>         Anthony, if you think this is too ugly tell me and I can
>         post an NBD fix too.
>
>  main-loop.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/main-loop.c b/main-loop.c
> index db23de0..3beccff 100644
> --- a/main-loop.c
> +++ b/main-loop.c
> @@ -458,6 +458,13 @@ int main_loop_wait(int nonblocking)
>
>      if (timeout > 0) {
>          qemu_mutex_lock_iothread();
> +
> +        /* Poll again.  A qemu_aio_wait() on another thread
> +         * could have made the fdsets stale.
> +         */
> +        tv.tv_sec = 0;
> +        tv.tv_usec = 0;
> +        ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv);
>      }
>
>      glib_select_poll(&rfds, &wfds, &xfds, (ret < 0));

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
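
For illustration only, the stale-fdset effect that the patch comment describes can be reproduced without any threads. The sketch below is not QEMU code: `readable_now()` and `stale_fdset_demo()` are hypothetical names, and a pipe stands in for the NBD socket. It polls once with select(), lets a "second consumer" drain the descriptor, and then compares the old fd_set with a fresh zero-timeout select(), which is roughly what the added hunk does after taking the iothread lock.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not from QEMU): zero-timeout select() on one fd.
 * Returns 1 if fd is readable right now, 0 if not, -1 on error. */
static int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: poll, never block */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0) {
        return -1;
    }
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Single-threaded stand-in for the race: the read() in the middle plays
 * the role of the other thread consuming the data.
 * On success returns 0, with *stale set from the old fd_set and *fresh
 * set from a second, zero-timeout poll.  Returns -1 on any failure. */
static int stale_fdset_demo(int *stale, int *fresh)
{
    int fds[2];
    int rc = -1;
    char byte = 'x';
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    assert(stale != NULL && fresh != NULL);
    if (pipe(fds) < 0) {
        return -1;
    }
    if (write(fds[1], &byte, 1) != 1) {
        goto out;
    }

    /* First poll: the read end is reported readable. */
    FD_ZERO(&rfds);
    FD_SET(fds[0], &rfds);
    if (select(fds[0] + 1, &rfds, NULL, NULL, &tv) != 1) {
        goto out;
    }

    /* "Other thread" drains the descriptor before we act on rfds. */
    if (read(fds[0], &byte, 1) != 1) {
        goto out;
    }

    *stale = FD_ISSET(fds[0], &rfds) ? 1 : 0;  /* old result, now wrong */
    *fresh = readable_now(fds[0]);             /* re-poll sees the truth */
    rc = 0;
out:
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Calling `stale_fdset_demo(&stale, &fresh)` should leave `stale` at 1 and `fresh` at 0: the old fd_set still claims the descriptor is readable, while the re-poll does not. The re-poll only narrows the window, though. Without a lock held across both the poll and the handler dispatch, another consumer could still get in between, which is the concern about two threads polling the same socket.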