On Tue, 22 Mar 2016 13:11:05 +0100 Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 22/03/2016 12:59, Cornelia Huck wrote:
> >> > They can be fixed with just an extra object_ref/object_unref.
> >> >
> >> > I didn't understand that Tu Bo also needed the BH fix, and with that
> >> > information it makes sense. Passing the assign value ensures that
> >> > ioeventfd remains always assigned. With the CPU threads out of the
> >> > picture, the BH becomes enough to make everything thread-safe.
> > Yes, this makes sense.
> >
> > Might we still have a hole somewhere in dataplane teardown? Probably
> > not, from reading the code, even if it runs in cpu thread context.
>
> The bug arises when the main thread sets started = true, a CPU thread
> comes along while the ioeventfd is reset, and as soon as the BQL is
> released by the main thread the CPU thread thinks it is a dataplane
> thread. This does horrible things to non-reentrant code. For stop we
> should be safe because the CPU thread is the one that sets started = false.
>
> IOW, we should be safe as long as the ioeventfd is never unassigned
> (your fix) _and_ we ensure serialization between threads that touch
> dataplane_started (Fam's fix).

We should really add something like that explanation to the changelog so
that future generations may understand what's going on here :)

So, what do we do for 2.6? A respin of Fam's fix + my refactoring (with
some interface doc added)? I'd still like some reviews and maybe a test
on virtio-mmio.