simple: fix hang in child after fork(2)

Eric Blake Tue, 24 Jul 2018 08:17:35 -0700

On 07/24/2018 09:25 AM, Stefan Hajnoczi wrote:

The simple trace backend spawns a write-out thread which is used to
asynchronously flush the in-memory ring buffer to disk.


fork(2) does not clone all threads, only the thread that invoked
fork(2).  As a result there is no write-out thread in the child process!

This causes a hang during shutdown when atexit(3) handler installed by
the simple trace backend waits for the non-existent write-out thread.

This patch uses pthread_atfork(3) to terminate the write-out thread
before fork and restart it in both the parent and child after fork.
This solves a hang in qemu-iotests 147 due to qemu-nbd --fork usage.

Reported-by: Cornelia Huck <coh...@redhat.com>
Tested-by: Cornelia Huck <coh...@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
Message-id: 20180717101944.11691-1-stefa...@redhat.com
Suggested-by: Paolo Bonzini <pbonz...@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
---

+static void restart_writeout_thread(void)
+{
+    trace_writeout_running = true;
+    trace_writeout_thread = trace_thread_create(writeout_thread);
+    if (!trace_writeout_thread) {
+        warn_report("unable to initialize simple trace backend");
+    }
+
+    /* This relies on undefined behavior in the fork() child (it's fine in the
+     * fork() parent).  g_mutex_unlock() on a mutex acquired by another thread
+     * is undefined (see glib documentation).
+     */
+    g_mutex_unlock(&trace_lock);

Dan's point about stopping tracing prior to fork, then restarting it
from scratch in both the parent and in specific children, would also get
rid of this risky non-portable behavior of trying to manipulate a mutex
acquired by the parent process' thread.
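
For illustration, here is a rough sketch of what "stop before fork, restart
from scratch" could look like with plain pthreads. It is not the code from
this patch: every name in it (worker, start_worker, stop_worker,
install_fork_handlers, fork_and_check) is made up for the example, and it
uses a pthread mutex and condition variable rather than the glib primitives
in trace/simple.c. The point it tries to show is that the prepare handler
fully joins the thread and returns with no lock held, so nothing in the
child ever has to unlock a mutex that another thread acquired.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the trace backend's state. */
static pthread_t worker;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;
static bool worker_running;   /* protected by lock */
static bool worker_alive;     /* only touched by the controlling thread */

static void *worker_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (worker_running) {
        /* A real backend would flush its ring buffer after waking up. */
        pthread_cond_wait(&wake, &lock);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

static int start_worker(void)
{
    pthread_mutex_lock(&lock);
    worker_running = true;
    pthread_mutex_unlock(&lock);

    int rc = pthread_create(&worker, NULL, worker_fn, NULL);
    worker_alive = (rc == 0);
    return rc;
}

static void stop_worker(void)
{
    if (!worker_alive) {
        return;
    }
    pthread_mutex_lock(&lock);
    worker_running = false;
    pthread_cond_signal(&wake);
    pthread_mutex_unlock(&lock);

    /* After the join, no thread holds the lock or waits on the condvar. */
    pthread_join(worker, NULL);
    worker_alive = false;
}

/* Runs in both the parent and the child once fork() has returned. */
static void restart_after_fork(void)
{
    if (start_worker() != 0) {
        fprintf(stderr, "unable to restart worker thread after fork\n");
    }
}

static int install_fork_handlers(void)
{
    /* prepare: stop the thread; parent and child: start a fresh one. */
    return pthread_atfork(stop_worker, restart_after_fork, restart_after_fork);
}

/* Fork once; the child reports whether it ended up with its own worker. */
static int fork_and_check(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        bool ok = worker_alive;
        stop_worker();          /* would hang if the thread did not exist */
        _exit(ok ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would run install_fork_handlers() once, then start_worker(), and
fork as usual. One caveat of the sketch: creating a thread from the child
handler assumes no other threads or held locks matter at fork time, since
POSIX only promises async-signal-safe functions in the child of a
multithreaded process.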


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PULL for-3.0 1/1] trace/simple: fix hang in child after fork(2)