Re: [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
On 2016-03-23 17:27, Alex Bennée wrote: > > KONRAD Frederic writes: > >> Hi Alex, >> >> Thanks for having pulling all that together! >> About this patch the original author was Jan Kiszka >> (https://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03323.html) >> >> This has probably been dropped during a rebase. > > I'll amend the author on v2. > > Jan, > > The original didn't have a s-o-b tag from you. Are you happy to give one > for its current iteration? You can consider it Signed-off-by: Jan Kiszka At that point, I didn't consider it ready for upstream (see the log), so I left out the signature. Jan > >> >> Thanks, >> Fred >> >> Le 18/03/2016 17:18, Alex Bennée a écrit : >>> From: KONRAD Frederic >>> >>> This finally allows TCG to benefit from the iothread introduction: Drop >>> the global mutex while running pure TCG CPU code. Reacquire the lock >>> when entering MMIO or PIO emulation, or when leaving the TCG loop. >>> >>> We have to revert a few optimization for the current TCG threading >>> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not >>> kicking it in qemu_cpu_kick. We also need to disable RAM block >>> reordering until we have a more efficient locking mechanism at hand. >>> >>> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. >>> These numbers demonstrate where we gain something: >>> >>> 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm >>> 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm >>> >>> The guest CPU was fully loaded, but the iothread could still run mostly >>> independent on a second core. Without the patch we don't get beyond >>> >>> 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm >>> 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm >>> >>> We don't benefit significantly, though, when the guest is not fully >>> loading a host CPU. >>> >>> Message-Id: <1439220437-23957-10-git-send-email-fred.kon...@greensocs.com> >>> Signed-off-by: KONRAD Frederic >>> [AJB: -smp single-threaded fix, rm old info from commit msg] >>> Signed-off-by: Alex Bennée >>> >>> --- >>> v1 (ajb): >>>- SMP failure now fixed by previous commit >>> Changes from Fred Konrad (mttcg-v7 via paolo): >>>* Rebase on the current HEAD. >>>* Fixes a deadlock in qemu_devices_reset(). >>>* Remove the mutex in address_space_* >>> --- >>> cpu-exec.c | 10 ++ >>> cpus.c | 26 +++--- >>> cputlb.c | 4 >>> exec.c | 12 >>> hw/i386/kvmvapic.c | 3 +++ >>> softmmu_template.h | 17 + >>> translate-all.c| 11 +-- >>> 7 files changed, 58 insertions(+), 25 deletions(-) >>> >>> diff --git a/cpu-exec.c b/cpu-exec.c >>> index 3572256..76891fd 100644 >>> --- a/cpu-exec.c >>> +++ b/cpu-exec.c >>> @@ -28,6 +28,7 @@ >>> #include "qemu/rcu.h" >>> #include "exec/tb-hash.h" >>> #include "exec/log.h" >>> +#include "qemu/main-loop.h" >>> #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY) >>> #include "hw/i386/apic.h" >>> #endif >>> @@ -449,6 +450,13 @@ int cpu_exec(CPUState *cpu) >>> for(;;) { >>> interrupt_request = cpu->interrupt_request; >>> if (unlikely(interrupt_request)) { >>> +/* FIXME: this needs to take the iothread lock. >>> + * For this we need to find all places in >>> + * cc->cpu_exec_interrupt that can call cpu_loop_exit, >>> + * and call qemu_unlock_iothread_mutex() there. Else, >>> + * add a flag telling cpu_loop_exit() to unlock it. >>> + */ >>> + >>> if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) { >>> /* Mask out external interrupts for this step. */ >>> interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK; >>> @@ -501,6 +509,7 @@ int cpu_exec(CPUState *cpu) >>> the program flow was changed */ >>> next_tb = 0; >>> } >>> + >>> } >>> if (unlikely(cpu->exit_request >>>|| replay_has_interrupt())) { >>> @@ -618,6 +627,7 @@ int cpu_exec(CPUState *cpu) >>> g_assert(env == &x86_cpu->env); >>> #endif >>> #endif /* buggy compiler */ >>> + >>> cpu->can_do_io = 1; >>> tb_lock_reset(); >>> } >>> diff --git a/cpus.c b/cpus.c >>> index a87fbf9..0e3d5f9 100644 >>> --- a/cpus.c >>> +++ b/cpus.c >>> @@ -911,8 +911,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu) >>> #endif /* _WIN32 */ >>> >>> static QemuMutex qemu_global_mutex; >>> -static QemuCond qemu_io_proceeded_cond; >>> -static unsigned iothread_requesting_mutex; >>> >>> static QemuThread io_thread; >>> >>> @@ -9
Re: [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
KONRAD Frederic writes: > Hi Alex, > > Thanks for having pulling all that together! > About this patch the original author was Jan Kiszka > (https://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03323.html) > > This has probably been dropped during a rebase. I'll amend the author on v2. Jan, The original didn't have a s-o-b tag from you. Are you happy to give one for its current iteration? > > Thanks, > Fred > > Le 18/03/2016 17:18, Alex Bennée a écrit : >> From: KONRAD Frederic >> >> This finally allows TCG to benefit from the iothread introduction: Drop >> the global mutex while running pure TCG CPU code. Reacquire the lock >> when entering MMIO or PIO emulation, or when leaving the TCG loop. >> >> We have to revert a few optimization for the current TCG threading >> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not >> kicking it in qemu_cpu_kick. We also need to disable RAM block >> reordering until we have a more efficient locking mechanism at hand. >> >> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. >> These numbers demonstrate where we gain something: >> >> 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm >> 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm >> >> The guest CPU was fully loaded, but the iothread could still run mostly >> independent on a second core. Without the patch we don't get beyond >> >> 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm >> 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm >> >> We don't benefit significantly, though, when the guest is not fully >> loading a host CPU. >> >> Message-Id: <1439220437-23957-10-git-send-email-fred.kon...@greensocs.com> >> Signed-off-by: KONRAD Frederic >> [AJB: -smp single-threaded fix, rm old info from commit msg] >> Signed-off-by: Alex Bennée >> >> --- >> v1 (ajb): >>- SMP failure now fixed by previous commit >> Changes from Fred Konrad (mttcg-v7 via paolo): >>* Rebase on the current HEAD. >>* Fixes a deadlock in qemu_devices_reset(). >>* Remove the mutex in address_space_* >> --- >> cpu-exec.c | 10 ++ >> cpus.c | 26 +++--- >> cputlb.c | 4 >> exec.c | 12 >> hw/i386/kvmvapic.c | 3 +++ >> softmmu_template.h | 17 + >> translate-all.c| 11 +-- >> 7 files changed, 58 insertions(+), 25 deletions(-) >> >> diff --git a/cpu-exec.c b/cpu-exec.c >> index 3572256..76891fd 100644 >> --- a/cpu-exec.c >> +++ b/cpu-exec.c >> @@ -28,6 +28,7 @@ >> #include "qemu/rcu.h" >> #include "exec/tb-hash.h" >> #include "exec/log.h" >> +#include "qemu/main-loop.h" >> #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY) >> #include "hw/i386/apic.h" >> #endif >> @@ -449,6 +450,13 @@ int cpu_exec(CPUState *cpu) >> for(;;) { >> interrupt_request = cpu->interrupt_request; >> if (unlikely(interrupt_request)) { >> +/* FIXME: this needs to take the iothread lock. >> + * For this we need to find all places in >> + * cc->cpu_exec_interrupt that can call cpu_loop_exit, >> + * and call qemu_unlock_iothread_mutex() there. Else, >> + * add a flag telling cpu_loop_exit() to unlock it. >> + */ >> + >> if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) { >> /* Mask out external interrupts for this step. */ >> interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK; >> @@ -501,6 +509,7 @@ int cpu_exec(CPUState *cpu) >> the program flow was changed */ >> next_tb = 0; >> } >> + >> } >> if (unlikely(cpu->exit_request >>|| replay_has_interrupt())) { >> @@ -618,6 +627,7 @@ int cpu_exec(CPUState *cpu) >> g_assert(env == &x86_cpu->env); >> #endif >> #endif /* buggy compiler */ >> + >> cpu->can_do_io = 1; >> tb_lock_reset(); >> } >> diff --git a/cpus.c b/cpus.c >> index a87fbf9..0e3d5f9 100644 >> --- a/cpus.c >> +++ b/cpus.c >> @@ -911,8 +911,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu) >> #endif /* _WIN32 */ >> >> static QemuMutex qemu_global_mutex; >> -static QemuCond qemu_io_proceeded_cond; >> -static unsigned iothread_requesting_mutex; >> >> static QemuThread io_thread; >> >> @@ -928,7 +926,6 @@ void qemu_init_cpu_loop(void) >> qemu_cond_init(&qemu_cpu_cond); >> qemu_cond_init(&qemu_pause_cond); >> qemu_cond_init(&qemu_work_cond); >> -qemu_cond_init(&qemu_io_proceeded_cond); >> qemu_mutex_init(&qemu_global_mutex); >> >> qemu_thread_get_self(&io_thread); >> @@ -1043,10 +1040,
Re: [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
Hi Alex, Thanks for having pulling all that together! About this patch the original author was Jan Kiszka (https://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03323.html) This has probably been dropped during a rebase. Thanks, Fred Le 18/03/2016 17:18, Alex Bennée a écrit : From: KONRAD Frederic This finally allows TCG to benefit from the iothread introduction: Drop the global mutex while running pure TCG CPU code. Reacquire the lock when entering MMIO or PIO emulation, or when leaving the TCG loop. We have to revert a few optimization for the current TCG threading model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not kicking it in qemu_cpu_kick. We also need to disable RAM block reordering until we have a more efficient locking mechanism at hand. Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. These numbers demonstrate where we gain something: 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm The guest CPU was fully loaded, but the iothread could still run mostly independent on a second core. Without the patch we don't get beyond 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm We don't benefit significantly, though, when the guest is not fully loading a host CPU. Message-Id: <1439220437-23957-10-git-send-email-fred.kon...@greensocs.com> Signed-off-by: KONRAD Frederic [AJB: -smp single-threaded fix, rm old info from commit msg] Signed-off-by: Alex Bennée --- v1 (ajb): - SMP failure now fixed by previous commit Changes from Fred Konrad (mttcg-v7 via paolo): * Rebase on the current HEAD. * Fixes a deadlock in qemu_devices_reset(). * Remove the mutex in address_space_* --- cpu-exec.c | 10 ++ cpus.c | 26 +++--- cputlb.c | 4 exec.c | 12 hw/i386/kvmvapic.c | 3 +++ softmmu_template.h | 17 + translate-all.c| 11 +-- 7 files changed, 58 insertions(+), 25 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 3572256..76891fd 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -28,6 +28,7 @@ #include "qemu/rcu.h" #include "exec/tb-hash.h" #include "exec/log.h" +#include "qemu/main-loop.h" #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY) #include "hw/i386/apic.h" #endif @@ -449,6 +450,13 @@ int cpu_exec(CPUState *cpu) for(;;) { interrupt_request = cpu->interrupt_request; if (unlikely(interrupt_request)) { +/* FIXME: this needs to take the iothread lock. + * For this we need to find all places in + * cc->cpu_exec_interrupt that can call cpu_loop_exit, + * and call qemu_unlock_iothread_mutex() there. Else, + * add a flag telling cpu_loop_exit() to unlock it. + */ + if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) { /* Mask out external interrupts for this step. */ interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK; @@ -501,6 +509,7 @@ int cpu_exec(CPUState *cpu) the program flow was changed */ next_tb = 0; } + } if (unlikely(cpu->exit_request || replay_has_interrupt())) { @@ -618,6 +627,7 @@ int cpu_exec(CPUState *cpu) g_assert(env == &x86_cpu->env); #endif #endif /* buggy compiler */ + cpu->can_do_io = 1; tb_lock_reset(); } diff --git a/cpus.c b/cpus.c index a87fbf9..0e3d5f9 100644 --- a/cpus.c +++ b/cpus.c @@ -911,8 +911,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu) #endif /* _WIN32 */ static QemuMutex qemu_global_mutex; -static QemuCond qemu_io_proceeded_cond; -static unsigned iothread_requesting_mutex; static QemuThread io_thread; @@ -928,7 +926,6 @@ void qemu_init_cpu_loop(void) qemu_cond_init(&qemu_cpu_cond); qemu_cond_init(&qemu_pause_cond); qemu_cond_init(&qemu_work_cond); -qemu_cond_init(&qemu_io_proceeded_cond); qemu_mutex_init(&qemu_global_mutex); qemu_thread_get_self(&io_thread); @@ -1043,10 +1040,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu) qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex); } -while (iothread_requesting_mutex) { -qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex); -} - CPU_FOREACH(cpu) { qemu_wait_io_event_common(cpu); } @@ -1314,22 +1307,7 @@ bool qemu_mutex_iothread_locked(void) void qemu_mutex_lock_iothread(void) { -atomic_inc(&iothread_requesting_mutex); -/* In the s
[Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
From: KONRAD Frederic This finally allows TCG to benefit from the iothread introduction: Drop the global mutex while running pure TCG CPU code. Reacquire the lock when entering MMIO or PIO emulation, or when leaving the TCG loop. We have to revert a few optimization for the current TCG threading model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not kicking it in qemu_cpu_kick. We also need to disable RAM block reordering until we have a more efficient locking mechanism at hand. Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. These numbers demonstrate where we gain something: 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm The guest CPU was fully loaded, but the iothread could still run mostly independent on a second core. Without the patch we don't get beyond 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm We don't benefit significantly, though, when the guest is not fully loading a host CPU. Message-Id: <1439220437-23957-10-git-send-email-fred.kon...@greensocs.com> Signed-off-by: KONRAD Frederic [AJB: -smp single-threaded fix, rm old info from commit msg] Signed-off-by: Alex Bennée --- v1 (ajb): - SMP failure now fixed by previous commit Changes from Fred Konrad (mttcg-v7 via paolo): * Rebase on the current HEAD. * Fixes a deadlock in qemu_devices_reset(). * Remove the mutex in address_space_* --- cpu-exec.c | 10 ++ cpus.c | 26 +++--- cputlb.c | 4 exec.c | 12 hw/i386/kvmvapic.c | 3 +++ softmmu_template.h | 17 + translate-all.c| 11 +-- 7 files changed, 58 insertions(+), 25 deletions(-) diff --git a/cpu-exec.c b/cpu-exec.c index 3572256..76891fd 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -28,6 +28,7 @@ #include "qemu/rcu.h" #include "exec/tb-hash.h" #include "exec/log.h" +#include "qemu/main-loop.h" #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY) #include "hw/i386/apic.h" #endif @@ -449,6 +450,13 @@ int cpu_exec(CPUState *cpu) for(;;) { interrupt_request = cpu->interrupt_request; if (unlikely(interrupt_request)) { +/* FIXME: this needs to take the iothread lock. + * For this we need to find all places in + * cc->cpu_exec_interrupt that can call cpu_loop_exit, + * and call qemu_unlock_iothread_mutex() there. Else, + * add a flag telling cpu_loop_exit() to unlock it. + */ + if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) { /* Mask out external interrupts for this step. */ interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK; @@ -501,6 +509,7 @@ int cpu_exec(CPUState *cpu) the program flow was changed */ next_tb = 0; } + } if (unlikely(cpu->exit_request || replay_has_interrupt())) { @@ -618,6 +627,7 @@ int cpu_exec(CPUState *cpu) g_assert(env == &x86_cpu->env); #endif #endif /* buggy compiler */ + cpu->can_do_io = 1; tb_lock_reset(); } diff --git a/cpus.c b/cpus.c index a87fbf9..0e3d5f9 100644 --- a/cpus.c +++ b/cpus.c @@ -911,8 +911,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu) #endif /* _WIN32 */ static QemuMutex qemu_global_mutex; -static QemuCond qemu_io_proceeded_cond; -static unsigned iothread_requesting_mutex; static QemuThread io_thread; @@ -928,7 +926,6 @@ void qemu_init_cpu_loop(void) qemu_cond_init(&qemu_cpu_cond); qemu_cond_init(&qemu_pause_cond); qemu_cond_init(&qemu_work_cond); -qemu_cond_init(&qemu_io_proceeded_cond); qemu_mutex_init(&qemu_global_mutex); qemu_thread_get_self(&io_thread); @@ -1043,10 +1040,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu) qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex); } -while (iothread_requesting_mutex) { -qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex); -} - CPU_FOREACH(cpu) { qemu_wait_io_event_common(cpu); } @@ -1314,22 +1307,7 @@ bool qemu_mutex_iothread_locked(void) void qemu_mutex_lock_iothread(void) { -atomic_inc(&iothread_requesting_mutex); -/* In the simple case there is no need to bump the VCPU thread out of - * TCG code execution. - */ -if (!tcg_enabled() || qemu_in_vcpu_thread() || -!first_cpu || !first_cpu->created) { -qemu_mutex_lock(&qemu_global_mutex); -atomic_dec(&iothread_requesting_mutex); -} else { -if (qemu_mutex_trylock(&qemu_global_mu
Re: [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
On 18/03/2016 17:18, Alex Bennée wrote: > From: KONRAD Frederic > > This finally allows TCG to benefit from the iothread introduction: Drop > the global mutex while running pure TCG CPU code. Reacquire the lock > when entering MMIO or PIO emulation, or when leaving the TCG loop. > > We have to revert a few optimization for the current TCG threading > model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not > kicking it in qemu_cpu_kick. We also need to disable RAM block > reordering until we have a more efficient locking mechanism at hand. > > Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. > These numbers demonstrate where we gain something: > > 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm > 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm > > The guest CPU was fully loaded, but the iothread could still run mostly > independent on a second core. Without the patch we don't get beyond > > 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm > 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm > > We don't benefit significantly, though, when the guest is not fully > loading a host CPU. > > Message-Id: <1439220437-23957-10-git-send-email-fred.kon...@greensocs.com> > Signed-off-by: KONRAD Frederic > [AJB: -smp single-threaded fix, rm old info from commit msg] > Signed-off-by: Alex Bennée You should probably squash this and patch 10 together. Paolo