Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)
What race are you thinking of in my patch? The obvious race I can think
of is benign:

Case 1:
  A: does TB flush
  B: read tb_flush_count
  A: increment tb_flush_count
  A: end_exclusive
  B: tb_lookup__cpu_state/tb_gen_code
  B: start_exclusive
  B: read tb_flush_count again (increment seen)
  B: retries

Case 2:
  B: read tb_flush_count
  A: does TB flush
  A: increment tb_flush_count
  A: end_exclusive
  B: tb_lookup__cpu_state/tb_gen_code
  B: start_exclusive
  B: read tb_flush_count again (increment seen)
  B: retries

Case 3:
  A: does TB flush
  A: increment tb_flush_count
  A: end_exclusive
  B: read tb_flush_count
  B: tb_lookup__cpu_state/tb_gen_code
  B: start_exclusive
  B: read tb_flush_count again (no increment seen)
  B: proceeds

Case 1 is the expected case. In case 2 we thought the TB was stale when
it wasn't, so we fetch it again with tb_lookup__cpu_state at minimal
extra overhead. Case 3 looks bad because we read tb_flush_count and find
it already incremented, yet proceed. But if that happens, thread A is
already at the end of do_tb_flush: the lookup tables have been cleared
and the TCG context has been reset, so it is safe for thread B to call
tb_lookup__cpu_state or tb_gen_code.

Yifan

On Fri, Feb 14, 2020 at 3:31 PM Richard Henderson wrote:
>
> On 2/14/20 6:49 AM, Alex Bennée wrote:
> > The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> > which is invalidated by a tb_flush before we execute it. This doesn't
> > affect the other cpu_exec modes, as a tb_flush by its nature can only
> > occur on a quiescent system. The race was described as:
> >
> >   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
> >   B3. tcg_tb_alloc obtains a new TB
> >
> >   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
> >       (same TB as B2)
> >
> >   A3. start_exclusive critical section entered
> >   A4. do_tb_flush is called, TB memory freed/re-allocated
> >   A5. end_exclusive exits critical section
> >
> >   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
> >   B3. tcg_tb_alloc reallocates TB from B2
> >
> >   C4. start_exclusive critical section entered
> >   C5. cpu_tb_exec executes the TB code that was freed in A4
> >
> > The simplest fix is to widen the exclusive period to include the TB
> > lookup. As a result we can drop the complication of checking we are in
> > the exclusive region before we end it.
>
> I'm not 100% keen on having the tb_gen_code within the exclusive region.
> It implies a much larger delay on (at least) the first execution of the
> atomic operation.
>
> But I suppose until recently we had a global lock around code generation,
> and this is only slightly worse. Plus, it has the advantage of being dead
> simple, and without the races vs tb_ctx.tb_flush_count that exist in
> Yifan's patch.
>
> Applied to tcg-next.
>
>
> r~
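To make the scheme Yifan describes concrete, here is a minimal C sketch of
the check/retry pattern -- an illustrative reconstruction, not the actual
patch. lookup_or_generate_tb() and execute_tb() are hypothetical stand-ins
for tb_lookup__cpu_state/tb_gen_code and cpu_tb_exec, and the counter is
modelled as a plain C11 atomic rather than QEMU's tb_ctx.tb_flush_count:

    #include <stdatomic.h>

    typedef struct TranslationBlock TranslationBlock;

    extern _Atomic unsigned tb_flush_count;        /* bumped by do_tb_flush */
    extern void start_exclusive(void);
    extern void end_exclusive(void);
    extern TranslationBlock *lookup_or_generate_tb(void); /* hypothetical */
    extern void execute_tb(TranslationBlock *tb);         /* hypothetical */

    static void step_atomic_with_retry(void)
    {
        for (;;) {
            /* Snapshot the flush generation before looking up the TB. */
            unsigned gen = atomic_load(&tb_flush_count);

            TranslationBlock *tb = lookup_or_generate_tb();

            start_exclusive();
            if (atomic_load(&tb_flush_count) != gen) {
                /* A flush ran in between (cases 1 and 2): the TB may
                 * have been freed, so drop it and retry. */
                end_exclusive();
                continue;
            }
            /* No flush since the snapshot -- or, as in case 3, the
             * flush had fully completed before we took it -- so the
             * TB is safe to execute. */
            execute_tb(tb);
            end_exclusive();
            return;
        }
    }

All three interleavings above map onto this loop: in cases 1 and 2 the
second read observes the increment and we retry; in case 3 the flush
completed before the first read, so any TB obtained afterwards already
postdates it.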
Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)
On 2/14/20 6:49 AM, Alex Bennée wrote:
> The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> which is invalidated by a tb_flush before we execute it. This doesn't
> affect the other cpu_exec modes, as a tb_flush by its nature can only
> occur on a quiescent system. The race was described as:
>
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc obtains a new TB
>
>   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
>       (same TB as B2)
>
>   A3. start_exclusive critical section entered
>   A4. do_tb_flush is called, TB memory freed/re-allocated
>   A5. end_exclusive exits critical section
>
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc reallocates TB from B2
>
>   C4. start_exclusive critical section entered
>   C5. cpu_tb_exec executes the TB code that was freed in A4
>
> The simplest fix is to widen the exclusive period to include the TB
> lookup. As a result we can drop the complication of checking we are in
> the exclusive region before we end it.

I'm not 100% keen on having the tb_gen_code within the exclusive region.
It implies a much larger delay on (at least) the first execution of the
atomic operation.

But I suppose until recently we had a global lock around code generation,
and this is only slightly worse. Plus, it has the advantage of being dead
simple, and without the races vs tb_ctx.tb_flush_count that exist in
Yifan's patch.

Applied to tcg-next.


r~
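For readers tracing the interleaving above against the source, the
pre-patch shape of cpu_exec_step_atomic is roughly the following -- a
simplified excerpt, with declarations and the longjmp cleanup path
elided, and the comment marking the racy window added for illustration:

    if (sigsetjmp(cpu->jmp_env, 0) == 0) {
        tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
        if (tb == NULL) {
            mmap_lock();
            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
            mmap_unlock();
        }

        /*
         * Window: between obtaining the TB above and start_exclusive()
         * below, another vCPU may enter its own exclusive section and
         * run do_tb_flush (steps A3-A5), freeing or recycling this TB.
         */
        start_exclusive();

        parallel_cpus = false;
        cpu_tb_exec(cpu, tb);   /* C5: may execute freed/reused memory */
    }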
Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)
On 14/02/20 15:49, Alex Bennée wrote:
> The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> which is invalidated by a tb_flush before we execute it. This doesn't
> affect the other cpu_exec modes, as a tb_flush by its nature can only
> occur on a quiescent system. The race was described as:
>
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc obtains a new TB
>
>   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
>       (same TB as B2)
>
>   A3. start_exclusive critical section entered
>   A4. do_tb_flush is called, TB memory freed/re-allocated
>   A5. end_exclusive exits critical section
>
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc reallocates TB from B2
>
>   C4. start_exclusive critical section entered
>   C5. cpu_tb_exec executes the TB code that was freed in A4
>
> The simplest fix is to widen the exclusive period to include the TB
> lookup. As a result we can drop the complication of checking we are in
> the exclusive region before we end it.

Reviewed-by: Paolo Bonzini

> Signed-off-by: Alex Bennée
> Cc: Yifan
> Cc: Bug 1863025 <1863...@bugs.launchpad.net>
> ---
>  accel/tcg/cpu-exec.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 2560c90eec7..d95c4848a47 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -240,6 +240,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
>      uint32_t cf_mask = cflags & CF_HASH_MASK;
>  
>      if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> +        start_exclusive();
> +
>          tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
>          if (tb == NULL) {
>              mmap_lock();
> @@ -247,8 +249,6 @@ void cpu_exec_step_atomic(CPUState *cpu)
>              mmap_unlock();
>          }
>  
> -        start_exclusive();
> -
>          /* Since we got here, we know that parallel_cpus must be true. */
>          parallel_cpus = false;
>          cc->cpu_exec_enter(cpu);
> @@ -271,14 +271,15 @@ void cpu_exec_step_atomic(CPUState *cpu)
>              qemu_plugin_disable_mem_helpers(cpu);
>          }
>  
> -    if (cpu_in_exclusive_context(cpu)) {
> -        /* We might longjump out of either the codegen or the
> -         * execution, so must make sure we only end the exclusive
> -         * region if we started it.
> -         */
> -        parallel_cpus = true;
> -        end_exclusive();
> -    }
> +
> +    /*
> +     * As we start the exclusive region before codegen we must still
> +     * be in the region if we longjump out of either the codegen or
> +     * the execution.
> +     */
> +    g_assert(cpu_in_exclusive_context(cpu));
> +    parallel_cpus = true;
> +    end_exclusive();
>  }
>  
>  struct tb_desc {
>
[PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)
The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
which is invalidated by a tb_flush before we execute it. This doesn't
affect the other cpu_exec modes, as a tb_flush by its nature can only
occur on a quiescent system. The race was described as:

  B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
  B3. tcg_tb_alloc obtains a new TB

  C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
      (same TB as B2)

  A3. start_exclusive critical section entered
  A4. do_tb_flush is called, TB memory freed/re-allocated
  A5. end_exclusive exits critical section

  B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
  B3. tcg_tb_alloc reallocates TB from B2

  C4. start_exclusive critical section entered
  C5. cpu_tb_exec executes the TB code that was freed in A4

The simplest fix is to widen the exclusive period to include the TB
lookup. As a result we can drop the complication of checking we are in
the exclusive region before we end it.

Signed-off-by: Alex Bennée
Cc: Yifan
Cc: Bug 1863025 <1863...@bugs.launchpad.net>
---
 accel/tcg/cpu-exec.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2560c90eec7..d95c4848a47 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -240,6 +240,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
     uint32_t cf_mask = cflags & CF_HASH_MASK;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
+        start_exclusive();
+
         tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
         if (tb == NULL) {
             mmap_lock();
@@ -247,8 +249,6 @@ void cpu_exec_step_atomic(CPUState *cpu)
             mmap_unlock();
         }
 
-        start_exclusive();
-
         /* Since we got here, we know that parallel_cpus must be true. */
         parallel_cpus = false;
         cc->cpu_exec_enter(cpu);
@@ -271,14 +271,15 @@ void cpu_exec_step_atomic(CPUState *cpu)
             qemu_plugin_disable_mem_helpers(cpu);
         }
 
-    if (cpu_in_exclusive_context(cpu)) {
-        /* We might longjump out of either the codegen or the
-         * execution, so must make sure we only end the exclusive
-         * region if we started it.
-         */
-        parallel_cpus = true;
-        end_exclusive();
-    }
+
+    /*
+     * As we start the exclusive region before codegen we must still
+     * be in the region if we longjump out of either the codegen or
+     * the execution.
+     */
+    g_assert(cpu_in_exclusive_context(cpu));
+    parallel_cpus = true;
+    end_exclusive();
 }
 
 struct tb_desc {
-- 
2.20.1
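Putting the hunks together, the post-patch control flow of
cpu_exec_step_atomic condenses to roughly the following -- a simplified
rendering of the patched function, with declarations and the longjmp
cleanup details elided and the comments adapted for this summary:

    if (sigsetjmp(cpu->jmp_env, 0) == 0) {
        start_exclusive();         /* now covers lookup and codegen too */

        tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
        if (tb == NULL) {
            mmap_lock();
            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
            mmap_unlock();
        }

        parallel_cpus = false;
        cpu_tb_exec(cpu, tb);      /* no tb_flush can intervene: we hold
                                      the exclusive region throughout */
    } else {
        /* longjmp from codegen or execution; cleanup elided */
    }

    /*
     * Whether we ran to completion or longjmp'd out, the exclusive
     * region is still held, so it can be ended unconditionally.
     */
    g_assert(cpu_in_exclusive_context(cpu));
    parallel_cpus = true;
    end_exclusive();

The cost Richard notes is visible here: any code generation triggered by
the first execution of an atomic operation now happens while every other
vCPU is parked, which the thread judged an acceptable price for a
dead-simple fix.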