Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)

2020-02-14 Thread Yifan Lu
What race are you thinking of in my patch? The obvious race I can
think of is benign:

Case 1:
A: does TB flush
B: read tb_flush_count
A: increment tb_flush_count
A: end_exclusive
B: tb_lookup__cpu_state/tb_gen_code
B: start_exclusive
B: read tb_flush_count again (increment seen)
B: retries

Case 2:
B: read tb_flush_count
A: does TB flush
A: increment tb_flush_count
A: end_exclusive
B: tb_lookup__cpu_state/tb_gen_code
B: start_exclusive
B: read tb_flush_count again (increment seen)
B: retries

Case 3:
A: does TB flush
A: increment tb_flush_count
A: end_exclusive
B: read tb_flush_count
B: tb_lookup__cpu_state/tb_gen_code
B: start_exclusive
B: read tb_flush_count again (no increment seen)
B: proceeds

Case 1 is the expected case. In Case 2 we thought the TB was stale
when it wasn't, so we simply fetch it again with tb_lookup__cpu_state
at the cost of minimal extra overhead.

Case 3 looks bad because B's first read of tb_flush_count already
sees the increment, so the re-read under start_exclusive notices no
change and B never retries. But for the increment to be visible,
thread A must be at the end of do_tb_flush, by which point the lookup
tables have already been cleared and the TCG context has already been
reset. So it should be safe for thread B to call tb_lookup__cpu_state
or tb_gen_code. (The retry pattern under discussion is sketched
below.)
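
For concreteness, the pattern being discussed is roughly the
following. This is only an illustrative sketch, not the patch itself;
it assumes the 2020-era QEMU names tb_ctx.tb_flush_count,
atomic_read(), tb_lookup__cpu_state() and tb_gen_code():

    /*
     * Sample the flush count, do the lookup/codegen outside the
     * exclusive region, then re-check the count once exclusive.
     * A changed count means a tb_flush raced with us, so retry.
     */
    for (;;) {
        unsigned flush_count = atomic_read(&tb_ctx.tb_flush_count);

        tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
        if (tb == NULL) {
            mmap_lock();
            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
            mmap_unlock();
        }

        start_exclusive();
        if (atomic_read(&tb_ctx.tb_flush_count) == flush_count) {
            break;          /* no flush intervened; tb is valid */
        }
        end_exclusive();    /* a flush raced us: discard tb, retry */
    }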

Yifan

On Fri, Feb 14, 2020 at 3:31 PM Richard Henderson wrote:
>
> On 2/14/20 6:49 AM, Alex Bennée wrote:
> > The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> > which is invalidated by a tb_flush before we execute it. This doesn't
> > affect the other cpu_exec modes as a tb_flush by its nature can only
> > occur on a quiescent system. The race was described as:
> >
> >   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
> >   B3. tcg_tb_alloc obtains a new TB
> >
> >   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
> >   (same TB as B2)
> >
> >   A3. start_exclusive critical section entered
> >   A4. do_tb_flush is called, TB memory freed/re-allocated
> >   A5. end_exclusive exits critical section
> >
> >   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
> >   B3. tcg_tb_alloc reallocates TB from B2
> >
> >   C4. start_exclusive critical section entered
> >   C5. cpu_tb_exec executes the TB code that was free in A4
> >
> > The simplest fix is to widen the exclusive period to include the TB
> > lookup. As a result we can drop the complication of checking we are in
> > the exclusive region before we end it.
>
> I'm not 100% keen on having the tb_gen_code within the exclusive region.  It
> implies a much larger delay on (at least) the first execution of the atomic
> operation.
>
> But I suppose until recently we had a global lock around code generation, and
> this is only slightly worse.  Plus, it has the advantage of being dead simple,
> and without the races vs tb_ctx.tb_flush_count that exist in Yifan's patch.
>
> Applied to tcg-next.
>
>
> r~



Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)

2020-02-14 Thread Richard Henderson
On 2/14/20 6:49 AM, Alex Bennée wrote:
> The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> which is invalidated by a tb_flush before we execute it. This doesn't
> affect the other cpu_exec modes as a tb_flush by its nature can only
> occur on a quiescent system. The race was described as:
> 
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc obtains a new TB
> 
>   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
>   (same TB as B2)
> 
>   A3. start_exclusive critical section entered
>   A4. do_tb_flush is called, TB memory freed/re-allocated
>   A5. end_exclusive exits critical section
> 
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc reallocates TB from B2
> 
>   C4. start_exclusive critical section entered
>   C5. cpu_tb_exec executes the TB code that was free in A4
> 
> The simplest fix is to widen the exclusive period to include the TB
> lookup. As a result we can drop the complication of checking we are in
> the exclusive region before we end it.

I'm not 100% keen on having the tb_gen_code within the exclusive region.  It
implies a much larger delay on (at least) the first execution of the atomic
operation.
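
To make the cost concrete: start_exclusive() cannot return until
every other vCPU has drained out of its run loop, so codegen done
inside the region stalls the whole machine for its duration. Very
roughly, and heavily simplified from cpus-common.c (nesting,
has_waiter tracking and the memory barriers are all elided):

    /*
     * pending_cpus, exclusive_cond and qemu_cpu_list_lock are the
     * module-level synchronization objects in cpus-common.c.
     */
    void start_exclusive(void)
    {
        CPUState *other_cpu;
        int running = 0;

        qemu_mutex_lock(&qemu_cpu_list_lock);

        /* Ask every currently-running vCPU to leave its run loop. */
        CPU_FOREACH(other_cpu) {
            if (atomic_read(&other_cpu->running)) {
                running++;
                qemu_cpu_kick(other_cpu);
            }
        }

        /* Wait until they have all stopped; each vCPU signals
         * exclusive_cond from cpu_exec_end as it drains out. */
        atomic_set(&pending_cpus, running + 1);
        while (atomic_read(&pending_cpus) > 1) {
            qemu_cond_wait(&exclusive_cond, &qemu_cpu_list_lock);
        }

        qemu_mutex_unlock(&qemu_cpu_list_lock);
    }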

But I suppose until recently we had a global lock around code generation, and
this is only slightly worse.  Plus, it has the advantage of being dead simple,
and without the races vs tb_ctx.tb_flush_count that exist in Yifan's patch.

Applied to tcg-next.


r~



Re: [PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)

2020-02-14 Thread Paolo Bonzini
On 14/02/20 15:49, Alex Bennée wrote:
> The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
> which is invalidated by a tb_flush before we execute it. This doesn't
> affect the other cpu_exec modes as a tb_flush by its nature can only
> occur on a quiescent system. The race was described as:
> 
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc obtains a new TB
> 
>   C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
>   (same TB as B2)
> 
>   A3. start_exclusive critical section entered
>   A4. do_tb_flush is called, TB memory freed/re-allocated
>   A5. end_exclusive exits critical section
> 
>   B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
>   B3. tcg_tb_alloc reallocates TB from B2
> 
>   C4. start_exclusive critical section entered
>   C5. cpu_tb_exec executes the TB code that was free in A4
> 
> The simplest fix is to widen the exclusive period to include the TB
> lookup. As a result we can drop the complication of checking we are in
> the exclusive region before we end it.

Reviewed-by: Paolo Bonzini 

> Signed-off-by: Alex Bennée 
> Cc: Yifan 
> Cc: Bug 1863025 <1863...@bugs.launchpad.net>
> ---
>  accel/tcg/cpu-exec.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 2560c90eec7..d95c4848a47 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -240,6 +240,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
>      uint32_t cf_mask = cflags & CF_HASH_MASK;
>  
>      if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> +        start_exclusive();
> +
>          tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
>          if (tb == NULL) {
>              mmap_lock();
> @@ -247,8 +249,6 @@ void cpu_exec_step_atomic(CPUState *cpu)
>              mmap_unlock();
>          }
>  
> -        start_exclusive();
> -
>          /* Since we got here, we know that parallel_cpus must be true.  */
>          parallel_cpus = false;
>          cc->cpu_exec_enter(cpu);
> @@ -271,14 +271,15 @@ void cpu_exec_step_atomic(CPUState *cpu)
>          qemu_plugin_disable_mem_helpers(cpu);
>      }
>  
> -    if (cpu_in_exclusive_context(cpu)) {
> -        /* We might longjump out of either the codegen or the
> -         * execution, so must make sure we only end the exclusive
> -         * region if we started it.
> -         */
> -        parallel_cpus = true;
> -        end_exclusive();
> -    }
> +
> +    /*
> +     * As we start the exclusive region before codegen we must still
> +     * be in the region if we longjump out of either the codegen or
> +     * the execution.
> +     */
> +    g_assert(cpu_in_exclusive_context(cpu));
> +    parallel_cpus = true;
> +    end_exclusive();
>  }
>  
>  struct tb_desc {
> 




[PATCH] accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025)

2020-02-14 Thread Alex Bennée
The bug describes a race whereby cpu_exec_step_atomic can acquire a TB
which is invalidated by a tb_flush before we execute it. This doesn't
affect the other cpu_exec modes as a tb_flush by its nature can only
occur on a quiescent system. The race was described as:

  B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
  B3. tcg_tb_alloc obtains a new TB

  C3. TB obtained with tb_lookup__cpu_state or tb_gen_code
  (same TB as B2)

  A3. start_exclusive critical section entered
  A4. do_tb_flush is called, TB memory freed/re-allocated
  A5. end_exclusive exits critical section

  B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code
  B3. tcg_tb_alloc reallocates TB from B2

  C4. start_exclusive critical section entered
  C5. cpu_tb_exec executes the TB code that was free in A4

The simplest fix is to widen the exclusive period to include the TB
lookup. As a result we can drop the complication of checking we are in
the exclusive region before we end it.
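
Condensed, the flow of cpu_exec_step_atomic() after this patch is as
below (plugin hooks, the actual cpu_tb_exec() call and the chain
counters are elided; the diff that follows is the exact change):

    if (sigsetjmp(cpu->jmp_env, 0) == 0) {
        start_exclusive();              /* now also covers the lookup */

        tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
        if (tb == NULL) {
            mmap_lock();
            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
            mmap_unlock();
        }

        /* Since we got here, we know that parallel_cpus must be true. */
        parallel_cpus = false;
        /* ... cc->cpu_exec_enter(), cpu_tb_exec(), cc->cpu_exec_exit() ... */
    }

    /*
     * Whether we fall through or longjump out of the codegen or the
     * execution, we are still inside the exclusive region.
     */
    g_assert(cpu_in_exclusive_context(cpu));
    parallel_cpus = true;
    end_exclusive();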

Signed-off-by: Alex Bennée 
Cc: Yifan 
Cc: Bug 1863025 <1863...@bugs.launchpad.net>
---
 accel/tcg/cpu-exec.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2560c90eec7..d95c4848a47 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -240,6 +240,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
     uint32_t cf_mask = cflags & CF_HASH_MASK;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
+        start_exclusive();
+
         tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
         if (tb == NULL) {
             mmap_lock();
@@ -247,8 +249,6 @@ void cpu_exec_step_atomic(CPUState *cpu)
             mmap_unlock();
         }
 
-        start_exclusive();
-
         /* Since we got here, we know that parallel_cpus must be true.  */
         parallel_cpus = false;
         cc->cpu_exec_enter(cpu);
@@ -271,14 +271,15 @@ void cpu_exec_step_atomic(CPUState *cpu)
         qemu_plugin_disable_mem_helpers(cpu);
     }
 
-    if (cpu_in_exclusive_context(cpu)) {
-        /* We might longjump out of either the codegen or the
-         * execution, so must make sure we only end the exclusive
-         * region if we started it.
-         */
-        parallel_cpus = true;
-        end_exclusive();
-    }
+
+    /*
+     * As we start the exclusive region before codegen we must still
+     * be in the region if we longjump out of either the codegen or
+     * the execution.
+     */
+    g_assert(cpu_in_exclusive_context(cpu));
+    parallel_cpus = true;
+    end_exclusive();
 }
 
 struct tb_desc {
-- 
2.20.1