On Tue, 24 Sep 2019 11:29:25 PDT (-0700), alistai...@gmail.com wrote:
On Mon, Jun 24, 2019 at 11:21 AM Joel Sing <j...@sing.id.au> wrote:
On 19-06-17 16:52:44, Richard Henderson wrote:
> On 6/16/19 12:19 PM, Joel Sing wrote:
> > + /*
> > + * Clear the load reservation, since an SC must fail if there is
> > + * an SC to any address in between an LR and SC pair.
> > + */
> > + tcg_gen_movi_tl(load_res, 0);
> > +
> > gen_set_label(l2);
>
> This clear needs to be moved down below label l2.
> Otherwise, with lr / sc / sc, the second sc could succeed in error.
Indeed, thanks.
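To make the failure mode concrete: with lr / sc / sc to the same address,
the first SC must both succeed and consume the reservation, otherwise the
second SC can succeed as well. A guest-level sketch (registers illustrative):

    lr.w   t0, (a0)        # places a reservation on (a0)
    sc.w   t1, t2, (a0)    # may succeed; must also consume the reservation
    sc.w   t3, t4, (a0)    # must fail, since no reservation remains

With the clear above the l2 label, the successful-SC path branches past it,
so load_res stays set and the second SC could succeed.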
> FWIW, other targets have used -1 as the "invalid" load reservation, since the
> architecture does not require address 0 to be unmapped. This should be quite
> visible in M-mode with paging disabled and ram at offset 0. Often, other
> targets require alignment for the lr/sc address, though I don't see that for
> riscv.
I've switched to -1 as suggested. Regarding alignment for reservations, the
specification does require it, although I do not recall seeing qemu itself
enforce it.
New diff follows.
From 8ef31a2ce8ef1cbeee92995a0b2994f480e9bb6d Mon Sep 17 00:00:00 2001
From: Joel Sing <j...@sing.id.au>
Date: Tue, 25 Jun 2019 02:44:24 +1000
Subject: [PATCH] Clear load reservations on qemu riscv target
This prevents a load reservation from being placed in one context/process,
then being used in another, resulting in an SC succeeding incorrectly and
breaking atomics.
Signed-off-by: Joel Sing <j...@sing.id.au>
---
target/riscv/cpu.c | 1 +
target/riscv/cpu_helper.c | 9 +++++++++
target/riscv/insn_trans/trans_rva.inc.c | 8 +++++++-
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d61bce6d55..e7c8bf48fc 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -281,6 +281,7 @@ static void riscv_cpu_reset(CPUState *cs)
env->pc = env->resetvec;
#endif
cs->exception_index = EXCP_NONE;
+ env->load_res = -1;
set_default_nan_mode(1, &env->fp_status);
}
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index b17f169681..6a07b12e65 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -113,6 +113,15 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv)
}
/* tlb_flush is unnecessary as mode is contained in mmu_idx */
env->priv = newpriv;
+
+ /* Clear the load reservation - otherwise a reservation placed in one
+ * context/process can be used by another, resulting in an SC succeeding
+ * incorrectly. Version 2.2 of the ISA specification explicitly requires
+ * this behaviour, while later revisions say that the kernel "should" use
+ * an SC instruction to force the yielding of a load reservation on a
+ * preemptive context switch. As a result, do both.
+ */
+ env->load_res = -1;
}
/* get_physical_address - get the physical address for this virtual address
diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
index f6dbbc065e..fadd88849e 100644
--- a/target/riscv/insn_trans/trans_rva.inc.c
+++ b/target/riscv/insn_trans/trans_rva.inc.c
@@ -61,7 +61,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
gen_set_label(l1);
/*
- * Address comparion failure. However, we still need to
+ * Address comparison failure. However, we still need to
* provide the memory barrier implied by AQ/RL.
*/
tcg_gen_mb(TCG_MO_ALL + a->aq * TCG_BAR_LDAQ + a->rl * TCG_BAR_STRL);
@@ -69,6 +69,12 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, TCGMemOp mop)
gen_set_gpr(a->rd, dat);
gen_set_label(l2);
+ /*
+ * Clear the load reservation, since an SC must fail if there is
+ * an SC to any address in between an LR and SC pair.
+ */
+ tcg_gen_movi_tl(load_res, -1);
+
tcg_temp_free(dat);
tcg_temp_free(src1);
tcg_temp_free(src2);
--
2.21.0
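As an aside on the "SC instruction to force the yielding of a load
reservation" wording in the comment above: on the guest side that amounts
to a dummy SC in the context-switch path, roughly as follows (a sketch
only; the scratch location and registers are illustrative, not taken from
any particular kernel):

    # Yield any outstanding reservation on a preemptive context switch.
    la     t0, scratch        # scratch: illustrative per-cpu dummy word
    sc.w   zero, zero, (t0)   # status discarded via rd = zero; any
                              # reservation is cleared whether or not
                              # the SC itself succeeds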
This patch causes boot failures when booting systemd built with musl on RV64.
It could possibly be a musl bug, but I wanted to throw that out here
first to see what people think.
Looking at the musl port, I see at least one bug in their atomics jumping out
at me:
diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
index c9765342..41ad4d04 100644
--- a/arch/riscv64/atomic_arch.h
+++ b/arch/riscv64/atomic_arch.h
@@ -14,7 +14,7 @@ static inline int a_cas(volatile int *p, int t, int s)
" sc.w.aqrl %1, %4, (%2)\n"
" bnez %1, 1b\n"
"1:"
- : "=&r"(old), "=r"(tmp)
+ : "=&r"(old), "=&r"(tmp)
: "r"(p), "r"(t), "r"(s)
: "memory");
return old;
@@ -31,7 +31,7 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
" sc.d.aqrl %1, %4, (%2)\n"
" bnez %1, 1b\n"
"1:"
- : "=&r"(old), "=r"(tmp)
+ : "=&r"(old), "=&r"(tmp)
: "r"(p), "r"(t), "r"(s)
: "memory");
return old;
It's a shot in the dark as to whether that'll fix your bug, but I can at
least see a mechanism for it: before we yielded load reservations on context
switches, that backwards branch was never taken, so we wouldn't notice if
tmp was allocated into the same register as one of the inputs.
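Concretely, without the earlyclobber the compiler may assign tmp (%1) to
the same register as one of the inputs, since it assumes a plain output is
written only after all inputs have been consumed. A hypothetical allocation
with tmp and s (%4) both in a3 would expand to something like this (the
lr/bne lines are the loop head implied by the "1b" branch):

    1:  lr.w.aqrl  a0, (a1)      # old = *p
        bne        a0, a2, 1f    # old != t -> done
        sc.w.aqrl  a3, a3, (a1)  # SC status overwrites a3, clobbering s
        bnez       a3, 1b        # a retry would store the stale status, not s
    1:

The first pass through the loop is fine; only a retry stores the wrong
value, which matches the branch never having been taken before this change.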
Even if it doesn't fix your issue it's still a bug, so I'll send the patch
out; just LMK so I can indicate how important the issue is.
This should manifest on hardware, but it looks like we managed to drop that SC
patch. I'll go send the patch out now...
Alistair