On Thu, Jun 2, 2016 at 5:18 PM, Richard Henderson <r...@twiddle.net> wrote:
>
> Hum.  That does seem helpful-ish.  But I'm not certain how helpful it is to
> complicate the helper functions even further.
>
> What if we have tcg_canonicalize_memop (or some such) split off the barriers
> into separate opcodes.  E.g.
>
> MO_BAR_LD_B   = 32    // prevent earlier loads from crossing current op
> MO_BAR_ST_B   = 64    // prevent earlier stores from crossing current op
> MO_BAR_LD_A   = 128   // prevent later loads from crossing current op
> MO_BAR_ST_A   = 256   // prevent later stores from crossing current op
> MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
> MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
> MO_BAR_MASK   = MO_BAR_LDST_B | MO_BAR_LDST_A
>
> // Match Sparc MEMBAR as the most flexible host.
> TCG_BAR_LD_LD = 1     // #LoadLoad barrier
> TCG_BAR_ST_LD = 2     // #StoreLoad barrier
> TCG_BAR_LD_ST = 4     // #LoadStore barrier
> TCG_BAR_ST_ST = 8     // #StoreStore barrier
> TCG_BAR_SYNC  = 64    // SEQ_CST barrier
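
To check my reading of the flags quoted above, here is a rough sketch of
how the "before" and "after" bits on a memop could translate into barrier
arguments. The flag values are copied from the proposal; bar_before() and
bar_after() are hypothetical names for illustration only and are not
existing TCG functions.

```c
#include <stdbool.h>

/* Frontend flags, values as in the proposal quoted above. */
enum {
    MO_BAR_LD_B = 32,   /* earlier loads may not cross the current op */
    MO_BAR_ST_B = 64,   /* earlier stores may not cross the current op */
    MO_BAR_LD_A = 128,  /* later loads may not cross the current op */
    MO_BAR_ST_A = 256,  /* later stores may not cross the current op */
};

/* Backend barrier bits, values as in the proposal quoted above. */
enum {
    TCG_BAR_LD_LD = 1,
    TCG_BAR_ST_LD = 2,
    TCG_BAR_LD_ST = 4,
    TCG_BAR_ST_ST = 8,
};

/* Hypothetical helper: barrier to emit before the memory op.
 * The "earlier" access is named by the flag, the "later" access
 * is the current op itself (load or store). */
static int bar_before(int memop, bool is_store)
{
    int bar = 0;
    if (memop & MO_BAR_LD_B) {
        bar |= is_store ? TCG_BAR_LD_ST : TCG_BAR_LD_LD;
    }
    if (memop & MO_BAR_ST_B) {
        bar |= is_store ? TCG_BAR_ST_ST : TCG_BAR_ST_LD;
    }
    return bar;
}

/* Hypothetical helper: barrier to emit after the memory op.
 * Here the current op is the "earlier" access and the flag names
 * the "later" one. */
static int bar_after(int memop, bool is_store)
{
    int bar = 0;
    if (memop & MO_BAR_LD_A) {
        bar |= is_store ? TCG_BAR_ST_LD : TCG_BAR_LD_LD;
    }
    if (memop & MO_BAR_ST_A) {
        bar |= is_store ? TCG_BAR_ST_ST : TCG_BAR_LD_ST;
    }
    return bar;
}
```

A return value of 0 would mean no barrier op needs to be emitted on that
side of the access.
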

I really like this format. I would also like to add to the frontend:

MO_BAR_ACQUIRE
MO_BAR_RELEASE

and the following to the backend:

TCG_BAR_ACQUIRE
TCG_BAR_RELEASE

since these are one-way barriers and the previous barrier types do not
cover them.

>
> where
>
>   tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_BEFORE | MO_BAR_ST_AFTER)
>
> emits
>
>   mb            TCG_BAR_LD_LD
>   qemu_ld_i32   x, y, i, m
>   mb            TCG_BAR_LD_ST
>
> We can then add an optimization pass which folds barriers with no memory
> operations in between, so that duplicates are eliminated.
>

Yes, folding/eliding these barriers in an optimization pass sounds like a
good idea.

Thanks,
--
Pranith
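
P.S. A minimal sketch of what such a folding pass might look like, on a
toy op list rather than the real TCG op stream. Everything here
(struct op, enum op_kind, fold_barriers) is made up for illustration. It
assumes a single straight-line block and that OR-ing two barrier
arguments yields a barrier at least as strong as either one.

```c
#include <stddef.h>

/* Toy op kinds; not the real TCG opcodes. */
enum op_kind { OP_MB, OP_LD, OP_ST, OP_OTHER };

struct op {
    enum op_kind kind;
    int arg;            /* barrier bits when kind == OP_MB */
};

/* Fold barriers that have no memory operation between them.
 * Works in place on a straight-line list and returns the new length.
 * A later barrier is merged into the earlier one by OR-ing its bits. */
static size_t fold_barriers(struct op *ops, size_t n)
{
    size_t out = 0;
    long pending = -1;  /* output index of the last unfenced barrier */

    for (size_t i = 0; i < n; i++) {
        if (ops[i].kind == OP_MB) {
            if (pending >= 0) {
                /* No load/store since the previous barrier: merge. */
                ops[pending].arg |= ops[i].arg;
                continue;
            }
            pending = (long)out;
        } else if (ops[i].kind == OP_LD || ops[i].kind == OP_ST) {
            /* A memory op separates barriers; stop merging. */
            pending = -1;
        }
        ops[out++] = ops[i];
    }
    return out;
}
```

A real pass would also have to stop merging at labels and branches, which
this sketch ignores.
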