Pranith Kumar <bobby.pr...@gmail.com> writes:

> This commit introduces the TCGOpcode for memory barrier instruction.
>
> This opcode takes an argument which is the type of memory barrier
> which should be generated.
>
> Signed-off-by: Pranith Kumar <bobby.pr...@gmail.com>
> Signed-off-by: Richard Henderson <r...@twiddle.net>
> ---
>  tcg/README    | 17 +++++++++++++++++
>  tcg/tcg-op.c  | 11 +++++++++++
>  tcg/tcg-op.h  |  2 ++
>  tcg/tcg-opc.h |  2 ++
>  tcg/tcg.h     | 14 ++++++++++++++
>  5 files changed, 46 insertions(+)
>
> diff --git a/tcg/README b/tcg/README
> index ce8beba..1d48aa9 100644
> --- a/tcg/README
> +++ b/tcg/README
> @@ -402,6 +402,23 @@ double-word product T0.  The later is returned in two 
> single-word outputs.
>
>  Similar to mulu2, except the two inputs T1 and T2 are signed.
>
> +********* Memory Barrier support
> +
> +* mb <$arg>
> +
> +Generate a target memory barrier instruction to ensure memory ordering as 
> being
> +enforced by a corresponding guest memory barrier instruction. The ordering
> +enforced by the backend may be stricter than the ordering required by the 
> guest.
> +It cannot be weaker. This opcode takes a constant argument which is required 
> to
> +generate the appropriate barrier instruction. The backend should take care to
> +emit the target barrier instruction only when necessary i.e., for SMP guests 
> and
> +when MTTCG is enabled.
> +
> +The guest translators should generate this opcode for all guest instructions
> +which have ordering side effects.
> +
> +Please see docs/atomics.txt for more information on memory barriers.
> +
>  ********* 64-bit guest on 32-bit host support
>
>  The following opcodes are internal to TCG.  Thus they are to be implemented 
> by
> diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
> index 54c0277..08f7858 100644
> --- a/tcg/tcg-op.c
> +++ b/tcg/tcg-op.c
> @@ -39,6 +39,8 @@ extern TCGv_i32 TCGV_HIGH_link_error(TCGv_i64);
>  #define TCGV_HIGH TCGV_HIGH_link_error
>  #endif
>
> +extern int smp_cpus;
> +
>  /* Note that this is optimized for sequential allocation during translate.
>     Up to and including filling in the forward link immediately.  We'll do
>     proper termination of the end of the list after we finish translation.  */
> @@ -146,6 +148,15 @@ void tcg_gen_op6(TCGContext *ctx, TCGOpcode opc, TCGArg 
> a1, TCGArg a2,
>      tcg_emit_op(ctx, opc, pi);
>  }
>
> +void tcg_gen_mb(TCGArg mb_type)
> +{
> +#ifndef CONFIG_USER_ONLY
> +    if (qemu_tcg_mttcg_enabled() && smp_cpus > 1) {
> +        tcg_gen_op1(&tcg_ctx, INDEX_op_mb, mb_type);
> +    }
> +#endif
> +}
> +

This introduces a dependency on MTTCG which maybe is best handled in
that patch series. I get the feeling we might be able to merge this
series before MTTCG as barriers are still required for user-mode. Maybe
something like this makes more sense to keep the patch series
independent for now:

    void tcg_gen_mb(TCGArg mb_type)
    {
    #ifdef CONFIG_USER_ONLY
        const bool emit_barriers = true;
    #else
        /* TODO: When MTTCG is available for system mode
         * we will enable when qemu_tcg_mttcg_enabled() && smp_cpus > 1
         */
        bool emit_barriers = false;
    #endif
        if (emit_barriers) {
            tcg_gen_op1(&tcg_ctx, INDEX_op_mb, mb_type);
        }
    }

Maybe it is time to look at the user-mode memory model testers like:

  http://diy.inria.fr/

Rich,

What do you think? Do you think the memory barrier stuff should be kept
independent so it can be merged once ready?

>  /* 32 bit ops */
>
>  void tcg_gen_addi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
> diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
> index f217e80..41890cc 100644
> --- a/tcg/tcg-op.h
> +++ b/tcg/tcg-op.h
> @@ -261,6 +261,8 @@ static inline void tcg_gen_br(TCGLabel *l)
>      tcg_gen_op1(&tcg_ctx, INDEX_op_br, label_arg(l));
>  }
>
> +void tcg_gen_mb(TCGArg a);
> +
>  /* Helper calls. */
>
>  /* 32 bit ops */
> diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
> index 6d0410c..45528d2 100644
> --- a/tcg/tcg-opc.h
> +++ b/tcg/tcg-opc.h
> @@ -42,6 +42,8 @@ DEF(br, 0, 0, 1, TCG_OPF_BB_END)
>  # define IMPL64  TCG_OPF_64BIT
>  #endif
>
> +DEF(mb, 0, 0, 1, 0)
> +
>  DEF(mov_i32, 1, 1, 0, TCG_OPF_NOT_PRESENT)
>  DEF(movi_i32, 1, 0, 1, TCG_OPF_NOT_PRESENT)
>  DEF(setcond_i32, 1, 2, 1, 0)
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index db6a062..36feca9 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL 
> GET_TCGV_PTR(TCGv_ptr t)
>  #define TCG_CALL_DUMMY_TCGV     MAKE_TCGV_I32(-1)
>  #define TCG_CALL_DUMMY_ARG      ((TCGArg)(-1))
>
> +typedef enum {
> +    TCG_MO_LD_LD    = 1,
> +    TCG_MO_ST_LD    = 2,
> +    TCG_MO_LD_ST    = 4,
> +    TCG_MO_ST_ST    = 8,
> +    TCG_MO_ALL      = 0xF, // OR of all above
> +} TCGOrder;
> +
> +typedef enum {
> +    TCG_BAR_ACQ     = 32,
> +    TCG_BAR_REL     = 64,
> +    TCG_BAR_SC      = 128,
> +} TCGBar;

I mentioned my confusing in another email about having separate enums here.

> +
>  /* Conditions.  Note that these are laid out for easy manipulation by
>     the functions below:
>       bit 0 is used for inverting;


--
Alex Bennée

Reply via email to