Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier

Alex Bennée Tue, 21 Jun 2016 11:25:46 -0700

Pranith Kumar <bobby.pr...@gmail.com> writes:

> On Tue, Jun 21, 2016 at 2:04 PM, Alex Bennée <alex.ben...@linaro.org> wrote:
>>
>> Pranith Kumar <bobby.pr...@gmail.com> writes:
>>
>>> This commit introduces the TCGOpcode for memory barrier instruction.
>>>
>>> This opcode takes an argument which is the type of memory barrier
>>> which should be generated.
>>>
>>> Signed-off-by: Pranith Kumar <bobby.pr...@gmail.com>
>>> Signed-off-by: Richard Henderson <r...@twiddle.net>
>>> ---
>>>  tcg/README    | 17 +++++++++++++++++
>>>  tcg/tcg-op.c  | 11 +++++++++++
>>>  tcg/tcg-op.h  |  2 ++
>>>  tcg/tcg-opc.h |  2 ++
>>>  tcg/tcg.h     | 14 ++++++++++++++
>>>  5 files changed, 46 insertions(+)
>>>
>>> diff --git a/tcg/README b/tcg/README
>>> index ce8beba..1d48aa9 100644
>>> --- a/tcg/README
>>> +++ b/tcg/README
>>> @@ -402,6 +402,23 @@ double-word product T0.  The later is returned in two 
>>> single-word outputs.
>>>
>>>  Similar to mulu2, except the two inputs T1 and T2 are signed.
>>>
>>> +********* Memory Barrier support
>>> +
>>> +* mb <$arg>
>>> +
>>> +Generate a target memory barrier instruction to ensure memory ordering as 
>>> being
>>> +enforced by a corresponding guest memory barrier instruction. The ordering
>>> +enforced by the backend may be stricter than the ordering required by the 
>>> guest.
>>> +It cannot be weaker. This opcode takes a constant argument which is 
>>> required to
>>> +generate the appropriate barrier instruction. The backend should take care 
>>> to
>>> +emit the target barrier instruction only when necessary i.e., for SMP 
>>> guests and
>>> +when MTTCG is enabled.
>>> +
>>> +The guest translators should generate this opcode for all guest 
>>> instructions
>>> +which have ordering side effects.
>>> +
>>> +Please see docs/atomics.txt for more information on memory barriers.
>>> +
>>>  ********* 64-bit guest on 32-bit host support
>>>
>>>  The following opcodes are internal to TCG.  Thus they are to be 
>>> implemented by
>>> diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
>>> index 54c0277..08f7858 100644
>>> --- a/tcg/tcg-op.c
>>> +++ b/tcg/tcg-op.c
>>> @@ -39,6 +39,8 @@ extern TCGv_i32 TCGV_HIGH_link_error(TCGv_i64);
>>>  #define TCGV_HIGH TCGV_HIGH_link_error
>>>  #endif
>>>
>>> +extern int smp_cpus;
>>> +
>>>  /* Note that this is optimized for sequential allocation during translate.
>>>     Up to and including filling in the forward link immediately.  We'll do
>>>     proper termination of the end of the list after we finish translation.  
>>> */
>>> @@ -146,6 +148,15 @@ void tcg_gen_op6(TCGContext *ctx, TCGOpcode opc, 
>>> TCGArg a1, TCGArg a2,
>>>      tcg_emit_op(ctx, opc, pi);
>>>  }
>>>
>>> +void tcg_gen_mb(TCGArg mb_type)
>>> +{
>>> +#ifndef CONFIG_USER_ONLY
>>> +    if (qemu_tcg_mttcg_enabled() && smp_cpus > 1) {
>>> +        tcg_gen_op1(&tcg_ctx, INDEX_op_mb, mb_type);
>>> +    }
>>> +#endif
>>> +}
>>> +
>>
>> This introduces a dependency on MTTCG which maybe is best handled in
>> that patch series. I get the feeling we might be able to merge this
>> series before MTTCG as barriers are still required for user-mode. Maybe
>> something like this makes more sense to keep the patch series
>> independent for now:
>>
>>     void tcg_gen_mb(TCGArg mb_type)
>>     {
>>     #ifdef CONFIG_USER_ONLY
>>         const bool emit_barriers = true;
>>     #else
>>         /* TODO: When MTTCG is available for system mode
>>          * we will enable when qemu_tcg_mttcg_enabled() && smp_cpus > 1
>>          */
>>         bool emit_barriers = false;
>>     #endif
>>         if (emit_barriers) {
>>             tcg_gen_op1(&tcg_ctx, INDEX_op_mb, mb_type);
>>         }
>>     }
>
> I was basing my patches on top of your mttcg base tree, since I can
> test these patches only with MTTCG enabled :)
>
> If you think I should rebase on mainine, I will do so with the changes
> you suggested above.


Well I'll be guided by what Richard thinks about the chances of this
merging ahead of the main MTTCG patches. You were mentioning the trouble
with testing with the kvm-unit-tests based tests which all need MTTCG
system mode to trigger problems. However user mode emulation "works" now
(even better since Peter's signal fixes) so it might be worth
investigating if we can leverage some user-mode testing instead.

Also generally keeping patch series clean and self contained stops too
large a pile of patches getting pilled up waiting for the controversial
bits to get reviewed and merged.

>
>>
>> Maybe it is time to look at the user-mode memory model testers like:
>>
>>   http://diy.inria.fr/
>
> Interesting. I will take a look. Thanks for the link.
>
>>
>> Rich,
>>
>> What do you think? Do you think the memory barrier stuff should be kept
>> independent so it can be merged once ready?
>>
>>>  /* 32 bit ops */
>>>
>>>  void tcg_gen_addi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
>>> diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
>>> index f217e80..41890cc 100644
>>> --- a/tcg/tcg-op.h
>>> +++ b/tcg/tcg-op.h
>>> @@ -261,6 +261,8 @@ static inline void tcg_gen_br(TCGLabel *l)
>>>      tcg_gen_op1(&tcg_ctx, INDEX_op_br, label_arg(l));
>>>  }
>>>
>>> +void tcg_gen_mb(TCGArg a);
>>> +
>>>  /* Helper calls. */
>>>
>>>  /* 32 bit ops */
>>> diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
>>> index 6d0410c..45528d2 100644
>>> --- a/tcg/tcg-opc.h
>>> +++ b/tcg/tcg-opc.h
>>> @@ -42,6 +42,8 @@ DEF(br, 0, 0, 1, TCG_OPF_BB_END)
>>>  # define IMPL64  TCG_OPF_64BIT
>>>  #endif
>>>
>>> +DEF(mb, 0, 0, 1, 0)
>>> +
>>>  DEF(mov_i32, 1, 1, 0, TCG_OPF_NOT_PRESENT)
>>>  DEF(movi_i32, 1, 0, 1, TCG_OPF_NOT_PRESENT)
>>>  DEF(setcond_i32, 1, 2, 1, 0)
>>> diff --git a/tcg/tcg.h b/tcg/tcg.h
>>> index db6a062..36feca9 100644
>>> --- a/tcg/tcg.h
>>> +++ b/tcg/tcg.h
>>> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL 
>>> GET_TCGV_PTR(TCGv_ptr t)
>>>  #define TCG_CALL_DUMMY_TCGV     MAKE_TCGV_I32(-1)
>>>  #define TCG_CALL_DUMMY_ARG      ((TCGArg)(-1))
>>>
>>> +typedef enum {
>>> +    TCG_MO_LD_LD    = 1,
>>> +    TCG_MO_ST_LD    = 2,
>>> +    TCG_MO_LD_ST    = 4,
>>> +    TCG_MO_ST_ST    = 8,
>>> +    TCG_MO_ALL      = 0xF, // OR of all above
>>> +} TCGOrder;
>>> +
>>> +typedef enum {
>>> +    TCG_BAR_ACQ     = 32,
>>> +    TCG_BAR_REL     = 64,
>>> +    TCG_BAR_SC      = 128,
>>> +} TCGBar;
>>
>> I mentioned my confusing in another email about having separate enums here.
>
>
> Noted. I will add a comment explaining the split.


--
Alex Bennée

Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier

Reply via email to