ping

On Sat, Aug 26, 2017 at 9:52 AM, Bill Fischofer
<bill.fischo...@linaro.org> wrote:
> On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks <brian.bro...@arm.com> wrote:
>
>> Memory accesses that precede a call to odp_barrier_wait() in program
>> order cannot be reordered to after the call. Similarly, memory
>> accesses that follow a call to odp_barrier_wait() in program order
>> cannot be reordered to before the call.
>>
>> The current implementation of barriers uses sequentially consistent
>> fences on either side of odp_barrier_wait().
>>
>> The correct memory ordering for barriers is release upon entering
>> odp_barrier_wait(), to prevent reordering to after the barrier, and
>> acquire upon exiting odp_barrier_wait(), to prevent reordering to
>> before the barrier.
>>
>> The measurable performance difference is negligible on weakly ordered
>> architectures such as ARM, so the highlight of this change is correctness.
>>
>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
>>
>
> Reviewed-by: Bill Fischofer <bill.fischo...@linaro.org>
>
>
>> ---
>>  platform/linux-generic/odp_barrier.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/platform/linux-generic/odp_barrier.c b/platform/linux-generic/odp_barrier.c
>> index 5eb354de..f70bdbf8 100644
>> --- a/platform/linux-generic/odp_barrier.c
>> +++ b/platform/linux-generic/odp_barrier.c
>> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>         uint32_t count;
>>         int wasless;
>>
>> -       odp_mb_full();
>> +       odp_mb_release();
>>
>>         count   = odp_atomic_fetch_inc_u32(&barrier->bar);
>>         wasless = count < barrier->count;
>> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>                         odp_cpu_pause();
>>         }
>>
>> -       odp_mb_full();
>> +       odp_mb_acquire();
>>  }
>> --
>> 2.14.1
>>
>>
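For anyone following the thread, here is a minimal standalone sketch of the ordering described in the commit message: a release fence on entry to the barrier and an acquire fence on exit. It is not the ODP implementation. All demo_* names are hypothetical, the C11 fences are assumed to be rough equivalents of odp_mb_release() and odp_mb_acquire(), sched_yield() stands in for odp_cpu_pause(), and the wrap-around branch (not visible in the quoted hunks) is an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_MAX_THREADS 8

/* Hypothetical stand-in for odp_barrier_t; not the ODP definition. */
typedef struct {
	uint32_t count;      /* number of participating threads */
	atomic_uint arrived; /* runs over 0 .. 2 * count - 1 */
} demo_barrier_t;

static void demo_barrier_wait(demo_barrier_t *b)
{
	/* Release: accesses before the barrier cannot move after the arrival. */
	atomic_thread_fence(memory_order_release);

	uint32_t n = atomic_fetch_add_explicit(&b->arrived, 1,
					       memory_order_relaxed);
	int wasless = n < b->count;

	if (n == 2 * b->count - 1) {
		/* Last arrival of the second phase: wrap the counter to 0. */
		atomic_fetch_sub_explicit(&b->arrived, 2 * b->count,
					  memory_order_relaxed);
	} else {
		/* Spin until the counter crosses into the other phase. */
		while ((atomic_load_explicit(&b->arrived, memory_order_relaxed)
			< b->count) == wasless)
			sched_yield();
	}

	/* Acquire: accesses after the barrier cannot move before the loads. */
	atomic_thread_fence(memory_order_acquire);
}

typedef struct {
	demo_barrier_t *barrier;
	uint32_t *slots;
	uint32_t id, nthreads, rounds, mismatches;
} demo_worker_t;

static void *demo_worker(void *arg)
{
	demo_worker_t *w = arg;

	for (uint32_t r = 0; r < w->rounds; r++) {
		w->slots[w->id] = r + w->id; /* plain store before the barrier */
		demo_barrier_wait(w->barrier);

		uint32_t sum = 0;
		for (uint32_t i = 0; i < w->nthreads; i++)
			sum += w->slots[i]; /* plain loads after the barrier */
		if (sum != w->nthreads * r + w->nthreads * (w->nthreads - 1) / 2)
			w->mismatches++;

		/* Second wait keeps next round's stores behind these loads. */
		demo_barrier_wait(w->barrier);
	}
	return NULL;
}

/* Runs the demo and returns the number of stale reads observed. */
uint32_t demo_run(uint32_t nthreads, uint32_t rounds)
{
	pthread_t tid[DEMO_MAX_THREADS];
	demo_worker_t w[DEMO_MAX_THREADS];
	uint32_t slots[DEMO_MAX_THREADS] = {0};
	demo_barrier_t barrier;
	uint32_t total = 0;

	assert(nthreads >= 1 && nthreads <= DEMO_MAX_THREADS);
	barrier.count = nthreads;
	atomic_init(&barrier.arrived, 0);

	for (uint32_t i = 0; i < nthreads; i++) {
		w[i] = (demo_worker_t){ &barrier, slots, i, nthreads, rounds, 0 };
		int rc = pthread_create(&tid[i], NULL, demo_worker, &w[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (uint32_t i = 0; i < nthreads; i++) {
		pthread_join(tid[i], NULL);
		total += w[i].mismatches;
	}
	return total;
}
```

Calling demo_run(4, 50) starts four threads that each publish a value with a plain store, wait, and then read every other thread's value with plain loads. The release fence before the relaxed increment and the acquire fence after the relaxed loads are what make those plain accesses visible across the barrier, so the function is expected to report zero mismatches.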
