ping
On Sat, Aug 26, 2017 at 9:52 AM, Bill Fischofer <bill.fischo...@linaro.org> wrote:

> On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks <brian.bro...@arm.com> wrote:
>
>> Memory accesses that happen-before, in program order, a call to
>> odp_barrier_wait() cannot be reordered to after the call. Similarly,
>> memory accesses that happen-after, in program order, a call to
>> odp_barrier_wait() cannot be reordered to before the call.
>>
>> The current implementation of barriers uses sequentially consistent
>> fences on either side of odp_barrier_wait().
>>
>> The correct memory ordering for barriers is release upon entering
>> odp_barrier_wait(), to prevent reordering to after the barrier, and
>> acquire upon exiting odp_barrier_wait(), to prevent reordering to
>> before the barrier.
>>
>> The measurable performance difference is negligible on weakly ordered
>> architectures such as ARM, so the highlight of this change is correctness.
>>
>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
>>
>
> Reviewed-by: Bill Fischofer <bill.fischo...@linaro.org>
>
>
>> ---
>>  platform/linux-generic/odp_barrier.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/platform/linux-generic/odp_barrier.c b/platform/linux-generic/odp_barrier.c
>> index 5eb354de..f70bdbf8 100644
>> --- a/platform/linux-generic/odp_barrier.c
>> +++ b/platform/linux-generic/odp_barrier.c
>> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>  	uint32_t count;
>>  	int wasless;
>>
>> -	odp_mb_full();
>> +	odp_mb_release();
>>
>>  	count = odp_atomic_fetch_inc_u32(&barrier->bar);
>>  	wasless = count < barrier->count;
>> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>  		odp_cpu_pause();
>>  	}
>>
>> -	odp_mb_full();
>> +	odp_mb_acquire();
>>  }
>> --
>> 2.14.1
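
For anyone reading without the ODP tree at hand, the ordering described in the commit message can be sketched with plain C11 atomics. This is only an illustration, not the ODP code: sketch_barrier_t, sketch_barrier_init() and sketch_barrier_wait() are hypothetical names, the C11 fences stand in for odp_mb_release() and odp_mb_acquire(), and the wrap-around branch that the diff context does not show is filled in as an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for odp_barrier_t; not the ODP definition. */
typedef struct {
	uint32_t count;       /* number of participating threads */
	_Atomic uint32_t bar; /* arrival counter, runs from 0 to 2 * count - 1 */
} sketch_barrier_t;

static void sketch_barrier_init(sketch_barrier_t *b, uint32_t count)
{
	assert(count > 0);
	b->count = count;
	atomic_init(&b->bar, 0);
}

static void sketch_barrier_wait(sketch_barrier_t *b)
{
	uint32_t count;
	int wasless;

	/* Release on entry: accesses before the call cannot be reordered
	 * to after the counter update below. */
	atomic_thread_fence(memory_order_release);

	count = atomic_fetch_add_explicit(&b->bar, 1, memory_order_relaxed);
	wasless = count < b->count;

	if (count == 2 * b->count - 1) {
		/* Assumed wrap-around: the last arrival of the second
		 * phase resets the counter in one atomic step. */
		atomic_fetch_sub_explicit(&b->bar, 2 * b->count,
					  memory_order_relaxed);
	} else {
		/* Spin until the counter crosses the phase boundary.
		 * The patched function calls odp_cpu_pause() here. */
		while ((atomic_load_explicit(&b->bar, memory_order_relaxed)
			< b->count) == wasless)
			;
	}

	/* Acquire on exit: accesses after the call cannot be reordered
	 * to before the counter reads above. */
	atomic_thread_fence(memory_order_acquire);
}
```

With a count of 1 a single thread passes straight through, which makes for a quick sanity check: the counter reads 1 after the first wait and wraps back to 0 after the second.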