On 7 Feb 2014 15:45, "Sean Kelly" <s...@invisibleduck.org> wrote:
>
> On Friday, 7 February 2014 at 11:17:49 UTC, Stanislav Blinov wrote:
>>
>> On Friday, 7 February 2014 at 08:10:58 UTC, Sean Kelly wrote:
>>>
>>> Weird. atomicLoad(raw) should be the same as atomicLoad(acq), and atomicStore(raw) should be the same as atomicStore(rel). At least on x86. I don't know why that change made a difference in performance.
>>
>> huh?
>>
>> --8<-- core/atomic.d
>>
>> template needsLoadBarrier( MemoryOrder ms )
>> {
>>     enum bool needsLoadBarrier = ms != MemoryOrder.raw;
>> }
>>
>> -->8--
>>
>> Didn't you write this? :)
>
> Oops. I thought that since Intel has officially defined loads as having acquire semantics, I had eliminated the barrier requirement there. But I guess not. I suppose it's an issue worth discussing. Does anyone know offhand what C++0x implementations do for load acquires on x86?

Speaking of which, I need to add 'Update gcc.atomics to use new C++0x intrinsics' to the GDCProjects page - they map closely to what core.atomic is doing, and should see better performance compared to the __sync intrinsics. :)
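
For anyone following along, here is a rough sketch of the difference I mean, using GCC's `__atomic` builtins next to the older `__sync` ones. The names `payload`, `ready`, `publish`, `try_consume` and the `_sync` variants are made up for illustration and are not taken from gcc.atomics or core.atomic. As far as I know, on x86 GCC lowers the acquire load and release store below to ordinary `mov` instructions, while every `__sync` operation is a full-barrier read-modify-write, but check the generated assembly for your own compiler and flags before relying on that.

```cpp
// Sketch only: all names here are hypothetical, chosen for illustration.
static int payload = 0;  // data being handed over
static int ready = 0;    // flag that publishes the data

// Newer builtins: the memory order is an explicit argument.
void publish(int value) {
    payload = value;
    // Store-release: earlier writes may not be reordered after this store.
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

bool try_consume(int* out) {
    // Load-acquire: later reads may not be reordered before this load.
    if (__atomic_load_n(&ready, __ATOMIC_ACQUIRE) == 0)
        return false;
    *out = payload;
    return true;
}

// Older builtins: there is no plain load or store, so both sides fall back
// to read-modify-write operations that act as full barriers.
void publish_sync(int value) {
    payload = value;
    __sync_fetch_and_or(&ready, 1);           // RMW used as a store
}

bool try_consume_sync(int* out) {
    if (__sync_fetch_and_add(&ready, 0) == 0) // RMW used as a load
        return false;
    *out = payload;
    return true;
}
```

The `__atomic` versions let the caller ask for exactly acquire or release ordering, which is roughly what the MemoryOrder argument in core.atomic expresses, so the mapping should be close to one-to-one.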