> x86 has compiler barrier inside the relaxed() API so that code does not
> get reordered. ARM64 architecturally guarantees device writes to be observed
> in order.

There are places where you don't even need a compile barrier between
every write.

I had horrid problems getting some ppc code (for a specific embedded SoC)
optimised to have no extra barriers.
I ended up just writing through 'pointer to volatile' and adding an
explicit 'eieio' between the block of writes and status read.

No less painful was doing a byteswapping write to normal memory.

        David

Reply via email to