One caveat: a compiler barrier should be enough when both sides use strongly ordered memory operations (as is the case for the rings). Weakly ordered operations will still need fencing.
-Venky

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Stephen Hemminger
Sent: Thursday, March 27, 2014 1:20 PM
To: Olivier MATZ
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] memory barriers in rte_ring

On Thu, 27 Mar 2014 20:47:37 +0100
Olivier MATZ <olivier.matz at 6wind.com> wrote:

> Hi Stephen,
>
> On 03/27/2014 08:06 PM, Stephen Hemminger wrote:
> > Long answer: for the multiple CPU access ring, it is equivalent to
> > smp_wmb and smp_rmb in the Linux kernel. For x86, where DPDK is used,
> > this can normally be replaced by a simpler compiler barrier. In the
> > kernel there is a special flag, X86_OOSTORE, which is only enabled
> > for a few special cases; for most cases it is not. When the CPU
> > doesn't do out-of-order stores, there are no cases where another CPU
> > will see the wrong state.
>
> Thank you for this clarification.
>
> So, if I understand properly, all usages of rte_*mb() sequencing
> memory operations between CPUs could be replaced by a compiler
> barrier. On the other hand, if the memory is also accessed by a
> device, a memory barrier has to be used.
>
> Olivier
>

I think so for the current architecture that DPDK runs on. It might be good to abstract this in some way for eventual users in other environments.
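
To make the abstraction idea concrete, here is a rough sketch in C11 of what such a wrapper could look like. It is only an illustration under stated assumptions: the sketch_* names and the tiny single-producer/single-consumer ring are hypothetical and are not the rte_ring code or any existing DPDK API. On x86 the CPU-to-CPU barriers collapse to a compiler barrier, while other targets fall back to real acquire/release fences. It does not cover memory shared with a device or weakly ordered stores, which still need a hardware fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the DPDK API. */

/* Stops compiler reordering only; emits no fence instruction. */
#define sketch_compiler_barrier() atomic_signal_fence(memory_order_seq_cst)

#if defined(__x86_64__) || defined(__i386__)
/* Assumption: ordinary write-back memory, no non-temporal stores.
 * x86 keeps load-load, store-store and load-store order in hardware. */
#define sketch_publish_barrier() sketch_compiler_barrier()
#define sketch_consume_barrier() sketch_compiler_barrier()
#else
/* Weaker memory models need real fences between CPUs. */
#define sketch_publish_barrier() atomic_thread_fence(memory_order_release)
#define sketch_consume_barrier() atomic_thread_fence(memory_order_acquire)
#endif

#define SKETCH_RING_SIZE 8u /* must be a power of two */

struct sketch_ring {
    volatile uint32_t head; /* written only by the producer */
    volatile uint32_t tail; /* written only by the consumer */
    int slots[SKETCH_RING_SIZE];
};

/* Single producer: returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, int v)
{
    assert(r != NULL);
    uint32_t head = r->head;
    if (head - r->tail == SKETCH_RING_SIZE)
        return -1;
    sketch_consume_barrier(); /* tail load before the slot is reused */
    r->slots[head & (SKETCH_RING_SIZE - 1)] = v;
    sketch_publish_barrier(); /* slot store before the new head is visible */
    r->head = head + 1;
    return 0;
}

/* Single consumer: returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, int *out)
{
    assert(r != NULL && out != NULL);
    uint32_t tail = r->tail;
    if (tail == r->head)
        return -1;
    sketch_consume_barrier(); /* head load before the slot load */
    *out = r->slots[tail & (SKETCH_RING_SIZE - 1)];
    sketch_publish_barrier(); /* slot load before the slot is released */
    r->tail = tail + 1;
    return 0;
}
```

The point of the split is that callers only say what ordering they need, and the per-architecture choice between a compiler barrier and a fence stays in one place.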