Hi Dong,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of WangDong
> Sent: Tuesday, May 05, 2015 4:38 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA
> processor's rte_wmb/rte_rmb.
>
> The current implementation of rte_wmb/rte_rmb for x86 is using processor
> memory barrier. It's unnessary for IA processor, compiler
> memory barrier is enough.

I wouldn't say they are 'unnecessary'.
There are situations, even on IA, when you need _fence_ instructions.
So, please leave the rte_*mb() macros unmodified.
I still think that we need to create a new set of architecture-dependent
macros, as discussed before.
Probably, by analogy with the Linux kernel, rte_smp_*mb() is a good name
for them.
Though if you have a better name in mind, I am open to suggestions here.

> But if dpdk runing on a AMD processor, maybe we should use processor memory
> barrier.

As far as I remember, AMD has the same memory ordering model.
So, I don't think we need #ifdef RTE_ARCH_X86_IA here.
Konstantin

> I add a macro to distinguish them, if we compile DPDK for IA processor, add
> the macro (RTE_ARCH_X86_IA) can improve performance
> with compiler memory barrier. Or we can add RTE_ARCH_X86_AMD for using
> processor memory barrier, in this case, if didn't add the
> macro, the memory ordering will not be guaranteed. Which macro is better?
> If this patch applied, the PMD's old implementation of compiler memory
> barrier (some volatile variable) can be fixed with rte_rmb()
> and rte_wmb() for any architecture.
>
> ---
> lib/librte_eal/common/include/arch/x86/rte_atomic.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> index e93e8ee..52b1e81 100644
> --- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> @@ -49,10 +49,20 @@ extern "C" {
>
> #define rte_mb() _mm_mfence()
>
> +#ifdef RTE_ARCH_X86_IA
> +
> +#define rte_wmb() rte_compiler_barrier()
> +
> +#define rte_rmb() rte_compiler_barrier()
> +
> +#else
> +
> #define rte_wmb() _mm_sfence()
>
> #define rte_rmb() _mm_lfence()
>
> +#endif
> +
> /*------------------------- 16 bit atomic operations
> -------------------------*/
>
> #ifndef RTE_FORCE_INTRINSICS
> --
> 1.9.1
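
For illustration only, here is a minimal sketch of how the separate macro set suggested in the reply might look. It is not the code that was merged. The rte_smp_*mb() names come from the suggestion in the reply, but their definitions here are assumptions, and struct mailbox, mailbox_publish() and mailbox_try_consume() are hypothetical helpers made up for the example. It assumes GCC or Clang, and ordinary write-back memory with no non-temporal stores, which is the case where x86 already keeps store-store and load-load order in hardware. The non-x86 branch is a fallback so that the sketch builds elsewhere; it is not a tuned implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_mfence(), _mm_lfence() */
#include <xmmintrin.h>  /* _mm_sfence() */
#endif

/* Compiler-only barrier: forbids the compiler from moving memory accesses
 * across this point, but emits no instruction. */
#define rte_compiler_barrier() do { __asm__ volatile ("" : : : "memory"); } while (0)

#if defined(__x86_64__)
/* Existing macros, left unmodified: real fence instructions, still needed
 * for cases such as non-temporal stores or write-combining memory. */
#define rte_mb()  _mm_mfence()
#define rte_wmb() _mm_sfence()
#define rte_rmb() _mm_lfence()

/* Assumed definitions for the proposed SMP variants (core-to-core ordering
 * on ordinary memory). x86 can still reorder a store with a later load,
 * so the full barrier stays a real fence. */
#define rte_smp_mb()  rte_mb()
#define rte_smp_wmb() rte_compiler_barrier()
#define rte_smp_rmb() rte_compiler_barrier()
#else
/* Fallback for this sketch only: a full barrier everywhere. */
#define rte_mb()      __sync_synchronize()
#define rte_wmb()     __sync_synchronize()
#define rte_rmb()     __sync_synchronize()
#define rte_smp_mb()  __sync_synchronize()
#define rte_smp_wmb() __sync_synchronize()
#define rte_smp_rmb() __sync_synchronize()
#endif

/* Hypothetical one-slot mailbox shared between two cores. */
struct mailbox {
	uint32_t data;
	volatile uint32_t ready;
};

/* Producer: write the payload, then make the flag visible after it. */
static void mailbox_publish(struct mailbox *m, uint32_t value)
{
	m->data = value;
	rte_smp_wmb();  /* payload store must not pass the flag store */
	m->ready = 1;
}

/* Consumer: returns 1 and fills *out if a value was published, else 0. */
static int mailbox_try_consume(struct mailbox *m, uint32_t *out)
{
	assert(out != NULL);
	if (!m->ready)
		return 0;
	rte_smp_rmb();  /* flag load must not be passed by the payload load */
	*out = m->data;
	return 1;
}
```

With this split, code that only orders ordinary loads and stores between cores would call rte_smp_wmb()/rte_smp_rmb(), while code that really needs a fence instruction keeps using rte_wmb()/rte_rmb() unchanged.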