Hi Thomas, I am afraid I introduced unnecessary complexity in the discussion as the spinlock issues I mentioned are connected to a work in progress on my side (implement a Chelsio cxgb5 PMD) but *not* to the general DPDK.
I'll explain some aspects of the context and how critical sections has to be handled in the case of the Chelsio cxgb5 PMD. 1) WC memory Most memory is cached, some is not cached at all, and some is tagged (via paging attributes) to be "Write Combining". This means that writes to WC memory locations do *NOT* go through the cache but rather to a memory "write buffer" internal to the processor. The processor delays write to memory as it please, trying to minimize the actual number of writes to DRAM. This type of memory is used by some devices as a fast interface, for instance Chelsio cxgb5 write queue can filled using this method (http://lwn.net/Articles/542643/). Public DPDK cannot declare such memory type as it requires kernel support. 2) fencing and out of order execution Out of order execution is related to the processor that may actually change the order of the instructions of a program as it has been produced by the compiler. In general, this relates only to performance but for critical sections, it is absolutely necessary to ensure that the processor does NOT postpone critical section instructions AFTER the unlock. So there is a need of a "fence" to tell the processor that some instruction blocks out of order execution: no previous instruction can be postponed after the "fence". 3) Public DPDK critical sections The DPDK implements lock/unlock with xchg instruction which is atomic and serializing for "normal" cached memory. It includes an implicit lock prefix, so no need to add such prefix (your patch proposal). This (implicit) lock prefix is actually acting as an out of order execution fence, thus ensuring that critical section is working as expected. Nothing has to be added or changed for public DPDK to have correct critical sections. 4) "Extended" DPDK critical sections Because of Chelsio cxgb5 PMD driver I am developing, I added kernel support to DPDK to declare some memory regions as Write Combining, hence to be able to leverage the output queue fast path. Because of this, I also had to add proper fencing for critical sections. The relation is the fact that writes to WC memory does NOT go through the cache AND that the lock prefix is implemented as a cache protocol transaction. In other words, a lock prefix is NOT an out of order instruction fence for instructions that deal with WC memory. So, the processor may decide to postpone the WC related instruction ATFER the unlock. To create a proper out of order execution fence, the processor has a set of explicit memory fences that do the job. So before a rte_unlock, I have to add a rte_mb(). So: - for people willing to develop applications on top of DPDK, my comments can be disregarded, they are not relevant, there is no need to understand them. - for people willing to develop PMD drivers, the fencing comments should be clear to use the proper fencing. I am sorry for any confusion I may have introduced in the community. Fran?ois-Fr?d?ric > -----Message d'origine----- > De?: dev [mailto:dev-bounces at dpdk.org] De la part de Thomas Monjalon > Envoy??: samedi 21 d?cembre 2013 00:38 > ??: dev at dpdk.org > Objet?: [dpdk-dev] [PATCH] spinlock: fix atomic and out of order execution > > From: Damien Millescamps <damien.millescamps at 6wind.com> > > Add lock prefix before xchg instructions in order to be atomic and flush > speculative values to ensure effective execution order (as an acquire > barrier). > > MPLOCKED is a "lock" in multicore case. > > Signed-off-by: Damien Millescamps <damien.millescamps at 6wind.com> > Signed-off-by: Thomas Monjalon <thomas.monjalon at 6wind.com> > --- > lib/librte_eal/common/include/rte_spinlock.h | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_spinlock.h > b/lib/librte_eal/common/include/rte_spinlock.h > index f7a245a..8edb971 100644 > --- a/lib/librte_eal/common/include/rte_spinlock.h > +++ b/lib/librte_eal/common/include/rte_spinlock.h > @@ -51,6 +51,7 @@ > extern "C" { > #endif > > +#include <rte_atomic.h> > #include <rte_lcore.h> > #ifdef RTE_FORCE_INTRINSICS > #include <rte_common.h> > @@ -93,7 +94,7 @@ rte_spinlock_lock(rte_spinlock_t *sl) > int lock_val = 1; > asm volatile ( > "1:\n" > - "xchg %[locked], %[lv]\n" > + MPLOCKED "xchg %[locked], %[lv]\n" > "test %[lv], %[lv]\n" > "jz 3f\n" > "2:\n" > @@ -124,7 +125,7 @@ rte_spinlock_unlock (rte_spinlock_t *sl) #ifndef > RTE_FORCE_INTRINSICS > int unlock_val = 0; > asm volatile ( > - "xchg %[locked], %[ulv]\n" > + MPLOCKED "xchg %[locked], %[ulv]\n" > : [locked] "=m" (sl->locked), [ulv] "=q" (unlock_val) > : "[ulv]" (unlock_val) > : "memory"); > @@ -148,7 +149,7 @@ rte_spinlock_trylock (rte_spinlock_t *sl) > int lockval = 1; > > asm volatile ( > - "xchg %[locked], %[lockval]" > + MPLOCKED "xchg %[locked], %[lockval]" > : [locked] "=m" (sl->locked), [lockval] "=q" (lockval) > : "[lockval]" (lockval) > : "memory"); > -- > 1.7.10.4