> -----Original Message-----
> From: Jerin Jacob Kollanukkaran [mailto:[email protected]]
> Sent: Monday, January 28, 2019 7:34 AM
> To: [email protected]; Maciej Czekaj <[email protected]>; Eads, Gage
> <[email protected]>; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> Richardson, Bruce <[email protected]>; [email protected];
> Ananyev, Konstantin <[email protected]>
> Subject: Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation
>
> On Fri, 2019-01-25 at 17:21 +0000, Eads, Gage wrote:
> > > -----Original Message-----
> > > From: Ola Liljedahl [mailto:[email protected]]
> > > Sent: Wednesday, January 23, 2019 4:16 AM
> > > To: Eads, Gage <[email protected]>; [email protected]
> > > Cc: [email protected]; [email protected]; nd
> > > <[email protected]>; Richardson, Bruce <[email protected]>;
> > > [email protected]; Ananyev, Konstantin
> > > <[email protected]>
> > > Subject: Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking
> > > implementation
> > >
> > > s.
> > >
> > > > You can tell this code was written when I thought x86-64 was the
> > > > only viable target :). Yes, you are correct.
> > > >
> > > > With regards to using __atomic intrinsics, I'm planning on taking
> > > > a similar approach to the functions duplicated in
> > > > rte_ring_generic.h and rte_ring_c11_mem.h: one version that uses
> > > > rte_atomic functions (and thus stricter memory ordering) and one
> > > > that uses __atomic intrinsics (and thus can benefit from more
> > > > relaxed memory ordering).
> > > What's the advantage of having two different implementations? What
> > > is the disadvantage?
> > >
> > > The existing ring buffer code originally had only the "legacy"
> > > implementation, which was kept when the __atomic implementation was
> > > added. The reason claimed was that some older compilers for x86 do
> > > not support GCC __atomic builtins. But I thought there was consensus
> > > that new functionality could have only __atomic implementations.
> > >
> >
> > When CONFIG_RTE_RING_USE_C11_MEM_MODEL was introduced, it was left
> > disabled for thunderx[1] for performance reasons. Assuming that hasn't
> > changed, the advantage to having two versions is to best support all
> > of DPDK's platforms. The disadvantage is of course duplicated code and
> > the additional maintenance burden.
> >
> > That said, if the thunderx maintainers are ok with it, I'm certainly
>
> The ring code was such a fundamental building block for DPDK, there was
> a difference in performance, and there was already legacy code, so
> introducing C11_MEM_MODEL was justified IMO.
>
> For the nonblocking implementation, I am happy to test with three ARM64
> microarchitectures and share the results for C11_MEM_MODEL vs non
> C11_MEM_MODEL performance. We may need to consider PPC also here. So
> IMO, based on the overall performance results, maybe we can decide the
> direction of the new code.

Appreciate the help. Please hold off on any testing until we've had a chance to incorporate ideas from lfring, which will definitely affect performance.
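
For anyone following the thread, here is a minimal sketch of the two styles discussed in the quoted text: a "generic" variant that wraps plain accesses in full barriers, and a variant that uses GCC __atomic builtins with explicit, weaker orderings. The struct and every sketch_* name are hypothetical and exist only for illustration. They are not the rte_ring or patch code, and the memory orders chosen are assumptions rather than a statement of what either implementation uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical head/tail pair; NOT DPDK's ring layout. */
struct sketch_headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
};

/*
 * "Generic" style: plain loads/stores bracketed by full barriers.
 * __sync_synchronize() stands in for a full memory fence here.
 */
static inline bool
sketch_move_head_generic(struct sketch_headtail *ht, uint32_t n,
			 uint32_t *old_head)
{
	uint32_t head = ht->head;

	__sync_synchronize();	/* full fence after reading head */
	*old_head = head;
	/* __sync CAS is documented as a full barrier */
	return __sync_bool_compare_and_swap(&ht->head, head, head + n);
}

static inline void
sketch_update_tail_generic(struct sketch_headtail *ht, uint32_t new_tail)
{
	__sync_synchronize();	/* full fence before publishing tail */
	ht->tail = new_tail;
}

/*
 * "C11" style: the ordering is attached to each access, so a weakly
 * ordered CPU only pays for the ordering that was actually requested.
 */
static inline bool
sketch_move_head_c11(struct sketch_headtail *ht, uint32_t n,
		     uint32_t *old_head)
{
	uint32_t head = __atomic_load_n(&ht->head, __ATOMIC_ACQUIRE);

	*old_head = head;
	/* strong CAS; relaxed ordering on both success and failure */
	return __atomic_compare_exchange_n(&ht->head, &head, head + n,
					   false, __ATOMIC_RELAXED,
					   __ATOMIC_RELAXED);
}

static inline void
sketch_update_tail_c11(struct sketch_headtail *ht, uint32_t new_tail)
{
	/* release: earlier writes become visible before the new tail */
	__atomic_store_n(&ht->tail, new_tail, __ATOMIC_RELEASE);
}

static inline uint32_t
sketch_load_tail_c11(struct sketch_headtail *ht)
{
	/* acquire: pairs with the release store above */
	return __atomic_load_n(&ht->tail, __ATOMIC_ACQUIRE);
}
```

Both variants behave the same single-threaded. The difference only shows up in the instructions emitted on weakly ordered targets, which is why per-platform measurements like the ones offered above are needed to decide between them.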

