-----Original Message-----
> Date: Thu, 20 Sep 2018 15:20:53 +0000
> From: Honnappa Nagarahalli <[email protected]>
> To: Jerin Jacob <[email protected]>, "Phil Yang (Arm
> Technology China)" <[email protected]>
> CC: "[email protected]" <[email protected]>, nd <[email protected]>,
> "[email protected]" <[email protected]>,
> "Gavin Hu (Arm Technology China)" <[email protected]>,
> "[email protected]" <[email protected]>
> Subject: RE: [PATCH v2 2/3] kni: fix kni fifo synchronization
>
>
> > -----Original Message-----
> > > Date: Wed, 19 Sep 2018 21:42:39 +0800
> > > From: Phil Yang <[email protected]>
> > > To: [email protected]
> > > CC: [email protected], [email protected],
> > > [email protected], [email protected],
> > > [email protected]
> > > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> > > X-Mailer: git-send-email 2.7.4
> > >
> >
> > + Ferruh Yigit <[email protected]>
> >
> > >
> > > With existing code in kni_fifo_put, rx_q values are not being updated
> > > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > > This is causing the sync issue on other core. The same situation
> > > happens in kni_fifo_get as well.
> > >
> > > So syncing the values by adding C11 atomic memory barriers to make
> > > sure the values being synced before updating fifo_write and fifo_read.
> > >
> > > Fixes: 3fc5ca2 ("kni: initial import")
> > > Signed-off-by: Phil Yang <[email protected]>
> > > Reviewed-by: Honnappa Nagarahalli <[email protected]>
> > > Reviewed-by: Gavin Hu <[email protected]>
> > > ---
> > > .../linuxapp/eal/include/exec-env/rte_kni_common.h | 5 ++++
> > > lib/librte_kni/rte_kni_fifo.h | 30
> > > +++++++++++++++++++++-
> > > 2 files changed, 34 insertions(+), 1 deletion(-)
> > >
> > > diff --git
> > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > index cfa9448..1fd713b 100644
> > > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > @@ -54,8 +54,13 @@ struct rte_kni_request {
> > > * Writing should never overwrite the read position
> > > */
> > > struct rte_kni_fifo {
> > > +#ifndef RTE_USE_C11_MEM_MODEL
> > > volatile unsigned write; /**< Next position to be written*/
> > > volatile unsigned read; /**< Next position to be read */
> > > +#else
> > > + unsigned write; /**< Next position to be written*/
> > > + unsigned read; /**< Next position to be read */
> > > +#endif
> > > unsigned len; /**< Circular buffer length */
> > > unsigned elem_size; /**< Pointer size - for 32/64 bit OS
> > > */
> > > void *volatile buffer[]; /**< The buffer contains mbuf
> > > pointers */
> > > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > > --- a/lib/librte_kni/rte_kni_fifo.h
> > > +++ b/lib/librte_kni/rte_kni_fifo.h
> > > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num) {
> > > unsigned i = 0;
> > > unsigned fifo_write = fifo->write;
> > > - unsigned fifo_read = fifo->read;
> > > unsigned new_write = fifo_write;
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > + unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > + __ATOMIC_ACQUIRE);
> > > +#else
> > > + unsigned fifo_read = fifo->read; #endif
> >
> > Correct.
>
> My apologies, did not follow your comment here. Do you want us to correct
> anything here? '#endif' is not appearing on the correct line in the email,
> but it shows up fine on the patch work.
No. What I meant is, code is correct.
>
> >
> >
> > >
> > > for (i = 0; i < num; i++) {
> > > new_write = (new_write + 1) & (fifo->len - 1); @@
> > > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data,
> > unsigned num)
> > > fifo->buffer[fifo_write] = data[i];
> > > fifo_write = new_write;
> > > }
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > + __atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> > > +#else
> > > + rte_smp_wmb();
> > > fifo->write = fifo_write;
> > > +#endif
> >
> > Correct.
> > > return i;
> > > }
> > >
> > > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num) {
> > > unsigned i = 0;
> > > unsigned new_read = fifo->read;
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > + unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > +__ATOMIC_ACQUIRE); #else
> > > unsigned fifo_write = fifo->write;
> > > +#endif
> >
> > Correct.
> >
> > > +
> > > for (i = 0; i < num; i++) {
> > > if (new_read == fifo_write)
> > > break;
> > > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data,
> > unsigned num)
> > > data[i] = fifo->buffer[new_read];
> > > new_read = (new_read + 1) & (fifo->len - 1);
> > > }
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > + __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> > > +#else
> > > + rte_smp_wmb();
> > > fifo->read = new_read;
> > > +#endif
> >
> > Correct.
> >
> > > return i;
> > > }
> > >
> > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num) static inline uint32_t kni_fifo_count(struct
> > > rte_kni_fifo *fifo) {
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > + unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > + __ATOMIC_ACQUIRE);
> > > + unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > + __ATOMIC_ACQUIRE);
> >
> > Isn't too heavy to have two __ATOMIC_ACQUIREs? a simple rte_smp_rmb()
> > would be enough here. Right?
> > or
> > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> >
> We also had some amount of debate internally on this:
> 1) We do not want to use rte_smp_rmb() as we want to keep the memory models
> separated (for ex: while using C11, use C11 everywhere). It is also not
> sufficient, please see 3) below.
But Nothing technically wrong in using rte_smp_rmb() here in terms
functionally and code generated by the compiler.
> 2) This API can get called from writer or reader, so both the loads have to
> be __ATOMIC_ACQUIRE
> 3) Other option is to use __ATOMIC_RELAXED. That would allow any loads/stores
> around of this API to get reordered, especially since this is an inline
> function. This would put burden on the application to manage the ordering
> depending on its usage. It will also require the application to understand
> the implementation of this API.
__ATOMIC_RELAXED may be fine too for _count() case as it may not very
important to get the exact count for the exact very moment, Application can
retry.
I am in favor of performance effective implementation.
>
> >
> > Other than that, I prefer to avoid ifdef clutter by introducing two
> > separate file
> > just like ring C11 implementation.
> >
> > I don't have strong opinion on this this part, I let KNI MAINTAINER to
> > decide
> > on how to accommodate this change.
>
> I prefer to change this as well, I am open for suggestions.
> Introducing two separate files would be too much for this library. A better
> way would be to have something similar to 'smp_store_release' provided by the
> kernel. i.e. create #defines for loads/stores. Hide the clutter behind the
> #defines.
No Strong opinion on this, leaving to KNI Maintainer.
This patch needs to split by two,
a) Fixes for non C11 implementation(i.e new addition to rte_smp_wmb())
b) add support for C11 implementation.
>
> >
> > > + return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
> > > +#else
> > > return (fifo->len + fifo->write - fifo->read) & (fifo->len -
> > > 1);
> > > +#endif
> > > }
> > > --
> > > 2.7.4
> > >