We detected the out of order packet desc read
issue with GCC6.1. This happened after we apply
some other patch. Actually by adding some dummy
code will also make this happen.
So add compile barrier should be a clean solution.
The performance test show no impact with this
patch. This does make sense because vPMD is already
fine tuned and the sequence we coded is actually
the execute sequence we want.

Signed-off-by: Zhang Qi <qi.z.zhang at intel.com>
---
 drivers/net/i40e/i40e_rxtx_vec.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
index 51fb282..ce9c7d8 100644
--- a/drivers/net/i40e/i40e_rxtx_vec.c
+++ b/drivers/net/i40e/i40e_rxtx_vec.c
@@ -282,7 +282,7 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct 
rte_mbuf **rx_pkts,
                /* Read desc statuses backwards to avoid race condition */
                /* A.1 load 4 pkts desc */
                descs[3] = _mm_loadu_si128((__m128i *)(rxdp + 3));
-
+               rte_compiler_barrier();
                /* B.2 copy 2 mbuf point into rx_pkts  */
                _mm_storeu_si128((__m128i *)&rx_pkts[pos], mbp1);

@@ -290,8 +290,10 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct 
rte_mbuf **rx_pkts,
                mbp2 = _mm_loadu_si128((__m128i *)&sw_ring[pos+2]);

                descs[2] = _mm_loadu_si128((__m128i *)(rxdp + 2));
+               rte_compiler_barrier();
                /* B.1 load 2 mbuf point */
                descs[1] = _mm_loadu_si128((__m128i *)(rxdp + 1));
+               rte_compiler_barrier();
                descs[0] = _mm_loadu_si128((__m128i *)(rxdp));

                /* B.2 copy 2 mbuf point into rx_pkts  */
-- 
2.7.4

Reply via email to