> -----Original Message-----
> From: Slava Ovsiienko <[email protected]>
> Sent: Thursday, February 10, 2022 4:17 PM
> To: Ruifeng Wang <[email protected]>; Matan Azrad <[email protected]>
> Cc: [email protected]; Honnappa Nagarahalli <[email protected]>; [email protected]; nd <[email protected]>; nd <[email protected]>
> Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
>
> Hi Ruifeng,
Hi Slava,

> Patch looks reasonable, thank you.
> Just curious - did you see the real issue with re-ordering in this code fragment?

No real issue was seen. It is analysis from an architecture perspective.

> And, please, let us do performance check.

Sure. Thank you.

> With best regards,
> Slava
>
> > -----Original Message-----
> > From: Ruifeng Wang <[email protected]>
> > Sent: Thursday, February 10, 2022 8:25
> > To: Matan Azrad <[email protected]>; Slava Ovsiienko <[email protected]>
> > Cc: [email protected]; Honnappa Nagarahalli <[email protected]>; [email protected]; nd <[email protected]>; Ruifeng Wang <[email protected]>; nd <[email protected]>
> > Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> >
> > Ping.
> > Could you please help to review this patch?
> >
> > Thanks.
> > Ruifeng
> >
> > > -----Original Message-----
> > > From: Ruifeng Wang <[email protected]>
> > > Sent: Tuesday, January 4, 2022 11:01 AM
> > > To: [email protected]; [email protected]
> > > Cc: [email protected]; Honnappa Nagarahalli <[email protected]>; [email protected]; nd <[email protected]>; Ruifeng Wang <[email protected]>
> > > Subject: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector path
> > >
> > > In the NEON vector PMD, a vector load reads two contiguous 8B words of
> > > descriptor data into a vector register. Since the vector load does not
> > > guarantee 16B atomicity, the read of the word that includes the op_own
> > > field could be reordered after the reads of the other words. In that
> > > case, some words could contain invalid data.
> > >
> > > Reload qword0 after the read barrier to update the vector register.
> > > This ensures that the fetched data is correct.
> > >
> > > Testpmd single core tests on N1SDP/ThunderX2 showed no performance
> > > drop.
> > >
> > > Fixes: 1742c2d9fab0 ("net/mlx5: fix synchronization on polling Rx completions")
> > > Cc: [email protected]
> > >
> > > Signed-off-by: Ruifeng Wang <[email protected]>
> > > ---
> > >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > > index b1d16baa61..b1ec615b51 100644
> > > --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > > @@ -647,6 +647,14 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
> > >  		c0 = vld1q_u64((uint64_t *)(p0 + 48));
> > >  		/* Synchronize for loading the rest of blocks. */
> > >  		rte_io_rmb();
> > > +		/* B.0 (CQE 3) reload lower half of the block. */
> > > +		c3 = vld1q_lane_u64((uint64_t *)(p3 + 48), c3, 0);
> > > +		/* B.0 (CQE 2) reload lower half of the block. */
> > > +		c2 = vld1q_lane_u64((uint64_t *)(p2 + 48), c2, 0);
> > > +		/* B.0 (CQE 1) reload lower half of the block. */
> > > +		c1 = vld1q_lane_u64((uint64_t *)(p1 + 48), c1, 0);
> > > +		/* B.0 (CQE 0) reload lower half of the block. */
> > > +		c0 = vld1q_lane_u64((uint64_t *)(p0 + 48), c0, 0);
> > >  		/* Prefetch next 4 CQEs. */
> > >  		if (pkts_n - pos >= 2 * MLX5_VPMD_DESCS_PER_LOOP) {
> > >  			unsigned int next = pos + MLX5_VPMD_DESCS_PER_LOOP;
> > > --
> > > 2.25.1

