Hi Ruifeng,

The patch looks reasonable, thank you.
Just curious: did you actually observe an issue caused by re-ordering in this
code fragment?
Also, please let us run a performance check.

With best regards,
Slava

> -----Original Message-----
> From: Ruifeng Wang <[email protected]>
> Sent: Thursday, February 10, 2022 8:25
> To: Matan Azrad <[email protected]>; Slava Ovsiienko
> <[email protected]>
> Cc: [email protected]; Honnappa Nagarahalli
> <[email protected]>; [email protected]; nd <[email protected]>;
> Ruifeng Wang <[email protected]>; nd <[email protected]>
> Subject: RE: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON vector
> path
> 
> Ping.
> Please could you help to review this patch?
> 
> Thanks.
> Ruifeng
> 
> > -----Original Message-----
> > From: Ruifeng Wang <[email protected]>
> > Sent: Tuesday, January 4, 2022 11:01 AM
> > To: [email protected]; [email protected]
> > Cc: [email protected]; Honnappa Nagarahalli
> <[email protected]>;
> > [email protected]; nd <[email protected]>; Ruifeng Wang
> <[email protected]>
> > Subject: [PATCH] net/mlx5: fix risk in Rx descriptor read in NEON
> > vector path
> >
> > In the NEON vector PMD, a vector load reads two contiguous 8B words of
> > descriptor data into a vector register. Since a vector load does not
> > guarantee 16B atomicity, the read of the word that includes the op_own
> > field could be reordered after the reads of the other words. In that
> > case, some words could contain invalid data.
> >
> > Reload qword0 after the read barrier to update the vector register. This
> > ensures that the fetched data is correct.
> >
> > Testpmd single core test on N1SDP/ThunderX2 showed no performance
> > drop.
> >
> > Fixes: 1742c2d9fab0 ("net/mlx5: fix synchronization on polling Rx
> > completions")
> > Cc: [email protected]
> >
> > Signed-off-by: Ruifeng Wang <[email protected]>
> > ---
> >  drivers/net/mlx5/mlx5_rxtx_vec_neon.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > index b1d16baa61..b1ec615b51 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> > @@ -647,6 +647,14 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq,
> > volatile struct mlx5_cqe *cq,
> >             c0 = vld1q_u64((uint64_t *)(p0 + 48));
> >             /* Synchronize for loading the rest of blocks. */
> >             rte_io_rmb();
> > +           /* B.0 (CQE 3) reload lower half of the block. */
> > +           c3 = vld1q_lane_u64((uint64_t *)(p3 + 48), c3, 0);
> > +           /* B.0 (CQE 2) reload lower half of the block. */
> > +           c2 = vld1q_lane_u64((uint64_t *)(p2 + 48), c2, 0);
> > +           /* B.0 (CQE 1) reload lower half of the block. */
> > +           c1 = vld1q_lane_u64((uint64_t *)(p1 + 48), c1, 0);
> > +           /* B.0 (CQE 0) reload lower half of the block. */
> > +           c0 = vld1q_lane_u64((uint64_t *)(p0 + 48), c0, 0);
> >             /* Prefetch next 4 CQEs. */
> >             if (pkts_n - pos >= 2 * MLX5_VPMD_DESCS_PER_LOOP) {
> >                     unsigned int next = pos +
> > MLX5_VPMD_DESCS_PER_LOOP;
> > --
> > 2.25.1
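
For readers following the thread, the reload-after-barrier idea from the commit message can be sketched in portable C11. This is only an illustration under stated assumptions, not the driver code: `struct desc_tail`, `struct tail_snapshot` and `read_tail` are hypothetical names, the two 64-bit fields stand in for the two halves of the 16B block that the patch loads from the CQE, and `atomic_thread_fence(memory_order_acquire)` stands in for `rte_io_rmb()`.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the last 16 bytes of a completion descriptor. */
struct desc_tail {
	_Atomic uint64_t lo; /* lower 8B word (qword0 in the patch's terms) */
	_Atomic uint64_t hi; /* upper 8B word, assumed to hold the ownership byte */
};

/* Plain copy of both words as seen by the reader (hypothetical type). */
struct tail_snapshot {
	uint64_t lo;
	uint64_t hi;
};

static inline struct tail_snapshot
read_tail(struct desc_tail *d)
{
	struct tail_snapshot s;

	/*
	 * First pass: models the non-atomic 16B load. The two halves may
	 * be observed in either order, so 'lo' can be older than 'hi'.
	 */
	s.lo = atomic_load_explicit(&d->lo, memory_order_relaxed);
	s.hi = atomic_load_explicit(&d->hi, memory_order_relaxed);

	/* Read barrier: stand-in for rte_io_rmb() (an assumption here). */
	atomic_thread_fence(memory_order_acquire);

	/*
	 * Second pass: reload only the lower half. This read is ordered
	 * after the read of 'hi', so it cannot be older than the word
	 * that carries the ownership information.
	 */
	s.lo = atomic_load_explicit(&d->lo, memory_order_relaxed);

	return s;
}
```

The reload costs one extra 8B load per descriptor, which matches the patch's approach of refreshing lane 0 of each vector register rather than repeating the full 16B load.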
