> -----Original Message-----
> From: Guo, Jia <jia....@intel.com>
> Sent: Thursday, September 17, 2020 3:59 PM
> To: Yang, Qiming <qiming.y...@intel.com>; Xing, Beilei
> <beilei.x...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; Wu, Jingjing
> <jingjing...@intel.com>; Wang, Haiyue <haiyue.w...@intel.com>
> Cc: Zhao1, Wei <wei.zh...@intel.com>; Richardson, Bruce
> <bruce.richard...@intel.com>; dev@dpdk.org; Guo, Jia <jia....@intel.com>;
> Zhang, Helin <helin.zh...@intel.com>; m...@smartsharesystems.com; Yigit,
> Ferruh <ferruh.yi...@intel.com>; step...@networkplumber.org;
> barbe...@kth.se; Han, YingyaX <yingyax....@intel.com>
> Subject: [PATCH v4 4/5] net/ice: fix vector rx burst for ice
> 
> The limitation of burst size in vector rx was removed, since it should 
> retrieve as
> much received packets as possible. And also the scattered receive path should
> use a wrapper function to achieve the goal of burst maximizing. And do some
> code cleaning for vector rx path.
> 
> Bugzilla ID: 516
> Fixes: c68a52b8b38c ("net/ice: support vector SSE in Rx")
> Fixes: ae60d3c9b227 ("net/ice: support Rx AVX2 vector")
> 
> Signed-off-by: Jeff Guo <jia....@intel.com>
> Tested-by: Yingya Han <yingyax....@intel.com>
> ---
>  drivers/net/ice/ice_rxtx.h          |  1 +
>  drivers/net/ice/ice_rxtx_vec_avx2.c | 23 ++++++------
> drivers/net/ice/ice_rxtx_vec_sse.c  | 56 +++++++++++++++++++----------
>  3 files changed, 49 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_rxtx.h b/drivers/net/ice/ice_rxtx.h index
> 2fdcfb7d0..3ef5f300d 100644
> --- a/drivers/net/ice/ice_rxtx.h
> +++ b/drivers/net/ice/ice_rxtx.h
> @@ -35,6 +35,7 @@
>  #define ICE_MAX_RX_BURST            ICE_RXQ_REARM_THRESH
>  #define ICE_TX_MAX_FREE_BUF_SZ      64
>  #define ICE_DESCS_PER_LOOP          4
> +#define ICE_DESCS_PER_LOOP_AVX           8

No need to expose this if no external link, better to keep all avx stuff inside 
avx.c

> 
>  #define ICE_FDIR_PKT_LEN     512
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c
> b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index be50677c2..843e4f32a 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -29,7 +29,7 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)
>                       __m128i dma_addr0;
> 
>                       dma_addr0 = _mm_setzero_si128();
> -                     for (i = 0; i < ICE_DESCS_PER_LOOP; i++) {
> +                     for (i = 0; i < ICE_DESCS_PER_LOOP_AVX; i++) {
>                               rxep[i].mbuf = &rxq->fake_mbuf;
>                               _mm_store_si128((__m128i *)&rxdp[i].read,
>                                               dma_addr0);
> @@ -132,12 +132,17 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)
>       ICE_PCI_REG_WRITE(rxq->qrx_tail, rx_id);  }
> 
> +/**
> + * vPMD raw receive routine, only accept(nb_pkts >=
> +ICE_DESCS_PER_LOOP_AVX)
> + *
> + * Notice:
> + * - nb_pkts < ICE_DESCS_PER_LOOP_AVX, just return no packet
> + * - floor align nb_pkts to a ICE_DESCS_PER_LOOP_AVX power-of-two  */

The comment is misleading, it looks like we are going to floor align nb_pkts to 
2^8, better to reword .

>  static inline uint16_t
>  _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf
> **rx_pkts,
>                           uint16_t nb_pkts, uint8_t *split_packet)  { -#define
> ICE_DESCS_PER_LOOP_AVX 8
> -
>       const uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
>       const __m256i mbuf_init = _mm256_set_epi64x(0, 0,
>                       0, rxq->mbuf_initializer);
> @@ -603,10 +608,6 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue
> *rxq, struct rte_mbuf **rx_pkts,
>       return received;
>  }
> 
> -/*
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
> - */
>  uint16_t
>  ice_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
>                      uint16_t nb_pkts)
> @@ -616,8 +617,6 @@ ice_recv_pkts_vec_avx2(void *rx_queue, struct
> rte_mbuf **rx_pkts,
> 
>  /**
>   * vPMD receive routine that reassembles single burst of 32 scattered
> packets
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
>   */

Why we need to remove this? is it still true for this function?

>  static uint16_t
>  ice_recv_scattered_burst_vec_avx2(void *rx_queue, struct rte_mbuf
> **rx_pkts, @@ -626,6 +625,9 @@ ice_recv_scattered_burst_vec_avx2(void
> *rx_queue, struct rte_mbuf **rx_pkts,
>       struct ice_rx_queue *rxq = rx_queue;
>       uint8_t split_flags[ICE_VPMD_RX_BURST] = {0};
> 
> +     /* split_flags only can support max of ICE_VPMD_RX_BURST */
> +     nb_pkts = RTE_MIN(nb_pkts, ICE_VPMD_RX_BURST);

Is this necessary?  the only consumer of this function is 
ice_recv_scattered_pkts_vec_avx2, 
I think nb_pkts <= ICE_VPMD_RX_BURST it already be guaranteed.
> +
>       /* get some new buffers */
>       uint16_t nb_bufs = _ice_recv_raw_pkts_vec_avx2(rxq, rx_pkts, nb_pkts,
>                                                      split_flags);
> @@ -657,9 +659,6 @@ ice_recv_scattered_burst_vec_avx2(void *rx_queue,
> struct rte_mbuf **rx_pkts,
> 
>  /**
>   * vPMD receive routine that reassembles scattered packets.
> - * Main receive routine that can handle arbitrary burst sizes
> - * Notice:
> - * - nb_pkts < ICE_DESCS_PER_LOOP, just return no packet
>   */

Why we need to remove this? isn't it the main routine that be able to handle 
arbitrary burst size?

Btw, I will suggest all AVX2 changes can be in a separate patch, because this 
looks like some code clean and fix.
its not related with the main purpose of the patch set.

Reply via email to