> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@xilinx.com>
> Sent: Tuesday, September 13, 2022 2:59 PM
> To: Hanumanth Reddy Pothula <hpoth...@marvell.com>; Ding, Xuan <xuan.d...@intel.com>; Thomas Monjalon <tho...@monjalon.net>; Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>
> Cc: dev@dpdk.org; Wu, WenxuanX <wenxuanx...@intel.com>; Li, Xiaoyun <xiaoyun...@intel.com>; step...@networkplumber.org; Wang, YuanX <yuanx.w...@intel.com>; m...@ashroe.eu; Zhang, Yuying <yuying.zh...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; viachesl...@nvidia.com; Jerin Jacob Kollanukkaran <jer...@marvell.com>; Nithin Kumar Dabilpuram <ndabilpu...@marvell.com>
> Subject: Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
>
> On 9/7/2022 10:31 PM, Hanumanth Reddy Pothula wrote:
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yi...@xilinx.com>
> >> Sent: Wednesday, September 7, 2022 4:54 PM
> >> Subject: Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
> >>
> >> On 9/7/2022 8:02 AM, Hanumanth Reddy Pothula wrote:
> >>>
> >>>> -----Original Message-----
> >>>> From: Ferruh Yigit <ferruh.yi...@xilinx.com>
> >>>> Sent: Tuesday, September 6, 2022 5:48 PM
> >>>> Subject: Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
> >>>>
> >>>> On 8/30/2022 1:08 PM, Hanumanth Reddy Pothula wrote:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Ferruh Yigit <ferruh.yi...@xilinx.com>
> >>>>>> Sent: Wednesday, August 24, 2022 9:04 PM
> >>>>>> Subject: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
> >>>>>>
> >>>>>> External Email
> >>>>>>
> >>>>>> ----------------------------------------------------------------------
> >>>>>
> >>>>> Thanks Ding Xuan and Ferruh Yigit for reviewing the changes and for providing your valuable feedback.
> >>>>> Please find responses inline.
> >>>>>
> >>>>>> On 8/23/2022 4:26 AM, Ding, Xuan wrote:
> >>>>>>> Hi Hanumanth,
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Hanumanth Pothula <hpoth...@marvell.com>
> >>>>>>>> Sent: Saturday, August 13, 2022 1:25 AM
> >>>>>>>> Subject: [PATCH v2 1/3] ethdev: introduce pool sort capability
> >>>>>>>>
> >>>>>>>> Presently, the 'Buffer Split' feature supports sending multiple segments of the received packet to the PMD, which programs the HW to receive the packet in segments from different pools.
> >>>>>>>>
> >>>>>>>> This patch extends the feature to support the pool sort capability. Some HW can choose a memory pool based on the packet's size. The pool sort capability allows the PMD to choose a memory pool based on the packet's length.
> >>>>>>>>
> >>>>>>>> This is often useful for saving memory: the application can create separate pools to steer packets of specific sizes, enabling effective use of memory.
> >>>>>>>>
> >>>>>>>> For example, let's say the HW supports three pools:
> >>>>>>>> - pool-1 size is 2K
> >>>>>>>> - pool-2 size is > 2K and < 4K
> >>>>>>>> - pool-3 size is > 4K
> >>>>>>>> Here,
> >>>>>>>> pool-1 can accommodate packets with sizes < 2K
> >>>>>>>> pool-2 can accommodate packets with sizes > 2K and < 4K
> >>>>>>>> pool-3 can accommodate packets with sizes > 4K
> >>>>>>>>
> >>>>>>>> With the pool sort capability enabled in SW, an application may create three pools of different sizes and send them to the PMD, allowing the PMD to program the HW based on packet lengths. Packets smaller than 2K are then received on pool-1, packets with lengths between 2K and 4K on pool-2, and packets larger than 4K on pool-3.
> >>>>>>>>
> >>>>>>>> The following two capabilities are added to the rte_eth_rxseg_capa structure:
> >>>>>>>> 1. pool_sort --> tells whether the pool sort capability is supported by HW.
> >>>>>>>> 2. max_npool --> max number of pools supported by HW.
> >>>>>>>>
> >>>>>>>> Defined a new structure, rte_eth_rxseg_sort, to be used only when the pool sort capability is present. If required, this may be extended further to support more configurations.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Hanumanth Pothula <hpoth...@marvell.com>
> >>>>>>>>
> >>>>>>>> v2:
> >>>>>>>> - Along with spec changes, uploading testpmd and driver changes.
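> >>>>>>>>
> >>>>>>>> For illustration, a minimal application-side sketch of the intended usage. The 'sort' member of union rte_eth_rxseg and its field names here are assumptions based on the description above (the exact definitions are in the patch itself), and the pool sizes and counts are arbitrary:
> >>>>>>>>
> >>>>>>>>   #include <rte_ethdev.h>
> >>>>>>>>   #include <rte_mbuf.h>
> >>>>>>>>
> >>>>>>>>   static int
> >>>>>>>>   setup_sorted_pools(uint16_t port_id)
> >>>>>>>>   {
> >>>>>>>>           struct rte_mempool *p2k, *p4k, *p8k;
> >>>>>>>>           union rte_eth_rxseg rx_seg[3];
> >>>>>>>>           struct rte_eth_rxconf rx_conf = {0};
> >>>>>>>>
> >>>>>>>>           /* One pool per size class of the example above. */
> >>>>>>>>           p2k = rte_pktmbuf_pool_create("p2k", 8192, 256, 0,
> >>>>>>>>                           2048 + RTE_PKTMBUF_HEADROOM, SOCKET_ID_ANY);
> >>>>>>>>           p4k = rte_pktmbuf_pool_create("p4k", 4096, 256, 0,
> >>>>>>>>                           4096 + RTE_PKTMBUF_HEADROOM, SOCKET_ID_ANY);
> >>>>>>>>           p8k = rte_pktmbuf_pool_create("p8k", 2048, 256, 0,
> >>>>>>>>                           8192 + RTE_PKTMBUF_HEADROOM, SOCKET_ID_ANY);
> >>>>>>>>           if (p2k == NULL || p4k == NULL || p8k == NULL)
> >>>>>>>>                   return -1;
> >>>>>>>>
> >>>>>>>>           /* Hypothetical 'sort' member and fields, per the text above. */
> >>>>>>>>           rx_seg[0].sort.mp = p2k; rx_seg[0].sort.length = 2048;
> >>>>>>>>           rx_seg[1].sort.mp = p4k; rx_seg[1].sort.length = 4096;
> >>>>>>>>           rx_seg[2].sort.mp = p8k; rx_seg[2].sort.length = 8192;
> >>>>>>>>
> >>>>>>>>           rx_conf.rx_seg = rx_seg;
> >>>>>>>>           rx_conf.rx_nseg = 3;
> >>>>>>>>
> >>>>>>>>           /* No single mempool argument; the pools come via rx_seg. */
> >>>>>>>>           return rte_eth_rx_queue_setup(port_id, 0, 512, SOCKET_ID_ANY,
> >>>>>>>>                           &rx_conf, NULL);
> >>>>>>>>   }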
> >>>>>>>
> >>>>>>> Thanks for CCing. It's an interesting feature.
> >>>>>>>
> >>>>>>> But I have one question here:
> >>>>>>> Buffer split is for splitting received packets into multiple segments, while pool sort supports the PMD putting received packets into different pools according to packet size.
> >>>>>>> Every packet is still intact.
> >>>>>>>
> >>>>>>> So, at this level, pool sort does not belong to buffer split. And you already use a different function to check pool sort rather than checking buffer split.
> >>>>>>>
> >>>>>>> Should a new RX offload be introduced? Like "RTE_ETH_RX_OFFLOAD_POOL_SORT".
> >>>>>>>
> >>>>> Please find my response below.
> >>>>>>
> >>>>>> Hi Hanumanth,
> >>>>>>
> >>>>>> I had a similar concern with the feature. I assume you want to benefit from the existing config structure that gets multiple mempools as argument, since this feature also needs multiple mempools, but the feature is different.
> >>>>>>
> >>>>>> It looks wrong to me to check the 'OFFLOAD_BUFFER_SPLIT' offload to decide whether to receive into multiple mempools or not, since that has nothing to do with splitting. Also not sure about using the 'sort' keyword.
> >>>>>> What do you think about introducing a new feature, instead of extending the existing split one?
> >>>>>
> >>>>> Actually, we thought both BUFFER_SPLIT and POOL_SORT are similar features, where RX pools are configured in a certain way, and preferred not to use up one more RX offload capability, as the existing software architecture can be extended to support the pool_sort capability.
> >>>>> Yes, as part of pool sort there is no buffer split, but pools are picked based on the buffer length.
> >>>>>
> >>>>> Since you think it's better to use a new RX offload for POOL_SORT, will go ahead and implement the same.
> >>>>>
> >>>>>> This is an optimisation, right? To enable us to use less memory for the packet buffer, does it qualify as a device offload?
> >>>>>
> >>>>> Yes, it qualifies as a device offload and saves memory.
> >>>>> The Marvell NIC has a capability to receive packets on two different pools based on their length.
> >>>>> More on this is explained below.
> >>>>>
> >>>>>> Also, what is the relation with segmented Rx? How does a PMD decide to use segmented Rx or a bigger mempool? How can the application configure this?
> >>>>>>
> >>>>>> Need to clarify the rules: based on your sample, if a 512-byte packet is received, does it have to go to pool-1, or can it go to any of the three pools?
> >>>>>
> >>>>> Here, the Marvell NIC supports two HW pools, an SPB (small packet buffer) pool and an LPB (large packet buffer) pool.
> >>>>> The SPB pool can hold packets up to 4KB; the LPB pool can hold anything larger than 4KB.
> >>>>> Smaller packets are received on the SPB pool and larger packets on the LPB pool, based on the RQ configuration.
> >>>>> Here, in our case, the HW pools hold the whole packet. So if a packet is divided into segments, the lower-layer HW receives all segments of the packet and then places the whole packet in the SPB/LPB pool, based on the packet length.
> >>>>
> >>>> If the packet is bigger than 4KB, you have two options:
> >>>> 1- Use multiple chained buffers in SPB
> >>>> 2- Use a single LPB buffer
> >>>>
> >>>> As I understand, (2) is used in this case, but I think we should clarify how this feature works with the 'RTE_ETH_RX_OFFLOAD_SCATTER' offload, if it is requested by the user.
> >>>>
> >>>> Or let's say HW has two pools with 1K and 2K sizes; what is expected with a 4K packet, with or without the scattered Rx offload?
> >>>>
> >>> As mentioned, Marvell supports two pools, pool-1 (SPB) and pool-2 (LPB). If the packet length is within the pool-1 length and the packet has only one segment, then the packet is allocated from pool-1.
> >>> If the packet length is greater than pool-1 or the packet has more than one segment, then the packet is allocated from pool-2.
> >>>
> >>> So, here, packets with a single segment and length less than 1K are allocated from pool-1, and packets with multiple segments or with length greater than 1K are allocated from pool-2.
> >>>
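> >>> In pseudo-C, the rule above is roughly the following (a conceptual model only, not actual driver or HW code; names and parameters are purely illustrative):
> >>>
> >>>         /* Conceptual model of the HW pool decision described above. */
> >>>         static inline struct rte_mempool *
> >>>         pick_pool(uint32_t pkt_len, uint16_t nb_segs,
> >>>                   struct rte_mempool *pool1, /* small pool, e.g. 1K */
> >>>                   struct rte_mempool *pool2, /* large pool */
> >>>                   uint32_t pool1_len)
> >>>         {
> >>>                 if (nb_segs == 1 && pkt_len <= pool1_len)
> >>>                         return pool1; /* single segment that fits */
> >>>                 return pool2; /* multi-segment or oversized packet */
> >>>         }
> >>>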
> >> To have multiple segments or not is HW configuration; it is not an external variable.
> >> Drivers mostly decide whether to configure HW to receive multiple segments based on the buffer size and the max packet size the device supports.
> >> In this case, since the buffer size is not fixed and there are multiple buffer sizes, how will the driver configure HW?
> >>
> >> This is not specific to Marvell HW; for the case where multiple mempools are supported, it is better to clarify in this patch how it works with the 'RTE_ETH_RX_OFFLOAD_SCATTER' offload.
> >>
> > Here, the application sends multiple pools with different buffer lengths to the PMD, and the PMD further programs the HW depending on its architecture.
> > Similarly, in this case, where multiple HW pools are present, with 'RTE_ETH_RX_OFFLOAD_SCATTER' enabled, the PMD receives packets/segments based on the HW architecture.
> > Depending on the architecture, if any extra programming is required (either to implement some logic or HW programming) in the RX path, it is to be implemented in that NIC PMD.
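> > Conceptually, all of this lives in the control path. An illustrative sketch (not actual cnxk code; the rx_seg_capa fields are the ones proposed in this patch, and the function name is hypothetical):
> >
> >         /* PMD-internal (needs ethdev_driver.h): advertise the capability. */
> >         static int
> >         xxx_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
> >         {
> >                 RTE_SET_USED(dev);
> >                 info->rx_seg_capa.pool_sort = 1; /* HW picks a pool by length */
> >                 info->rx_seg_capa.max_npool = 2; /* e.g. SPB + LPB */
> >                 return 0;
> >         }
> >
> >         /* rx_queue_setup() then maps each (length, mempool) pair supplied
> >          * by the application onto the HW pools; the Rx burst path needs
> >          * no per-packet changes. */
> >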
> My intention is to clarify the relation between features. Like Andrew suggested, having Rx segmentation on any pools, including a mixture of pools, is an option; only please document this in the API documentation so that it is clear for future PMD/app developers.

Sure, will document this in the API documentation.

> > As far as multiple pool support is considered, in the Marvell case, once the HW pools are programmed in the control path, there is nothing to be done on the fast path.
> >
> Out of curiosity, how is this transparent in the fast path? Normally an mbuf will point to a data buffer; if there are multiple pools, how is this not affected?
> Is there any redirection, like the mbuf buffer pointing to some kind of metadata (event etc.)?
>
> > So, I think, it depends on the HW architecture of how multiple pool and multiple segment receive is implemented.
> >
> > Please suggest, if I am missing any generic scenario (use case) here.
> >
> >>>>> As pools are picked based on the packets' length, we used the SORT term. In case you have any better term (word), please suggest.
> >>>>
> >>>> What about multiple pool, like RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL? I think it is clearer, but I would like to get more comments from others; naming is hard ;)
> >>>>
> >>> Yes, RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL is clearer than RTE_ETH_RX_OFFLOAD_SORT_POOL.
> >>> Thanks for the suggestion.
> >>> Will upload V4 with RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL.
> >>>
> >>>>>> And I don't see any change in the 'net/cnxk' Rx burst code. When multiple mempools are used, shouldn't it check which mempool is filled while filling the mbufs? How does this work without an update in the Rx burst code, or am I missing some implementation detail?
> >>>>>
> >>>>> Please find the PMD changes in patch [v2,3/3] net/cnxk: introduce pool sort capability. Here, in the control path, HW pools are programmed based on the inputs received from the application.
> >>>>> Once the HW is programmed, packets are received on the HW pools based on the packet sizes.
> >>>>
> >>>> I was expecting changes in the datapath too, something like a check in the Rx burst function whether spb or lpb is used, updating mbuf pointers accordingly.
> >>>> But it seems HW doesn't work this way; can you please explain how this feature works transparently to the datapath code?
> >>>>
> >>>>> I will upload V3 where POOL_SORT is implemented as a new RX OFFLOAD, unless you have any other suggestion/thoughts.
> >>>>
> >>>> <...>
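
For reference, a rough sketch of the V3/V4 direction agreed above: a dedicated Rx offload instead of reusing BUFFER_SPLIT. The flag name follows the discussion; the bit value and the probing pattern are placeholders, not the final patch:

  /* Hypothetical dedicated offload flag; the value is a placeholder. */
  #define RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL RTE_BIT64(21)

  struct rte_eth_dev_info dev_info;

  rte_eth_dev_info_get(port_id, &dev_info);
  if (dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL) {
          /* Supply multiple (length, mempool) pairs via rx_conf.rx_seg,
           * as in the earlier example, and set this flag in
           * rx_conf.offloads before rte_eth_rx_queue_setup(). */
  }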