Hi Thomas, > -----Original Message----- > From: Thomas Monjalon <tho...@monjalon.net> > Sent: Wednesday, May 18, 2022 5:12 AM > To: Ding, Xuan <xuan.d...@intel.com>; Wang, YuanX > <yuanx.w...@intel.com>; Wu, WenxuanX <wenxuanx...@intel.com> > Cc: andrew.rybche...@oktetlabs.ru; Li, Xiaoyun <xiaoyun...@intel.com>; > ferruh.yi...@xilinx.com; Singh, Aman Deep <aman.deep.si...@intel.com>; > dev@dpdk.org; Zhang, Yuying <yuying.zh...@intel.com>; Zhang, Qi Z > <qi.z.zh...@intel.com>; jerinjac...@gmail.com; > step...@networkplumber.org; m...@smartsharesystems.com; > viachesl...@nvidia.com; Yu, Ping <ping...@intel.com> > Subject: Re: [PATCH v5 1/4] lib/ethdev: introduce protocol type based buffer > split > > Hello, > > It seems you didn't try to address my main comment on v4: > " > Before doing anything, the first patch of this series should make the current > status clearer. > Example, this line does not explain what it does: > uint16_t split_hdr_size; /**< hdr buf size (header_split enabled).*/ > And > header_split has been removed in ab3ce1e0c193 ("ethdev: remove old > offload API") > > If RTE_ETH_RX_OFFLOAD_HEADER_SPLIT is not needed, let's add a comment > to start a deprecation. > "
Thank you for the detailed review. First of all, I agree that header split should be deprecated. Since it is unrelated with buffer split, I was planning to send the deprecation notice in 22.07 sometime later and start the deprecation in 22.11. If you think it should be the first step, I will send the deprecation notice first. > > Also the comment from Andrew about removing limitation to 2 packets is not > addressed. Secondly, it is said that the protocol based buffer split will divide the packet into two segments. Because I thought it will only be used in the split between header and payload. In fact, protocol based buffer split can support multi-segment split. That is to say, like length-based buffer split, we define a series of protos, as the split point of protocol based buffer split. And this series of protos, like lengths, indicate the split location. For example, a packet consists of MAC/IPV4/UDP/payload. If we define the buffer split proto with IPV4, and UDP, the packet will be split into three segments: seg0: MAC and IPV4 header, 34 bytes seg1: UDP header, 8 bytes seg2: Payload, the actual payload size What do you think of this design? > > All the part about the protocols capability is missing here. Yes, I missed the protocols capability with RTE_PTYPE* now. I will update the doc with supported protocol capability in v6. > > It is not encouraging. > > 26/04/2022 13:13, wenxuanx...@intel.com: > > From: Wenxuan Wu <wenxuanx...@intel.com> > > > > Protocol based buffer split consists of splitting a received packet > > into two separate regions based on its content. The split happens > > after the packet protocol header and before the packet payload. > > Splitting is usually between the packet protocol header that can be > > posted to a dedicated buffer and the packet payload that can be posted to > a different buffer. > > > > Currently, Rx buffer split supports length and offset based packet split. > > protocol split is based on buffer split, configuring length of buffer > > split is not suitable for NICs that do split based on protocol types. > > Why? Is it impossible to support length split on Intel NIC? Yes, our NIC supports split based on protocol types. And I think there are other vendors too. The existence of tunneling results in the composition of a packet is various. Given a changeable length, it is impossible to tell the driver a fixed protocol type. > > > Because tunneling makes the conversion from length to protocol type > > impossible. > > This is not a HW issue. > I agree on the need but that a different usage than length split. I think the new usage can solve this problem, so that length split and proto split can have the same result. > > > This patch extends the current buffer split to support protocol and > > offset based buffer split. A new proto field is introduced in the > > rte_eth_rxseg_split structure reserved field to specify header > > protocol type. With Rx queue offload > RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT > > enabled and corresponding protocol type configured. PMD will split the > ingress packets into two separate regions. > > Currently, both inner and outer L2/L3/L4 level protocol based buffer > > split can be supported. > > > > For example, let's suppose we configured the Rx queue with the > > following segments: > > seg0 - pool0, off0=2B > > seg1 - pool1, off1=128B > > > > With protocol split type configured with RTE_PTYPE_L4_UDP. The packet > > consists of MAC_IP_UDP_PAYLOAD will be splitted like following: > > seg0 - udp header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0 > > seg1 - payload @ 128 in mbuf from pool1 > > Not clear what is the calculation. The previous usage of protocol based split is the split between header and payload. Here for a packet composed of MAC_IP_UDP_PAYLOAD, with protocol split type RTE_PTYPE_L4_UDP configured, it means split between the UDP header and payload. In length configuration, the proto = RTE_PTYPE_L4_UDP means length = 42B. > > > The memory attributes for the split parts may differ either - for > > example the mempool0 and mempool1 belong to dpdk memory and > external > > memory, respectively. > > > > Signed-off-by: Xuan Ding <xuan.d...@intel.com> > > Signed-off-by: Yuan Wang <yuanx.w...@intel.com> > > Signed-off-by: Wenxuan Wu <wenxuanx...@intel.com> > > Reviewed-by: Qi Zhang <qi.z.zh...@intel.com> > > --- > > lib/ethdev/rte_ethdev.c | 36 +++++++++++++++++++++++++++++------- > > lib/ethdev/rte_ethdev.h | 15 ++++++++++++++- > > 2 files changed, 43 insertions(+), 8 deletions(-) > > > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index > > 29a3d80466..1a2bc172ab 100644 > > --- a/lib/ethdev/rte_ethdev.c > > +++ b/lib/ethdev/rte_ethdev.c > > @@ -1661,6 +1661,7 @@ rte_eth_rx_queue_check_split(const struct > rte_eth_rxseg_split *rx_seg, > > struct rte_mempool *mpl = rx_seg[seg_idx].mp; > > uint32_t length = rx_seg[seg_idx].length; > > uint32_t offset = rx_seg[seg_idx].offset; > > + uint32_t proto = rx_seg[seg_idx].proto; > > > > if (mpl == NULL) { > > RTE_ETHDEV_LOG(ERR, "null mempool pointer\n"); > @@ -1694,13 > > +1695,34 @@ rte_eth_rx_queue_check_split(const struct > rte_eth_rxseg_split *rx_seg, > > } > > offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM; > > *mbp_buf_size = rte_pktmbuf_data_room_size(mpl); > > - length = length != 0 ? length : *mbp_buf_size; > > - if (*mbp_buf_size < length + offset) { > > - RTE_ETHDEV_LOG(ERR, > > - "%s mbuf_data_room_size %u < %u > (segment length=%u + segment offset=%u)\n", > > - mpl->name, *mbp_buf_size, > > - length + offset, length, offset); > > - return -EINVAL; > > + if (proto == 0) { > > Add a comment here, /* split at fixed length */ Thanks for the suggestion, will add it in next version. > > > + length = length != 0 ? length : *mbp_buf_size; > > + if (*mbp_buf_size < length + offset) { > > + RTE_ETHDEV_LOG(ERR, > > + "%s mbuf_data_room_size %u < %u > (segment length=%u + segment offset=%u)\n", > > + mpl->name, *mbp_buf_size, > > + length + offset, length, offset); > > + return -EINVAL; > > + } > > + } else { > > Add a comment here, /* split after specified protocol header */ Thanks for the suggestion, will add it in next version. > > > + /* Ensure n_seg is 2 in protocol based buffer split. */ > > + if (n_seg != 2) { > > (should be a space, not a tab before brace) Get it. > > Why do you limit the feature to 2 segments only? Please see the new usage explained above. > > > + RTE_ETHDEV_LOG(ERR, "number of buffer > split protocol segments should be 2.\n"); > > + return -EINVAL; > > + } > > + /* Length and protocol are exclusive here, so make > sure length is 0 in protocol > > + based buffer split. */ > > + if (length != 0) { > > + RTE_ETHDEV_LOG(ERR, "segment length > should be set to zero in buffer split\n"); > > + return -EINVAL; > > + } > > + if (*mbp_buf_size < offset) { > > + RTE_ETHDEV_LOG(ERR, > > + "%s > mbuf_data_room_size %u < %u segment offset)\n", > > + mpl->name, *mbp_buf_size, > > + offset); > > + return -EINVAL; > [...] > > + * - The proto in the elements defines the split position of received > > packets. > > + * > > * - If the length in the segment description element is zero > > * the actual buffer size will be deduced from the appropriate > > * memory pool properties. > > @@ -1197,12 +1200,21 @@ struct rte_eth_txmode { > > * - pool from the last valid element > > * - the buffer size from this pool > > * - zero offset > > + * > > + * - Length based buffer split: > > + * - mp, length, offset should be configured. > > + * - The proto should not be configured in length split. Zero default. > > + * > > + * - Protocol based buffer split: > > + * - mp, offset, proto should be configured. > > + * - The length should not be configured in protocol split. Zero > > default. > > What means "Zero default"? > You should just ignore the non relevant field. Yes, you are right, the none relevant field should be just ignored. I will update the doc in v6. > > > struct rte_eth_rxseg_split { > > struct rte_mempool *mp; /**< Memory pool to allocate segment > from. */ > > uint16_t length; /**< Segment data length, configures split point. */ > > uint16_t offset; /**< Data offset from beginning of mbuf data buffer. > */ > > - uint32_t reserved; /**< Reserved field. */ > > How do you manage ABI compatibility? > Was the reserved field initialized to 0 in previous versions? I think we reached an agreement in RFC v1. There is no document for the reserved field in the previous release. And it is always initialized to zero in real cases. And now splitting based on fixed length and protocol header parsing is exclusive, we can ignore the none relevant field. > > > + uint32_t proto; /**< Protocol of buffer split, determines protocol > > +split point. */ > > What are the values for "proto"? Yes, I missed the protocol capability here, will fix it in v6. > > > @@ -1664,6 +1676,7 @@ struct rte_eth_conf { > > RTE_ETH_RX_OFFLOAD_QINQ_STRIP) #define > DEV_RX_OFFLOAD_VLAN > > RTE_DEPRECATED(DEV_RX_OFFLOAD_VLAN) RTE_ETH_RX_OFFLOAD_VLAN > > > > + > > It looks to be an useless change. Thanks for the catch, will fix it in next version. Thanks again for your time. Regards, Xuan > > > /* > > * If new Rx offload capabilities are defined, they also must be > > * mentioned in rte_rx_offload_names in rte_ethdev.c file. > > > > > >