On Wed, Jun 03, 2020 at 04:14:13PM +0000, Slava Ovsiienko wrote: > > -----Original Message----- > > From: dev <dev-boun...@dpdk.org> On Behalf Of Nithin Dabilpuram > > Sent: Wednesday, June 3, 2020 15:52 > > To: Olivier Matz <olivier.m...@6wind.com> > > Cc: Ananyev, Konstantin <konstantin.anan...@intel.com>; Jerin Jacob > > <jerinjac...@gmail.com>; Dumitrescu, Cristian > > <cristian.dumitre...@intel.com>; Nithin Dabilpuram > > <nithind1...@gmail.com>; Thomas Monjalon <tho...@monjalon.net>; > > Yigit, Ferruh <ferruh.yi...@intel.com>; Andrew Rybchenko > > <arybche...@solarflare.com>; Ori Kam <or...@mellanox.com>; Burakov, > > Anatoly <anatoly.bura...@intel.com>; Mcnamara, John > > <john.mcnam...@intel.com>; Kovacevic, Marko > > <marko.kovace...@intel.com>; dpdk-dev <dev@dpdk.org>; Jerin Jacob > > <jer...@marvell.com>; Krzysztof Kanas <kka...@marvell.com>; > > techbo...@dpdk.org > > Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for > > packet marking > > > > On Wed, Jun 03, 2020 at 01:38:22PM +0200, Olivier Matz wrote: > > > Hi Nithin, > > > > > > On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote: > > > > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote: > > > > > Hi, > > > > > > > > > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote: > > > > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin > > wrote: > > > > > > > Hi Jerin, > > > > > > > > > > > > > > > > > > I also share Olivier's concern about consuming 3 bits in > > ol_flags for that feature. > > > > > > > > > > > Can it probably be squeezed somehow? > > > > > > > > > > > Let say we reserve one flag that this information is > > > > > > > > > > > present or not, and re-use one of rx-only fields for store > > additional information (packet_type, or so). > > > > > > > > > > > Or might be some other approach. > > > > > > > > > > > > > > > > > > > > We are fine with this approach where we define one bit > > > > > > > > > > in Tx offloads for pkt marking and and 3 bits reused from Rx > > offload flags area. > > > > > > > > > > > > > > > > > > > > For example: > > > > > > > > > > > > > > > > > > > > @@ -186,10 +186,16 @@ extern "C" { > > > > > > > > > > > > > > > > > > > > /* add new RX flags here, don't forget to update > > > > > > > > > > PKT_FIRST_FREE */ > > > > > > > > > > > > > > > > > > > > +/* Reused Rx offload bits for Tx offloads */ > > > > > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI (1ULL << 0) > > > > > > > > > > +#define PKT_X_TX_MARK_IP_DSCP (1ULL << 1) > > > > > > > > > > +#define PKT_X_TX_MARK_IP_ECN (1ULL << 2) > > > > > > > > > > + > > > > > > > > > > #define PKT_FIRST_FREE (1ULL << 23) -#define > > > > > > > > > > PKT_LAST_FREE (1ULL << 40) > > > > > > > > > > +#define PKT_LAST_FREE (1ULL << 39) > > > > > > > > > > > > > > > > > > > > /* add new TX flags here, don't forget to update > > > > > > > > > > PKT_LAST_FREE */ > > > > > > > > > > +#define PKT_TX_MARK_EN (1ULL << 40) > > > > > > > > > > > > > > > > > > > > Is this fine ? > > > > > > > > > > > > > > > > > > Any thoughts on this approach which uses only 1 bit in Tx > > > > > > > > > flags out of 18 and reuse unused Rx flag bits ? > > > > > > > > > > > > > > My thought was not about re-defining the flags (I think it is > > > > > > > better to keep them intact), but adding a union for one of rx-only > > fields (packet_type/rss/timestamp). > > > > > > > > > > > > Ok. Adding a union field at packet_type field is also fine like > > > > > > below. > > > > > > > > > > > > @@ -187,9 +187,10 @@ extern "C" { > > > > > > /* add new RX flags here, don't forget to update PKT_FIRST_FREE > > > > > > */ > > > > > > > > > > > > #define PKT_FIRST_FREE (1ULL << 23) -#define PKT_LAST_FREE > > > > > > (1ULL << 40) > > > > > > +#define PKT_LAST_FREE (1ULL << 39) > > > > > > > > > > > > /* add new TX flags here, don't forget to update PKT_LAST_FREE > > > > > > */ > > > > > > +#define PKT_TX_MARK_EN (1ULL << 40) > > > > > > > > > > > > /** > > > > > > * Outer UDP checksum offload flag. This flag is used for > > > > > > enabling @@ -461,6 +462,14 @@ enum { #endif }; > > > > > > > > > > > > +/* Tx packet marking flags in rte_mbuf::tx_mark. > > > > > > + * Valid only when PKT_TX_MARK_EN is set in > > > > > > + * rte_mbuf::ol_flags. > > > > > > + */ > > > > > > +#define TX_MARK_VLAN_DEI (1ULL << 0) > > > > > > +#define TX_MARK_IP_DSCP (1ULL << 1) > > > > > > +#define TX_MARK_IP_ECN (1ULL << 2) > > > > > > + > > > > > > /** > > > > > > * The generic rte_mbuf, containing a packet mbuf. > > > > > > */ > > > > > > @@ -543,6 +552,10 @@ struct rte_mbuf { > > > > > > }; > > > > > > uint32_t inner_l4_type:4; /**< Inner L4 type. */ > > > > > > }; > > > > > > + struct { > > > > > > + uint32_t reserved:29; > > > > > > + uint32_t tx_mark:3; > > > > > > + }; > > > > > > }; > > > > > > > > > > > > > > > > > > > > > > > > Please correct me if this is not what you mean. > > > > > > > > > > I'm not a big fan of reusing Rx fields or flags for Tx. > > > > > It's not obvious for an application than adding a tx_mark will > > > > > overwrite the packet_type. I understand that the risk is limited > > > > > because packet_type is Rx and the marks are Tx, but there is still > > > > > one. > > > > > > > > I'm also not a big fan but just wanted to take this approach so > > > > that, it can both conserve space and also help fast path. > > > > Reusing Rx area is however not a new thing as is already followed > > > > for > > > > mbuf->txadapter field. > > > > > > Yes, and in my opinion this is something we should avoid when > > > possible, because it makes some features exclusive (ex: the big union > > > with sched/rss/adapter/usr/...). > > > > > > > Apart from documentation issue, Is there any other issue or future > > > > ramification with using Rx field's for Tx ? > > > > > > No, I don't see any other issue except the ones we already mentioned (doc, > > code clarity, ). > > > > > > > If it is only about documentation, then we can add more documentation > > to make things clear. > > > > > > > > > > To summarize the different proposed approaches (please correct me if > > I'm wrong): > > > > > > > > > > a- add 3 Tx mbuf flags > > > > > (-) consumes limited resource > > > > > > > > > > b- add 3 dynamic flags > > > > > (-) slower > > > > > > > > - Tx burst Vector implementation can't be done for this tx offload as > > > > offset keeps changing. > > > > > > A vector implementation can be done. But yes, it would be slower than > > > with a static flag. > > > > Very slow atleast in our HW as, we try to translate ol_flags to HW > > descriptor > > flags in addition to extra operations to be done like offset calculations > > etc. > > The dynamic flag offset is not subject to be changed after registration. > So, flag offset can be converted once into appropriate mask and stored > locally by PMD for further using in vector instructions. > The only difference - loaded variable mask instead of constant one, should > not affect performance too much. > With the cmpeq/shifts the and results can be converted to any desired > predefined mask (like static one). > > > > > So if we have fixed offsets, then it is easy to have static constant > > BTW, how this constant is loaded? > I suppose on x86 it would be rather mm_load?_si128(*constant array) - no > difference > between dynamic and static masks at all. BTW, there is the hint - to make > local copy of the > mask/offset in order to avoid cache-line concurrency in global variable > storage. > > > 128/256 bit words with offsets and use things like shuffle/table lookup to > > reorganize multiple mbuf flags to descriptor fields in a single instruction. > > Offsets in the HW descriptor remain the fixed ones, so shuffle would still > work OK. > Do you already have some vectorized implementation? It would be curious to > have a look at.
Agree that the constants need to be loaded from memory, but there is also a logic created onto transforming that data to descriptor which is not always straight forward. For example, see otx2_tx.c, // mbuf has tx offload field L2_LEN of 7 bits, L3_LEN of 9 bits which are not on byte boundary. // Below are the transformations done to get them to byte boundary. /* Get tx_offload for ol2, ol3, l2, l3 lengths from mbuf*/ /* * Operation result: * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7) * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7) */ asm volatile ("LD1 {%[a].D}[0],[%[in]]\n\t" : [a]"+w"(senddesc01_w1) : [in]"r"(mbuf0 + 2) : "memory"); /* * Operation Result: * E(47):L3_LEN(9):L2_LEN(7+z) * E(47):L3_LEN(9):L2_LEN(7+z) */ senddesc01_w1 = vshlq_n_u64(senddesc01_w1, 1); senddesc23_w1 = vshlq_n_u64(senddesc23_w1, 1); /* * Result: * E(48):L3_LEN(8):L2_LEN(z+7) * E(48):L3_LEN(8):L2_LEN(z+7) */ const int8x16_t tshft3 = { -1, 0, 8, 8, 8, 8, 8, 8, -1, 0, 8, 8, 8, 8, 8, 8, }; senddesc01_w1 = vshlq_u8(senddesc01_w1, tshft3); senddesc23_w1 = vshlq_u8(senddesc23_w1, tshft3); We cannot get all the above logic done runtime for every position of mbuf->l2_len if the field position keeps changing for every application run. > > I strongly share the concern about defining the static mbuf flags, > we should consider all ways to avoid doing this. > > WBR, Slava > > > > > > > > > > > > > > > c- add 1 Tx flag and union with Rx field > > > > > (-) exclusive with Rx field > > > > > (-) still consumes one flag > > > > > > > > > > My preference is still b-, for these reasons: > > > > > > > > > > - There are many different DPDK use cases, and resources in mbuf is > > tight. > > > > > Recent contributions (rte_flow and ice driver) already made use of > > dynamic > > > > > fields/flags. > > > > - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate > > > > for dynamic flags. > > > > > > I'm not sure to get why it is a better candidate than packet marking. > > > You mean because it requires more room in mbuf? > > > > Yes, I feel space consumption is one way to decide whether it should be a > > dynfield or static field. > > > > IMO, other parameter to judge could be whether the field definition/usage > > itself is well know standard and is a part of RTE spec or its definition is > > vendor specific. > > > > > > > > > - ICE PMD's dynamic field is however a vendor specific field and > > > > only for ICE PMD users. > > > > > > Yes, but ICE PMD users may be as important as packet marking users. > > > > Agree, I only meant that the flag ICE PMD registered cannot be used for > > other PMD's so by using dynamic field, we are avoiding wastage of a static > > field that is needed only by one specific PMD irrespective of whether that > > PMD is probed or not. > > > > > > > > > In this case, it is just 1 bit out of 18 free bits available in > > > > ol_flags. > > > > > > > > > > > > > > - When I implemented the dynamic fields/flags feature, I did a test > > which > > > > > showed that the cost of having a dynamic offset was few cycles (on > > my test > > > > > platform, it was~3 cycles for reading a field and ~2 cycles for > > > > > writing a > > > > > field). > > > > > > > > I think this cost is of the case where the address where the > > > > dyn_offset is stored is already in cache as it needs to be read first. > > > > > > This fetch of the value (in case it is not in cache) can be done once > > > per bulk, so I'm not sure the impact would be high. > > > > Agreed, for bulk case offset loading should have less impact. > > > > > > > > > > > Regards, > > > Olivier > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > Olivier > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > + Techboard > > > > > > > > > > > > > > > > There is a related thread going on > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__eur03.safelinks.protection.outlook.com_-3Furl-3Dhttps-253A&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=FZ_tPCbgFOh18zwRPO9H0yDx8VW38vuapifdDfc8SFQ&m=5IzeBXssneRLE7EeAyOUnytITHfcmhQu-eRLr18e5U0&s=aCn4On5jrJAfrGkQm_ftPodlBQo3QiozQM76MU9S8j8&e= > > > > > > > > > > > > > > > > %2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp- > > 3A__ma > > > > > > > > ils.dpdk.org_archives_dev_2020- > > 2DMay_168810.html%26d%3DDwIGa > > > > > > > > > > Q%26c%3DnKjWec2b6R0mOyPaz7xtfQ%26r%3DFZ_tPCbgFOh18zwRPO9H0y > > D > > > > > > > > > > x8VW38vuapifdDfc8SFQ%26m%3DnyV4Rud03HW6DbWMpyvOCulQNkagmf > > o0w > > > > > > > > > > KtrwQ7zmmg%26s%3DVuktoUb_xoLsHKdB9mV87x67cP9tXk3DqVXptt9nF_s > > > > > > > > > > %26e%3D&data=02%7C01%7Cviacheslavo%40mellanox.com%7C5e7a > > > > > > > > > > 9c93fd684e09267108d807bd0160%7Ca652971c7d2e4d9ba6a4d149256f4 > > > > > > > > > > 61b%7C0%7C0%7C637267855650797843&sdata=r%2B0JIDapZocl6DD > > > > > > > > A8wbNd4LV0CX6zEdDoQJhBpMoELw%3D&reserved=0 > > > > > > > > > > > > > > > > If there is no consensus on email, then I would like to add > > > > > > > > this item to the next TB meeting. > > > > > > > > > > > > > > Ok, I'll add that to tomorrow meeting agenda. > > > > > > > Konstantin > > > > > > > > > > > > > > > > >