Hi, Morten Addressed most of your comments in the v5 commit message. Header file comments are close to become too wordy, and I did not dare to elaborate ones more.
With best regards, Slava > -----Original Message----- > From: Morten Brørup <m...@smartsharesystems.com> > Sent: Wednesday, July 8, 2020 18:27 > To: Slava Ovsiienko <viachesl...@mellanox.com>; dev@dpdk.org > Cc: Matan Azrad <ma...@mellanox.com>; Raslan Darawsheh > <rasl...@mellanox.com>; olivier.m...@6wind.com; > bernard.iremon...@intel.com; tho...@mellanox.net > Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet > Txscheduling > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slava Ovsiienko > > Sent: Wednesday, July 8, 2020 4:54 PM > > > > Hi, Morten > > > > Thank you for the comments. Please, see below. > > > > > -----Original Message----- > > > From: Morten Brørup <m...@smartsharesystems.com> > > > Sent: Wednesday, July 8, 2020 17:16 > > > To: Slava Ovsiienko <viachesl...@mellanox.com>; dev@dpdk.org > > > Cc: Matan Azrad <ma...@mellanox.com>; Raslan Darawsheh > > > <rasl...@mellanox.com>; olivier.m...@6wind.com; > > > bernard.iremon...@intel.com; tho...@mellanox.net > > > Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate > > packet > > > Txscheduling > > > > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Viacheslav > > > > Ovsiienko > > > > Sent: Tuesday, July 7, 2020 4:57 PM > > > > > > > > There is the requirement on some networks for precise traffic > > timing > > > > management. The ability to send (and, generally speaking, receive) > > the > > > > packets at the very precisely specified moment of time provides > > > > the opportunity to support the connections with Time Division > > Multiplexing > > > > using the contemporary general purpose NIC without involving an > > > > auxiliary hardware. For example, the supporting of O-RAN Fronthaul > > > > interface is one of the promising features for potentially usage > > > > of the precise time management for the egress packets. > > > > > > > > The main objective of this RFC is to specify the way how > > applications > > > > can provide the moment of time at what the packet transmission > > > > must > > be > > > > started and to describe in preliminary the supporting this feature > > > > from > > > > mlx5 PMD side. > > > > > > > > The new dynamic timestamp field is proposed, it provides some > > timing > > > > information, the units and time references (initial phase) are not > > > > explicitly defined but are maintained always the same for a given > > port. > > > > Some devices allow to query rte_eth_read_clock() that will return > > the > > > > current device timestamp. The dynamic timestamp flag tells whether > > the > > > > field contains actual timestamp value. For the packets being sent > > this > > > > value can be used by PMD to schedule packet sending. > > > > > > > > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation > > and > > > > obsoleting, these dynamic flag and field will be used to manage > > > > the timestamps on receiving datapath as well. > > > > > > > > When PMD sees the "rte_dynfield_timestamp" set on the packet being > > > > sent it tries to synchronize the time of packet appearing on the > > wire > > > > with the specified packet timestamp. If the specified one is in > > > > the past it should be ignored, if one is in the distant future it > > should > > > > be capped with some reasonable value (in range of seconds). These > > > > specific cases ("too late" and "distant future") can be optionally > > > > reported via device xstats to assist applications to detect the > > > > time-related problems. > > > > > > > > There is no any packet reordering according timestamps is > > > > supposed, neither within packet burst, nor between packets, it is > > > > an entirely application responsibility to generate packets and its > > > > timestamps > > in > > > > desired order. The timestamps can be put only in the first packet > > in > > > > the burst providing the entire burst scheduling. > > > > > > > > PMD reports the ability to synchronize packet sending on timestamp > > > > with new offload flag: > > > > > > > > This is palliative and is going to be replaced with new eth_dev > > > > API about reporting/managing the supported dynamic flags and its > > related > > > > features. This API would break ABI compatibility and can't be > > > > introduced at the moment, so is postponed to 20.11. > > > > > > > > For testing purposes it is proposed to update testpmd "txonly" > > > > forwarding mode routine. With this update testpmd application > > > > generates the packets and sets the dynamic timestamps according to > > > > specified time pattern if it sees the "rte_dynfield_timestamp" is > > registered. > > > > > > > > The new testpmd command is proposed to configure sending pattern: > > > > > > > > set tx_times <burst_gap>,<intra_gap> > > > > > > > > <intra_gap> - the delay between the packets within the burst > > > > specified in the device clock units. The number > > > > of packets in the burst is defined by txburst > > parameter > > > > > > > > <burst_gap> - the delay between the bursts in the device clock > > units > > > > > > > > As the result the bursts of packet will be transmitted with > > specific > > > > delays between the packets within the burst and specific delay > > between > > > > the bursts. The rte_eth_get_clock is supposed to be engaged to get > > the > > > > current device clock value and provide the reference for the > > > > timestamps. > > > > > > > > Signed-off-by: Viacheslav Ovsiienko <viachesl...@mellanox.com> > > > > --- > > > > v1->v4: > > > > - dedicated dynamic Tx timestamp flag instead of shared with > > > > Rx > > > > > > The detailed description above should be updated to reflect that it > > is now > > > two flags. > > OK > > > > > > > > > - Doxygen-style comment > > > > - comments update > > > > > > > > --- > > > > lib/librte_ethdev/rte_ethdev.c | 1 + > > lib/librte_ethdev/rte_ethdev.h > > > > | 4 ++++ lib/librte_mbuf/rte_mbuf_dyn.h | 31 > > > > +++++++++++++++++++++++++++++++ > > > > 3 files changed, 36 insertions(+) > > > > > > > > diff --git a/lib/librte_ethdev/rte_ethdev.c > > > > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644 > > > > --- a/lib/librte_ethdev/rte_ethdev.c > > > > +++ b/lib/librte_ethdev/rte_ethdev.c > > > > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off { > > > > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO), > > > > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO), > > > > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM), > > > > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP), > > > > }; > > > > > > > > #undef RTE_TX_OFFLOAD_BIT2STR > > > > diff --git a/lib/librte_ethdev/rte_ethdev.h > > > > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644 > > > > --- a/lib/librte_ethdev/rte_ethdev.h > > > > +++ b/lib/librte_ethdev/rte_ethdev.h > > > > @@ -1178,6 +1178,10 @@ struct rte_eth_conf { > > > > /** Device supports outer UDP checksum */ #define > > > > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000 > > > > > > > > +/** Device supports send on timestamp */ #define > > > > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000 > > > > + > > > > + > > > > #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP > 0x00000001 > > > /**< > > > > Device supports Rx queue setup after device started*/ #define > > > > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff -- > git > > > > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h > > > > index 96c3631..7e9f7d2 100644 > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h > > > > @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char > *name, > > > > #define RTE_MBUF_DYNFIELD_METADATA_NAME > > > "rte_flow_dynfield_metadata" > > > > #define RTE_MBUF_DYNFLAG_METADATA_NAME > > > "rte_flow_dynflag_metadata" > > > > > > > > +/** > > > > + * The timestamp dynamic field provides some timing information, > > the > > > > + * units and time references (initial phase) are not explicitly > > > > defined > > > > + * but are maintained always the same for a given port. Some > > devices > > > > allow4 > > > > + * to query rte_eth_read_clock() that will return the current > > device > > > > + * timestamp. The dynamic Tx timestamp flag tells whether the > > field > > > > contains > > > > + * actual timestamp value. For the packets being sent this value > > can > > > > be > > > > + * used by PMD to schedule packet sending. > > > > + * > > > > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field > > deprecation > > > > + * and obsoleting, the dedicated Rx timestamp flag is supposed to > > be > > > > + * introduced and the shared dynamic timestamp field will be used > > > > + * to handle the timestamps on receiving datapath as well. > > > > + */ > > > > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME > > > "rte_dynfield_timestamp" > > > > > > The description above should not say anything about the dynamic TX > > > timestamp flag. > > It does not. Or do you mean RX? > > Not sure, field and flag are tightly coupled, it is nice to mention > > this relation for better understanding. > > And mentioning the RX explains why it is not like this: > > RTE_MBUF_DYNFIELD_[TX]_TIMESTAMP_NAME > > Sorry. I misunderstood its purpose! > It's the name of the field, and the field will not only be used for RX, but > in the > future also for RX. > (I thought it was the name of the RX flag, reserved for future use.) > > > > > > > > > Please elaborate "some timing information", e.g. add "... about when > > the > > > packet was received". > > > > Sorry, I do not follow, currently the dynamic field is not "about > > when the packet was received". Now it is introduced for Tx only and > > just the opportunity to be shared with Rx one in coming releases is > > mentioned. "Some" means - not specified (herein) exactly. > > And it is elaborated what Is not specified and how it is supposed to > > use Tx timestamp. > > It should be described when it is valid, and how it is being used, e.g. by > adding a reference to the "rte_dynflag_tx_timestamp" flag. > > > > > > > > + > > > > +/** > > > > + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME > flag > > > set on > > > > the > > > > + * packet being sent it tries to synchronize the time of packet > > > > appearing > > > > + * on the wire with the specified packet timestamp. If the > > specified > > > > one > > > > + * is in the past it should be ignored, if one is in the distant > > > > future > > > > + * it should be capped with some reasonable value (in range of > > > > seconds). > > > > + * > > > > + * There is no any packet reordering according to timestamps is > > > > supposed, > > > > + * neither for packet within the burst, nor for the whole bursts, > > it > > > > is > > > > + * an entirely application responsibility to generate packets and > > its > > > > + * timestamps in desired order. The timestamps might be put only > > in > > > > + * the first packet in the burst providing the entire burst > > > > scheduling. > > > > + */ > > > > +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME > > > "rte_dynflag_tx_timestamp" > > > > + > > > > #endif > > > > -- > > > > 1.8.3.1 > > > > > > > > > > It may be worth adding some documentation about how the clocks of > > > the NICs are out of sync with the clock of the CPU, and are all > > > drifting > > relatively. > > > > > > And those clocks are also out of sync with the actual time (NTP > > clock). > > > > IMO, It is out of scope of this very generic patch. As for mlx NICs - > > the internal device clock might be (or might be not) synchronized with > > PTP, it can provide timestamps in real nanoseconds in various formats > > or just some free running counter. > > Cool! > > > On some systems the NIC and CPU might share the same clock source (for > > their PLL inputs for example) and there will be no any drifts. As we > > can see - it is a wide and interesting opic to discuss, but, IMO, the > > comment in header file might be not the most relevant place to do. As > > for mlx5 devices clock specifics - it will be documented in PMD > > chapter. > > > > OK, will add few generic words, the few ones - in order not to make > > comment wordy, just point the direction for further thinking. > > I agree - we don't want cookbooks in the header files. Only enough > description to avoid the worst misunderstandings. > > > > > > > > > Preferably, some sort of cookbook for handling this should be > > provided. > > > PCAP could be used as an example. > > > > > testpmd example is included in series, mlx5 PMD patch is prepared and > > coming soon. > > Great. > > And I suppose that the more detailed cookbook/example - regarding offset > and drift of various clocks - is probably more relevant for the RX side (for > various PCAP applications), and thus completely unrelated to this patch. > > > > > With best regards, Slava