On Thu, 24 Jul 2025, Yoon Junghan wrote:

Hello,
As advised, I checked the TCP timestamp and found that the RX HW timestamp 
appears to be replaced by the TCP timestamp in LROed packets. This is a very 
weird bug...
 
But is this observed exclusively on the paths where the sender side enables TCP
timestamps? Generally speaking, since TCP packets may arrive out of order, one
may want to prioritise the TCP timestamp over the HW timestamp for an LROed
frame. So maybe this is not quite a bug, but a design decision? What does the
NIC documentation say?

Thank you.

[port 0] RX HW timestamp: 0x00001208f2a8d6c7, TCP timestamp: 0x669efc3b6a269469 (tsval: 1721695291, tsecr: 1780913257) (not LROed)
[port 0] RX HW timestamp: 0x669efc3b6a26946a, TCP timestamp: 0x669efc3b6a26946a (tsval: 1721695291, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x669efc3b6a26946a, TCP timestamp: 0x669efc3b6a26946a (tsval: 1721695291, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x669efc3b6a26946a, TCP timestamp: 0x669efc3b6a26946a (tsval: 1721695291, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x00001208f2b05907, TCP timestamp: 0x669efc3b6a26946a (tsval: 1721695291, tsecr: 1780913258) (not LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946a, TCP timestamp: 0x669efc3c6a26946a (tsval: 1721695292, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946a, TCP timestamp: 0x669efc3c6a26946a (tsval: 1721695292, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946a, TCP timestamp: 0x669efc3c6a26946a (tsval: 1721695292, tsecr: 1780913258) (LROed)
[port 0] RX HW timestamp: 0x00001208f2b8f191, TCP timestamp: 0x669efc3c6a26946a (tsval: 1721695292, tsecr: 1780913258) (not LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946b, TCP timestamp: 0x669efc3c6a26946b (tsval: 1721695292, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946b, TCP timestamp: 0x669efc3c6a26946b (tsval: 1721695292, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x669efc3c6a26946b, TCP timestamp: 0x669efc3c6a26946b (tsval: 1721695292, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x00001208f2c0689b, TCP timestamp: 0x669efc3c6a26946b (tsval: 1721695292, tsecr: 1780913259) (not LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946b, TCP timestamp: 0x669efc3d6a26946b (tsval: 1721695293, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946b, TCP timestamp: 0x669efc3d6a26946b (tsval: 1721695293, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946b, TCP timestamp: 0x669efc3d6a26946b (tsval: 1721695293, tsecr: 1780913259) (LROed)
[port 0] RX HW timestamp: 0x00001208f2c83627, TCP timestamp: 0x669efc3d6a26946b (tsval: 1721695293, tsecr: 1780913259) (not LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946c, TCP timestamp: 0x669efc3d6a26946c (tsval: 1721695293, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946c, TCP timestamp: 0x669efc3d6a26946c (tsval: 1721695293, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x669efc3d6a26946c, TCP timestamp: 0x669efc3d6a26946c (tsval: 1721695293, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x00001208f2cfa33b, TCP timestamp: 0x669efc3d6a26946c (tsval: 1721695293, tsecr: 1780913260) (not LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946c, TCP timestamp: 0x669efc3e6a26946c (tsval: 1721695294, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946c, TCP timestamp: 0x669efc3e6a26946c (tsval: 1721695294, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946c, TCP timestamp: 0x669efc3e6a26946c (tsval: 1721695294, tsecr: 1780913260) (LROed)
[port 0] RX HW timestamp: 0x00001208f2d8605d, TCP timestamp: 0x669efc3e6a26946c (tsval: 1721695294, tsecr: 1780913260) (not LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946d, TCP timestamp: 0x669efc3e6a26946d (tsval: 1721695294, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946d, TCP timestamp: 0x669efc3e6a26946d (tsval: 1721695294, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x669efc3e6a26946d, TCP timestamp: 0x669efc3e6a26946d (tsval: 1721695294, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x00001208f2dfd977, TCP timestamp: 0x669efc3f6a26946d (tsval: 1721695295, tsecr: 1780913261) (not LROed)
[port 0] RX HW timestamp: 0x669efc3f6a26946d, TCP timestamp: 0x669efc3f6a26946d (tsval: 1721695295, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x669efc3f6a26946d, TCP timestamp: 0x669efc3f6a26946d (tsval: 1721695295, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x669efc3f6a26946d, TCP timestamp: 0x669efc3f6a26946d (tsval: 1721695295, tsecr: 1780913261) (LROed)
[port 0] RX HW timestamp: 0x00001208f2e78fab, TCP timestamp: 0x669efc3f6a26946d (tsval: 1721695295, tsecr: 1780913261) (not LROed)
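
For reference, the 64-bit "TCP timestamp" printed in the log above appears to be TSval in the upper 32 bits and TSecr in the lower 32 bits (0x669efc3b = 1721695291, 0x6a26946a = 1780913258). Below is a minimal sketch of that comparison in plain C, assuming the caller already has a pointer to the TCP options area of the packet. The helper names are hypothetical and not part of DPDK, and this is not necessarily how the log above was produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TCP option kinds (RFC 9293, RFC 7323); macro names are local to this sketch. */
#define OPT_KIND_EOL        0
#define OPT_KIND_NOP        1
#define OPT_KIND_TIMESTAMPS 8
#define OPT_LEN_TIMESTAMPS  10

/* Read a big-endian (network byte order) 32-bit value. */
static uint32_t
load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the TCP options area. Returns 1 and fills tsval/tsecr (host byte
 * order) if a Timestamps option is found, 0 otherwise.
 */
static int
find_tcp_timestamps(const uint8_t *opts, size_t len,
                    uint32_t *tsval, uint32_t *tsecr)
{
    size_t i = 0;

    assert(opts != NULL || len == 0);
    while (i < len) {
        uint8_t kind = opts[i];
        uint8_t olen;

        if (kind == OPT_KIND_EOL)
            break;
        if (kind == OPT_KIND_NOP) {   /* single-byte padding option */
            i++;
            continue;
        }
        if (i + 1 >= len)
            break;
        olen = opts[i + 1];
        if (olen < 2 || i + olen > len)   /* malformed option length */
            break;
        if (kind == OPT_KIND_TIMESTAMPS && olen == OPT_LEN_TIMESTAMPS) {
            *tsval = load_be32(&opts[i + 2]);
            *tsecr = load_be32(&opts[i + 6]);
            return 1;
        }
        i += olen;
    }
    return 0;
}

/* Pack TSval/TSecr the way the log seems to print them (an assumption). */
static uint64_t
pack_tcp_timestamps(uint32_t tsval, uint32_t tsecr)
{
    return ((uint64_t)tsval << 32) | tsecr;
}

/* Returns 1 if the HW timestamp equals the packed TCP Timestamps option. */
static int
hw_timestamp_is_tcp_timestamps(uint64_t hw_ts, const uint8_t *opts, size_t len)
{
    uint32_t tsval, tsecr;

    if (!find_tcp_timestamps(opts, len, &tsval, &tsecr))
        return 0;
    return hw_ts == pack_tcp_timestamps(tsval, tsecr);
}
```

Calling `hw_timestamp_is_tcp_timestamps()` on every LROed mbuf would flag exactly the entries above where the two printed values coincide.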

Sincerely,
Junghan Yoon
On Jul 24, 2025, 5:40 PM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
      On Thu, 24 Jul 2025, Yoon Junghan wrote:

            I found the key difference: when TCP timestamps (RFC 7323) are enabled on the TCP sender,
            the RX HW timestamp of LROed packets on the DPDK PMD on the middlebox machine becomes
            inconsistent or invalid. Is there a known limitation or erratum in the mlx5 driver or
            CX7 firmware regarding this?


      Very interesting observation. So did you have TCP timestamps enabled on those
      sender NICs that match the receiver NICs that would fail to do HW timestamping,
      and did you have them disabled on links where the HW timestamp was OK on the receiver?

      But in the case of LRO on such NICs that have a wrong HW timestamp, is the TCP
      timestamp option present in an LROed packet, and does it have an accurate value?

      Regarding errata, I have to confess I'm not an expert in this particular
      driver. Maybe they have some official documentation regarding this somewhere.

      Thank you.


            Sincerely,
            Junghan Yoon
            On Jul 24, 2025, 12:26 AM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
            On Thu, 24 Jul 2025, Yoon Junghan wrote:

            I'm not sure I did this correctly. All interfaces show the same result.
             
            current settings:
            tx_type 0
            rx_filter 0


            But that should mean... no timestamping?

            1) Maybe also check 'sudo ethtool -T <ifname>'.
            2) Maybe try to enable it with 'sudo hwstamp_ctl -i <ifname> -r 1'.

            Thank you.


            Sincerely,
            Junghan Yoon
            On Jul 23, 2025, 11:19 PM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
            On Wed, 23 Jul 2025, Yoon Junghan wrote:

            I isolated port 1 using the EAL -a option and got a similar result.
             
            Note that port 1 becomes port 0 this time.
            [port 0] RX HW timestamp: 0x3eac4214bc574368 (LROed)
            [port 0] RX HW timestamp: 0x3eac4214bc574368 (LROed)
            [port 0] RX HW timestamp: 0x3eac4214bc574368 (LROed)
            [port 0] RX HW timestamp: 0x3eac4214bc574368 (LROed)
            [port 0] RX HW timestamp: 0x00042819272fad (not LROed)
            [port 0] RX HW timestamp: 0x000428192e6e77 (not LROed)
            [port 0] RX HW timestamp: 0x000428192e7f01 (not LROed)
            [port 0] RX HW timestamp: 0x000428192e833d (not LROed)
             
            FYI, I have 4 CX-7 NICs on the same machine (eth0 = port 0, ..., eth3 = port 3 in DPDK):
            pci@0000:16:00.0  eth0             network        MT2910 Family [ConnectX-7]
            pci@0000:40:00.0  eth1             network        MT2910 Family [ConnectX-7]
            pci@0000:6a:00.0  eth2             network        MT2910 Family [ConnectX-7]
            pci@0000:94:00.0  eth3             network        MT2910 Family [ConnectX-7]
             
            Among them, only the first CX-7 shows consistent timestamps regardless of LRO.


            Does 'sudo hwstamp_ctl -i <ifname>' show consistent results across all the NICs?

            Thank you.


            Sincerely,
            Junghan Yoon
            On Jul 23, 2025, 10:28 PM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
            On Wed, 23 Jul 2025, Yoon Junghan wrote:

            Thank you for the quick response.
             
            1) They are different NICs, not on the same board: separate adapters in different PCIe slots.
            2) My DPDK app uses 4 separate ports: port 0, port 1, port 2, and port 3. They are all on
            different boards. Thus, they are running at the same time.


            Excellent. I apologise for one more dumb question, but does isolating the very
            specific NIC that is known to give strange timestamps (so that DPDK does not grab
            the other ones) result in the same unexpected behaviour? Just to make sure.

            Thank you.


            Sincerely,
            Junghan Yoon
            On Jul 23, 2025, 10:09 PM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
             
            Hello,

            On Wed, 23 Jul 2025, Yoon Junghan wrote:

             
            Hello,
            As advised, I tested hardware timestamps with LRO enabled on our ConnectX-7 NICs. However,
            the timestamps of LROed packets still show inconsistent and abnormally large gaps from
            normal packets.
             
            Interestingly, I found this issue does not appear on all CX-7 NICs. Even with identical
            DPDK code, firmware version (28.43.2566), and hardware models from the same manufacturer,
            only specific NICs exhibit this inconsistency.
            I have confirmed that:
            * All NICs use the same driver and firmware version.
            * All NICs are of the same model (MCX75310AAS-NEA_Ax).
             


            1) Do the two "NICs" ('port 0' and 'port 1' from the printout below) in fact represent
            two different ports/PFs of the same physical 'board'/'adapter card'?

            2) If (1) is true, were the results obtained by running the application on both
            ports simultaneously (both managed by DPDK at the same time)?

            (Just to clarify: I'm confused by the fact that the NIC driver itself seems
            to invoke 'rte_mbuf_dyn_rx_timestamp_register' for each new RxQ rather than call
            it once and then look up and reuse the existing offsets for more ports/queues.)

            Thank you.

             
            * The issue occurs only when LRO is enabled together with RX hardware timestamping.
            * Disabling LRO eliminates the issue.
            I would appreciate any insight into how this behavior can occur on only some ports
            despite the same software and hardware setup.
             
            Below is my code snippet.
             
            ```c
            #include <stdint.h>
            #include <stdio.h>

            #include <rte_bitops.h>
            #include <rte_mbuf.h>
            #include <rte_mbuf_dyn.h>

            /*----------------------------------------------------------------------------*/
            /* Return non-zero if the RX timestamp dynamic flag is set on this mbuf. */
            static inline int
            is_timestamp_enabled(struct rte_mbuf *mbuf)
            {
               static uint64_t timestamp_rx_dynflag = 0;
               int timestamp_rx_dynflag_offset;
             
               if (!timestamp_rx_dynflag)
               {
                   /* Look up (do not register) the flag; cache the mask on success. */
                   timestamp_rx_dynflag_offset =
                       rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
                   if (timestamp_rx_dynflag_offset < 0)
                   {
                       return 0;
                   }
                   timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
               }
             
               /* Compare explicitly: a flag bit above bit 31 would be lost in an int. */
               return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
            }
            /*----------------------------------------------------------------------------*/
            /* Return a pointer to the timestamp dynamic field, or NULL if it is absent. */
            static inline rte_mbuf_timestamp_t *
            get_timestamp(struct rte_mbuf *mbuf)
            {
               static int timestamp_dynfield_offset = -1;
             
               if (timestamp_dynfield_offset < 0)
               {
                   /* Look up (do not register) the field; cache the offset on success. */
                   timestamp_dynfield_offset =
                       rte_mbuf_dynfield_lookup(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
                   if (timestamp_dynfield_offset < 0)
                   {
                       return NULL;
                   }
               }
             
               return RTE_MBUF_DYNFIELD(mbuf,
                                         timestamp_dynfield_offset,
                                         rte_mbuf_timestamp_t *);
            }
            /*----------------------------------------------------------------------------*/
            /* Return the RX HW timestamp of the packet, or NULL if none was set. */
            static inline rte_mbuf_timestamp_t *
            get_rx_hw_timestamp(struct rte_mbuf *pkt)
            {
               if (!is_timestamp_enabled(pkt))
               {
                   printf("rx_hw_timestamp not enabled in mbuf!\n");
                   return NULL;
               }
             
               return get_timestamp(pkt);
            }
            ```
             
            My DPDK application prints logs as below.
             
            ```c
               /* parse HW timestamp; NULL means none is available (PRIx64: <inttypes.h>) */
               rte_mbuf_timestamp_t *rx_timestamp = get_rx_hw_timestamp(pkt);
               if (rx_timestamp != NULL)
                   printf("[port %d] RX HW timestamp: %#016" PRIx64 " %s\n",
                          pctx->port_id,
                          *rx_timestamp,
                          pkt->ol_flags & PKT_RX_LRO ? "(LROed)" : "(not LROed)");
            ```
             
            Below are observations from two CX-7 ports under identical conditions.
             
            Normal NIC (port 0):
            [port 0] RX HW timestamp: 0x00007dcd2d185b (LROed)
            [port 0] RX HW timestamp: 0x00007dcd2d1911 (LROed)
            [port 0] RX HW timestamp: 0x00007dcd2d19c9 (LROed)
            [port 0] RX HW timestamp: 0x00007dcd2d37ca (LROed)
            [port 0] RX HW timestamp: 0x00007dcd2d4cb3 (not LROed)
            [port 0] RX HW timestamp: 0x00007dcd2d4cb3 (not LROed)
            [port 0] RX HW timestamp: 0x00007dcd30e019 (not LROed)
            [port 0] RX HW timestamp: 0x00007dcd3280bb (not LROed)
             
            Erroneous NIC (port 1):
            [port 1] RX HW timestamp: 0x3e6eef91bc19f0fd (LROed)
            [port 1] RX HW timestamp: 0x3e6eef91bc19f0fd (LROed)
            [port 1] RX HW timestamp: 0x3e6eef91bc19f0fd (LROed)
            [port 1] RX HW timestamp: 0x3e6eef91bc19f0fd (LROed)
            [port 1] RX HW timestamp: 0x000080691b7557 (not LROed)
            [port 1] RX HW timestamp: 0x000080691e2311 (not LROed)
            [port 1] RX HW timestamp: 0x00008069357553 (not LROed)
            [port 1] RX HW timestamp: 0x0000806936e8c1 (not LROed)

            As shown above, non-LRO packets consistently have normal hardware timestamps on both NICs.
            However, on port 1, all LROed packets return a fixed, invalid timestamp
            (0x3e6eef91bc19f0fd), which is clearly inconsistent.
            I have also confirmed that the other dynfields (other than dynfield[1] and dynfield[2])
            are unused.
             

            Sincerely,
            Junghan Yoon
            On Jul 22, 2025, 5:31 PM +0900, Ivan Malov <ivan.ma...@arknetworks.am>, wrote:
            Hello,

            On Tue, 22 Jul 2025, Yoon Junghan wrote:

            Hello,
             
            I'm currently using DPDK 20.11 with a ConnectX-7 NIC, and I'm trying to retrieve RX hardware
            timestamps using `rte_mbuf_dyn_rx_timestamp_register()`.


            Does the application invoke 'rte_mbuf_dyn_rx_timestamp_register' on its own? If
            yes, consider replacing this with invocations of APIs [1] (with field name [2])
            and [3] (with flag name [4]). For an example, please refer to [5] and [6].

            This is because, as per [7], the driver in question might 'register' the field
            and the flag on its own, in response to a 'DEV_RX_OFFLOAD_TIMESTAMP' request, so
            the user application should look up the field/flag, not 'register' it afresh.

            If this does not help, then consider clarifying whether the timestamps are
            accurate (and whether the flag is seen in the mbufs) when LRO is not enabled.

            [1] https://doc.dpdk.org/api-20.11/rte__mbuf__dyn_8h.html#a6adf9b352a83e7d521fd6aa04e305b1c
            [2] https://doc.dpdk.org/api-20.11/rte__mbuf__dyn_8h.html#a5159b2d34fa801d171ed0ccce451121b
            [3] https://doc.dpdk.org/api-20.11/rte__mbuf__dyn_8h.html#a89d835027034f76a27eb2afe7987ae35
            [4] https://doc.dpdk.org/api-20.11/rte__mbuf__dyn_8h.html#a831d7066c7193788351797a65186848a
            [5] https://github.com/DPDK/dpdk/blob/d69724b1dcc69784bcef00b96597469b7f6e6207/app/test-pmd/util.c#L44
            [6] https://github.com/DPDK/dpdk/blob/d69724b1dcc69784bcef00b96597469b7f6e6207/app/test-pmd/util.c#L60
            [7] https://github.com/DPDK/dpdk/blob/d69724b1dcc69784bcef00b96597469b7f6e6207/drivers/net/mlx5/mlx5_rxq.c#L1743

            Thank you.

             
            When LRO is enabled, I notice that LROed mbufs seem to share identical timestamp values,
            and the timestamps are unexpectedly large or inconsistent. This raises the question of
            whether LRO is interfering with the correctness of the RX HW timestamps.
             
            I’d appreciate any clarification on whether HW RX timestamping is reliable when LRO is
            enabled on this platform, or if LRO should just be disabled for accurate per-packet
            timestamping.
             

            Sincerely,
            Junghan Yoon


             





