On 22/05/2020 at 20:43, PATRICK KEROULAS wrote:
The mlx5 part of libibverbs includes a ts-to-ns converter which takes the
instantaneous clock info. It's unused in DPDK so far. I've tested it in the
device/port init routine and the result looks reliable. Since this approach
looks very simple compared to the time sync mechanism, I'm trying to
integrate it.

The conversion should occur in the primary process (testpmd) I suppose.
1) The needed clock info derives from the Ethernet device. Is it possible to
    access that struct from an rx callback?
2) How to attach the nanosecond timestamp to the mbuf so that `pdump` catches it?
    (workaround: copy `mbuf->udata64` in forwarded packets.)
3) Any other ideas?
The timestamp is carried in mbuf.
Then the conversion must be done by the ethdev caller (application or
any other upper layer).
What if the converter function needs a clock_info?
https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201
I'm afraid this info may change by the time the converter is called
by the upper layer.
Indeed, the clock in the device is not an atomic one :)
We need to adjust the time conversion continuously.
I am not an expert of time synchronization, so I add more people Cc
who could help for having a precise timestamp.
Thanks Thomas.
Not sure this is a synchronization issue. We have dedicated processes
(linuxptp) to keep both NIC and sys clocks in sync with an external clock.
It is "just" a matter of unit conversion.

If it has to be performed in dpdk-pdump, I would need some help to
retrieve mlx5_clock_info from inside a secondary process. Looking at
mlx5_read_clock(), this info is extracted from ibv_context which looks
reachable in a primary process only (segfault, if I try in pdump).


I don't know about the integrated ts-to-ns, but we implemented a translation mechanism that mimics what NTP does in Linux to translate a given clock (TSC at first) to a wall time. You'll find more info at https://orbi.uliege.be/bitstream/2268/226257/1/thesis.pdf chapter 3.4.1.  This is an often forgotten matter, as we saw in real switches that the time spent in time-related VDSO is enormous.

We wanted to do a very precise capture too, so we made that clock able to synchronize itself with the ConnectX-5 internal clock as a base instead of TSC. FYI the clock in the CX5 is running at 800 MHz (one tick is 1.25 ns), so pure nanosecond resolution is impossible, but it is close enough. It is for that purpose that I proposed the rte_eth_read_clock() patch in DPDK. We need to be able to read the current clock (like the rdtsc() instruction for TSC) to compute the frequency.
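As a rough illustration of the frequency computation described above, here is a minimal sketch in plain C. It is not the FastClick or mlx5 code: the function names `estimate_freq_hz` and `ticks_to_ns` are hypothetical, and the tick values are assumed to come from something like rte_eth_read_clock() paired with a wall-clock reading taken at the same moment.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Estimate the device clock frequency (Hz) from two samples, each pairing a
 * raw tick count with a wall-clock time in nanoseconds.
 * Assumes fewer than about 1.8e10 ticks between the samples (roughly 23 s
 * at 800 MHz) so that the product fits in 64 bits.
 */
static uint64_t
estimate_freq_hz(uint64_t ticks0, uint64_t ns0, uint64_t ticks1, uint64_t ns1)
{
	assert(ns1 > ns0);
	return (ticks1 - ticks0) * NS_PER_SEC / (ns1 - ns0);
}

/*
 * Convert a tick count to nanoseconds for a given frequency.
 * Whole seconds and the sub-second remainder are handled separately to
 * avoid overflowing 64 bits; assumes freq_hz is below about 1.8e10.
 * The result is truncated, so at 800 MHz a single tick (1.25 ns) yields 1.
 */
static uint64_t
ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	uint64_t secs = ticks / freq_hz;
	uint64_t rem = ticks % freq_hz;
	return secs * NS_PER_SEC + rem * NS_PER_SEC / freq_hz;
}
```

With a measured frequency of 800 MHz, `ticks_to_ns(4, 800000000ULL)` gives 5 ns. A longer sampling interval reduces the relative error of the frequency estimate, within the overflow limit noted in the comment.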

The "converter" code is here: https://github.com/tbarbette/fastclick/blob/master/elements/userlevel/tscclock.cc
The source is configurable (TSC, rte_eth_read_clock, GPS Meinberg clock, ...). The DPDK one is here: https://github.com/tbarbette/fastclick/blob/2ab021283b82d0b980551480c505ec8dced98e0a/elements/userlevel/dpdkdevclock.cc#L27

One important thing is that the conversion factor must be changed from time to time to correct for drift. That is the reason why we can't just push a bunch of code to DPDK (and it's probably not as simple as using the ts-to-ns in mlx5): you must have a timer, and use RCU to "atomically" update a struct larger than 64 bits. Most of that is available in DPDK now, but there will always be some setup (RCU barrier, timer init, ...).
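To make the "atomic update of a struct larger than 64 bits" concrete, here is a minimal sketch using a seqlock built on C11 atomics. Note that this swaps in a different technique from the RCU mentioned above, chosen only to keep the example self-contained. The struct layout, the names `clock_conv`, `conv_update` and `conv_ticks_to_ns`, and the fixed-point shift are all assumptions for illustration, and a single writer (the periodic timer) is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

#define CONV_SHIFT 16	/* fixed-point shift for the multiplier (assumed) */

/* Conversion parameters, too large to update with one 64-bit store. */
struct clock_conv {
	_Atomic uint32_t seq;		/* odd while an update is in progress */
	_Atomic uint64_t base_ticks;	/* device ticks at the reference point */
	_Atomic uint64_t base_ns;	/* wall time (ns) at the reference point */
	_Atomic uint64_t mult;		/* ns per tick, scaled by 2^CONV_SHIFT */
};

/* Writer side: called from the periodic timer, single writer assumed. */
static void
conv_update(struct clock_conv *c, uint64_t base_ticks, uint64_t base_ns,
	    uint64_t mult)
{
	uint32_t s = atomic_load_explicit(&c->seq, memory_order_relaxed);

	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&c->base_ticks, base_ticks, memory_order_relaxed);
	atomic_store_explicit(&c->base_ns, base_ns, memory_order_relaxed);
	atomic_store_explicit(&c->mult, mult, memory_order_relaxed);
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: retry until a consistent snapshot has been read. */
static uint64_t
conv_ticks_to_ns(struct clock_conv *c, uint64_t ticks)
{
	uint32_t s0, s1;
	uint64_t bt, bn, m;

	do {
		s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
		bt = atomic_load_explicit(&c->base_ticks, memory_order_relaxed);
		bn = atomic_load_explicit(&c->base_ns, memory_order_relaxed);
		m = atomic_load_explicit(&c->mult, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);

	return bn + (((ticks - bt) * m) >> CONV_SHIFT);
}
```

For an 800 MHz clock the multiplier would be 1.25 * 2^16 = 81920. Readers never block the writer; they only retry if an update lands in the middle of their snapshot, which should be rare when the factor is refreshed from a slow timer.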

In the end it's not hard science... It worked like a charm to do a campus trace capture on 100G hardware.
