Hi,

I've tried three different ConnectX-3 NICs now and they all behave the same.
To rule out any issues with the GM I tried an Intel i210 as well, and that is spot on with excellent sync.

However, the Mellanox is not. It's almost as if a frequency correction happens inside the NIC every so often: the rms values suddenly become bad, improve as the frequency is adjusted to the GM, and then become really bad again.


ptp4l[447.094]: rms 1580 max 6078 freq +1598846 +/- 4029 delay   983 +/-   3
ptp4l[448.091]: rms 1608 max 6080 freq +1598430 +/- 4117 delay   971 +/-   2
ptp4l[449.089]: rms  111 max  244 freq +1598218 +/- 180 delay   977 +/-   2
ptp4l[450.086]: rms   25 max   39 freq +1598474 +/-  23 delay   983 +/-   2
ptp4l[451.084]: rms   30 max   57 freq +1598512 +/-  23 delay   984 +/-   1
ptp4l[452.081]: rms   17 max   28 freq +1598504 +/-  14 delay   984 +/-   1
ptp4l[453.078]: rms   13 max   24 freq +1598507 +/-  15 delay   985 +/-   2
ptp4l[454.076]: rms   14 max   43 freq +1598510 +/-  26 delay   984 +/-   1
ptp4l[455.073]: rms    7 max   16 freq +1598506 +/-  12 delay   985 +/-   1
ptp4l[456.071]: rms   11 max   18 freq +1598525 +/-  13 delay   983 +/-   1
ptp4l[457.068]: rms    7 max   15 freq +1598519 +/-  14 delay   983 +/-   1
ptp4l[458.066]: rms    6 max   14 freq +1598514 +/-  14 delay   984 +/-   1
ptp4l[459.063]: rms    6 max   13 freq +1598519 +/-  14 delay   985 +/-   0
ptp4l[460.061]: rms 1563 max 5997 freq +1598869 +/- 3991 delay   982 +/-   3
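
The pattern in the log above can be checked mechanically. The following is a minimal Python sketch, not part of linuxptp, that parses ptp4l summary lines and flags intervals whose rms is far above the median. The regular expression, the helper names `parse` and `spikes`, and the factor of 10 are assumptions chosen for illustration.

```python
import re
from statistics import median

# Matches ptp4l summary lines such as:
# ptp4l[449.089]: rms  111 max  244 freq +1598218 +/- 180 delay   977 +/-   2
SUMMARY = re.compile(
    r"ptp4l\[(?P<t>\d+\.\d+)\]: rms\s+(?P<rms>\d+) max\s+(?P<max>\d+) "
    r"freq\s+(?P<freq>[+-]\d+) \+/-\s*(?P<fdev>\d+)"
)


def parse(lines):
    """Return (timestamp_s, rms_ns, freq_ppb) for every summary line found."""
    out = []
    for line in lines:
        m = SUMMARY.search(line)
        if m:
            out.append((float(m["t"]), int(m["rms"]), int(m["freq"])))
    return out


def spikes(samples, factor=10):
    """Flag samples whose rms exceeds `factor` times the median rms."""
    base = median(rms for _, rms, _ in samples)
    return [(t, rms) for t, rms, _ in samples if rms > factor * base]


# Log excerpt copied from the message above.
log = """\
ptp4l[447.094]: rms 1580 max 6078 freq +1598846 +/- 4029 delay   983 +/-   3
ptp4l[448.091]: rms 1608 max 6080 freq +1598430 +/- 4117 delay   971 +/-   2
ptp4l[449.089]: rms  111 max  244 freq +1598218 +/- 180 delay   977 +/-   2
ptp4l[450.086]: rms   25 max   39 freq +1598474 +/-  23 delay   983 +/-   2
ptp4l[451.084]: rms   30 max   57 freq +1598512 +/-  23 delay   984 +/-   1
ptp4l[452.081]: rms   17 max   28 freq +1598504 +/-  14 delay   984 +/-   1
ptp4l[453.078]: rms   13 max   24 freq +1598507 +/-  15 delay   985 +/-   2
ptp4l[454.076]: rms   14 max   43 freq +1598510 +/-  26 delay   984 +/-   1
ptp4l[455.073]: rms    7 max   16 freq +1598506 +/-  12 delay   985 +/-   1
ptp4l[456.071]: rms   11 max   18 freq +1598525 +/-  13 delay   983 +/-   1
ptp4l[457.068]: rms    7 max   15 freq +1598519 +/-  14 delay   983 +/-   1
ptp4l[458.066]: rms    6 max   14 freq +1598514 +/-  14 delay   984 +/-   1
ptp4l[459.063]: rms    6 max   13 freq +1598519 +/-  14 delay   985 +/-   0
ptp4l[460.061]: rms 1563 max 5997 freq +1598869 +/- 3991 delay   982 +/-   3
""".splitlines()

samples = parse(log)
for t, rms in spikes(samples):
    print(f"spike at {t:.3f} s: rms {rms} ns")
```

On this excerpt the sketch flags the intervals at 447.094, 448.091 and 460.061, i.e. two bursts roughly 12 seconds apart.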


Thanks
Andre



On 20/11/23 10:27, Andre Puschmann wrote:
Hey,

 > How the GM side is configured? Are you writing system time to PHC
 > every second? If so, you can try make the phc free run. Without 1PPS
 > signal connecting to the phc or PTM enabled, it's not recommended to
 > set phc's time by software, the jitter is quite big.

I am not writing any time to the PHC. I just start ptp4l. Shouldn't that be enough to adjust the PHC to the GM's?

 > Is the GM and the client connected directly or through a switch? Try
 > connect them directly with an utp or fiber.

The GM is directly connected to port0 of the NIC. And the GM is GPS synced.


 > Try the L2 transport. IIRC at least some Mellanox NICs performed
 > worse with UDP transport for some reason.

This is already with L2.


Meanwhile I tried yet another ConnectX-3 card, this time an IBM-branded one with FW 2.42.5032, but the results are similar. One new thing I've observed is this:

ptp4l[226.813]: rms   69 max  164 freq +1593151 +/- 131 delay   972 +/-   3
ptp4l[227.811]: rms   31 max   40 freq +1593342 +/-  23 delay   977 +/-   1
ptp4l[228.810]: rms 1607 max 6189 freq +1593762 +/- 4095 delay   979 +/-   2
ptp4l[229.816]: rms 23001617550970840 max 65058399023124016 freq -11106166 +/- 33598711 delay   970 +/-   4
ptp4l[230.695]: clockcheck: clock jumped backward or running slower than expected!
ptp4l[230.695]: port 1 (enp1s0): SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[230.821]: rms 65058399528662992 max 65058399969385488 freq -100000000 +/-   0 delay 9404287 +/- 6628445
ptp4l[231.825]: rms 65058400466045568 max 65058400899446072 freq -100000000 +/-   0 delay 15392460 +/- 2718826
ptp4l[232.828]: rms 65058401394850600 max 65058401833399432 freq -100000000 +/-   0 delay 12181777 +/- 2190097
ptp4l[233.831]: rms 65058402330812608 max 65058402771727544 freq -100000000 +/-   0 delay 16575665 +/- 1701941


The rms values were like before, but then suddenly increased and now don't go back down.
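
To put the numbers in the log above in perspective, here is a small Python sketch that converts the reported offset from nanoseconds into a duration. The helper name `ns_to_timedelta` is made up for this example; the two values are copied from the log.

```python
from datetime import timedelta


def ns_to_timedelta(ns: int) -> timedelta:
    """Convert a nanosecond offset into a timedelta (microsecond resolution)."""
    return timedelta(microseconds=ns // 1000)


max_offset_ns = 65058399023124016  # "max" from the ptp4l[229.816] line
normal_max_ns = 6189               # "max" from the ptp4l[228.810] line

print("offset after the jump:", ns_to_timedelta(max_offset_ns))
print("offset before the jump:", ns_to_timedelta(normal_max_ns))
```

The offset after the jump works out to roughly 753 days, so this looks like a step in the PHC time rather than servo jitter, which would match the clockcheck message. The constant freq -100000000 afterwards may be the servo sitting at an adjustment limit, but that reading is an assumption and not confirmed here.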

Thanks
Andre


On 19/11/23 22:07, Andre Puschmann wrote:
Hey,

I've been able to get my hands on a ConnectX-3 Pro card and have done some initial testing. The card indeed has a shared PHC for both ports so running ptp4l as BC or TC does indeed work without the jbod option.
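
One way to confirm that two ports share a PHC is to compare the PHC index the kernel reports for each interface, which is the same value `ethtool -T` shows as "PTP Hardware Clock". Below is a minimal Python sketch for Linux, assuming a 64-bit system and the `ETHTOOL_GET_TS_INFO` ioctl; the function names are invented for this example and nothing here is part of linuxptp.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # from <linux/sockios.h>
ETHTOOL_GET_TS_INFO = 0x41  # from <linux/ethtool.h>
# struct ethtool_ts_info: cmd, so_timestamping, phc_index (signed),
# tx_types, tx_reserved[3], rx_filters, rx_reserved[3]
TS_INFO_FMT = "=IIiI3II3I"


def decode_phc_index(raw: bytes) -> int:
    """Extract phc_index from a raw ethtool_ts_info buffer (-1 means no PHC)."""
    return struct.unpack(TS_INFO_FMT, raw)[2]


def phc_index(ifname: str) -> int:
    """Ask the kernel which PHC index backs the given interface."""
    buf = array.array(
        "B", struct.pack(TS_INFO_FMT, ETHTOOL_GET_TS_INFO, *([0] * 10))
    )
    addr, _ = buf.buffer_info()
    # struct ifreq: 16-byte interface name followed by a pointer to the request
    ifreq = struct.pack("16sP", ifname.encode(), addr)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        fcntl.ioctl(s.fileno(), SIOCETHTOOL, ifreq)
    return decode_phc_index(buf.tobytes())


def share_phc(if_a: str, if_b: str) -> bool:
    """True if both interfaces report the same valid PHC index."""
    a = phc_index(if_a)
    return a >= 0 and a == phc_index(if_b)
```

For example, `share_phc("enp1s0", "enp1s0d1")` would be expected to return True on a card with a single PHC; the second interface name is hypothetical.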

However, sync performance (i.e. rms values) for the downstream OCs isn't great. In fact, even the Mellanox as an OC isn't giving great results: rms values jump a lot (and I've tried various PI value combinations).

Is anyone else seeing this with Mlx cards as well? Could it be my model or the firmware?

Here is the output of an OC config with the card:

$ sudo /opt/linuxptp/ptp4l -i enp1s0 -f ~/configs/ptp/oc.cfg -m -l6
ptp4l[12737.960]: selected /dev/ptp0 as PTP clock
ptp4l[12738.012]: port 1 (enp1s0): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[12738.012]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[12738.012]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[12738.060]: port 1 (enp1s0): new foreign master fcaf6a.fffe.02b447-1
ptp4l[12738.314]: selected best master clock fcaf6a.fffe.02b447
ptp4l[12738.314]: port 1 (enp1s0): LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[12740.148]: port 1 (enp1s0): minimum delay request interval 2^-4
ptp4l[12740.512]: port 1 (enp1s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[12741.138]: rms 1450 max 1934 freq +270168 +/- 1641 delay   951 +/-  14
ptp4l[12742.139]: rms  129 max  179 freq +268843 +/- 296 delay   963 +/-  11
ptp4l[12743.140]: rms  241 max  490 freq +268455 +/- 452 delay   948 +/-   1
ptp4l[12744.141]: rms  135 max  180 freq +268381 +/-  25 delay   947 +/-   1
ptp4l[12745.142]: rms 1357 max 5277 freq +269064 +/- 3459 delay   950 +/-   1
ptp4l[12746.143]: rms 1397 max 5092 freq +268197 +/- 3539 delay   935 +/-   7
ptp4l[12747.144]: rms  210 max  417 freq +268048 +/- 243 delay   942 +/-   3
ptp4l[12748.145]: rms   15 max   32 freq +268415 +/-  29 delay   947 +/-   2
ptp4l[12749.146]: rms 1430 max 5594 freq +269126 +/- 3617 delay   950 +/-   1
ptp4l[12750.147]: rms 1391 max 5162 freq +268252 +/- 3543 delay   942 +/-   4


Thanks
Andre





On 2/11/23 17:37, Jacob Keller wrote:


On 11/2/2023 4:15 AM, Andre Puschmann wrote:
Hi,

On 2/11/23 4:11, James Clark wrote:
I have a dual-port Mellanox ConnectX-3 (specifically MCX312A-XCBT),
which has a shared PHC. You can get them for less than $50 on
eBay/AliExpress. I had to upgrade the firmware on mine to get PTP
support. I haven't yet tried it as a boundary clock.

Excellent. This is very helpful, James. I've ordered an MCX312A and B and
will compare both here. I'll share my results here soon. If you have a
chance please also share the firmware version you're currently using on
your NIC.

With my Intel NIC I could get the BC config working but I needed to set
the twoStepFlag to 1. Otherwise I was getting this for both ports:

ptp4l[1040.180]: ioctl SIOCSHWTSTAMP failed: Numerical result out of range


Yep, that would indicate the device doesn't support one-step mode.
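
The link between twoStepFlag and this error can be sketched as follows. This is an illustrative Python model, not linuxptp code: the `tx_type` constants are the values from `<linux/net_tstamp.h>`, while the helper names `requested_tx_type` and `explain` are assumptions made for this example. It assumes the daemon requests one-step sync timestamping when twoStepFlag is 0 and that the driver answers an unsupported mode with ERANGE.

```python
import errno

# tx_type values from <linux/net_tstamp.h>
HWTSTAMP_TX_OFF = 0
HWTSTAMP_TX_ON = 1
HWTSTAMP_TX_ONESTEP_SYNC = 2


def requested_tx_type(two_step_flag: int) -> int:
    """tx_type a PTP daemon would request for a given twoStepFlag setting."""
    return HWTSTAMP_TX_ON if two_step_flag else HWTSTAMP_TX_ONESTEP_SYNC


def explain(err: int, two_step_flag: int) -> str:
    """Rough interpretation of a failed SIOCSHWTSTAMP call."""
    one_step = requested_tx_type(two_step_flag) == HWTSTAMP_TX_ONESTEP_SYNC
    if err == errno.ERANGE and one_step:
        return "driver rejected one-step TX timestamping; try twoStepFlag 1"
    return "unrelated to one-step mode"


print(explain(errno.ERANGE, two_step_flag=0))
```

ERANGE can also come from an unsupported receive filter, so this mapping is only a first guess when reading such a log line.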

Sync quality wasn't great as expected though. I'll repeat with the
Mellanox once I have them here.

Thanks
Andre


For Intel NICs, the only products I am aware of which share PHC across
the device are the E800 series devices. Prior devices (E500, and E700,
as well as the gigabit products) do share the same internal oscillator
but due to the register interface each function has to set up its own clock.

Thanks,
Jake


_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users



--
Andre Puschmann

Software Radio Systems (SRS)
https://www.srs.io
an...@srs.io

PGP/GnuPG key: 0x204A85DFEA324D58
fingerprint: 3924 1C60 D52E 81A2 1F2E 0C9D 204A 85DF EA32 4D58


