On 13 Nov 2017 at 8:47, Richard Cochran wrote:

> On Mon, Nov 13, 2017 at 01:54:32PM +0100, Frantisek Rysanek wrote:
> > [...]
> > I first tried your software on the stock kernel in Debian 8
> > (3.16.something if memory serves) - and the "timed out while ..."
> > errors prompted me to upgrade to the latest vanilla, which happens to
> > be 4.13.12 at the time of this writing.
> > The errors are now less frequent, but they do occur.
>
> Okay, so maybe the remaining errors are due to latency within your
> system. The default tx_timestamp_timeout is 1 millisecond. Try 10
> and see if that fixes the problem.
>
> > Notice how the errors become more frequent after 9 a.m.
> > - I came to the machine and started a PCAP sniffer
> > on that same port.
>
> That fact supports the idea that the cause is latency.
>

Apologies for not responding for almost a month... Other work interfering :-)
I'm back in the lab with a slightly different GrandMaster to play with and some more time on my hands. Still the same Intel PC.

Even if I increase the tx_timestamp_timeout to 20 ms, the TX timeouts still happen. Interestingly, in the lab, against a directly attached GM, the timeouts are relatively rare.

Two weeks ago I was on a field trip with my GrandMasters, and on site my i219LM talked to the Meinberg GM through a RuggedCom switch. It all looked fairly good: the protocol clockwork seemed to work, the switch was evidently doing its job (the correction field was non-zero) etc. But once I started ptp4l in the HW-accelerated mode, the TX timeouts happened so often that the client was pretty much useless even as a "test traffic generator". It kept falling over all the time! I ended up resorting to the software mode, at the price of some four orders of magnitude in precision :-( Well, the purpose really was to generate some test traffic, so thank God for the software-only mode :-)

It seems that I haven't shared some details of my config yet: the GrandMasters that I'm playing with are destined for IEC61850 deployment (power substation), so they're configured for L2 multicast mode, and the switch needs to be a TC with peer delay measurement in P2P mode. So the PDelay messages are exchanged between immediate neighbors on physical Ethernet links (they don't pass through the switch). The GrandMaster's Announce, Sync and Follow-up do pass through the switch. I believe only the Follow-ups have the correction field filled in, and that is non-zero.

The "correction" field inserted by the RuggedCom switch contains values between 10 and 20 million raw units, which is some 150 to 300 ns. Sounds about appropriate.

Makes me wonder if the contents of the PTP traffic can make the Intel hardware puke :-/ The actual jitter, or the non-zero correction field... it's strange.
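For reference, here is a minimal sketch of the arithmetic behind that 150 to 300 ns estimate. It assumes the IEEE 1588 definition of the correction field as nanoseconds multiplied by 2^16; the helper name is made up for illustration and is not part of linuxptp.

```python
# Assumption: correctionField is expressed in nanoseconds scaled by 2**16,
# as defined in IEEE 1588. The helper name is hypothetical.
SCALE = 1 << 16


def correction_to_ns(raw: int) -> float:
    """Convert a raw correctionField value to nanoseconds."""
    return raw / SCALE


# The two ends of the range seen behind the switch.
for raw in (10_000_000, 20_000_000):
    print(f"{raw} raw units = {correction_to_ns(raw):.1f} ns")
```

With that scaling, 10 million raw units come out just above 150 ns and 20 million just above 300 ns, which is in the range one would expect for the residence time of a transparent clock.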
Frank Rysanek