Hi Jake, thanks for responding!
So this sounds like a problem of the clock switching around the TAI/UTC
conversion, and then ptp4l later tries to correct this by maxing the frequency
slew..?
Indeed, that is exactly what it looks like. But then, why does it switch around
the TAI/UTC conversion when PXE booting other servers? And later, when ptp4l
tries to correct this by maxing the frequency slew and is nearly synchronised
again, it starts logging clock check messages, after which it never recovers.
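To make the "36 seconds, then 70+" pattern concrete, here is a rough sketch of how one could check whether an observed offset sits near a whole multiple of the TAI/UTC step. The 36 s step is the value I see in the logs; the function name, the 2 s tolerance and the 70.5 s sample are assumptions for illustration only, since I only know the second jump as "70+ seconds".

```python
# Step size taken from the offset reported in this thread (36 s).
TAI_UTC_OFFSET_S = 36.0


def classify_offset(offset_s, step_s=TAI_UTC_OFFSET_S, tolerance_s=2.0):
    """Return (k, residual_s, looks_like_step).

    k is the whole multiple of step_s nearest to offset_s, residual_s is
    what is left after removing k * step_s, and looks_like_step is True
    when k is non-zero and the residual is within tolerance_s.
    tolerance_s is an arbitrary choice for illustration.
    """
    k = round(offset_s / step_s)
    residual_s = offset_s - k * step_s
    looks_like_step = k != 0 and abs(residual_s) <= tolerance_s
    return k, residual_s, looks_like_step


if __name__ == "__main__":
    # 36.0 is the reported first jump; 70.5 is a hypothetical stand-in
    # for the "70+ seconds" case; 0.000012 stands for a normally synced clock.
    for observed_s in (36.0, 70.5, 0.000012):
        k, residual_s, is_step = classify_offset(observed_s)
        print(f"{observed_s:>10.6f} s -> k={k}, residual={residual_s:+.6f} s, "
              f"step-like={is_step}")
```

If the second jump really is close to twice the step, that would hint at the same conversion being applied a second time rather than at ordinary drift, but this is only a way of looking at the numbers, not a diagnosis.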
clock check messages should only be happening if some external process is also
tuning the clock.
Exactly what I thought, but it makes no sense, because there is no other time
service running at all. Besides, once the clock check messages start to occur,
they come very fast: every master offset output is followed by almost 5 clock
check messages, as if a clock check message were logged every 0.2 seconds.
This seems like very odd behaviour.
And, like I said before, when I increase the tx_timestamp_timeout value to
200 ms there is no problem at all, and no clock check messages either. So I am
unsure what the problem could be.
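For reference, a minimal sketch of the setting that makes the problem go away here, assuming it goes in the [global] section of the configuration file passed to ptp4l with -f (the value is in milliseconds; the rest of the file is assumed unchanged):

```
[global]
tx_timestamp_timeout    200
```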
Jord
On 2 Aug 2018, at 01:55, Keller, Jacob E <jacob.e.kel...@intel.com> wrote:
-----Original Message-----
From: Jord Pool [mailto:jord.p...@outlook.com]
Sent: Wednesday, August 01, 2018 12:27 AM
To: Richard Cochran <richardcoch...@gmail.com>; Keller, Jacob E
<jacob.e.kel...@intel.com>; Cliff Spradlin <csprad...@waymo.com>; Chris Caudle
<ch...@chriscaudle.org>; Cliff Spradlin via Linuxptp-users
<linuxptp-us...@lists.sourceforge.net>
Subject: PXE Boot PTP Issues
Good morning!
Last week I described the issue with our PTP slave, which is also a PXE boot
server: ptp4l logs the message that says to increase tx_timestamp_timeout or
that the issue is likely a driver bug. Since then I have installed the latest
SourceForge e1000e driver (version 3.4.1.1), which does not solve the problem.
Now, as we said, it is not necessarily a driver bug. We assumed that the
increase in network traffic interrupts the PTP slave instance. However, I am
still left with a question.
Whenever the PTP slave is interrupted by the increase in network traffic sent
from the same server, it ends up with an offset of 36 seconds (TAI/UTC
conversion?). The PTP instance then tries to slew this down, but right when it
is nearly aligned again a ‘clock check’ message occurs and the offset shoots up
to 70+ seconds. It does not recover after that; it only returns clock check
messages every second and offsets which drift further away.
So this sounds like a problem of the clock switching around the TAI/UTC
conversion, and then ptp4l later tries to correct this by maxing the frequency
slew..?
This latter behaviour is what bugs us the most: PTP is unable to recover and
only drifts further away. How come PTP is interrupted by the increase in
network traffic, gains a 36 second offset, slews down, and then, when it has
nearly recovered, logs a clock check message, shoots up its offset and never
recovers?
clock check messages should only be happening if some external process is also
tuning the clock.
If anyone could help me out with this, that would be great! I have been working
on this for several days and can’t find a clue on how to solve it.
Jord
_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users