On Mon, Jan 29, 2018 at 05:17:48PM +0100, Thibaut BEYLER wrote:
> I recently investigated some servers on a site that struggle to use HW
> timestamping and spend most of their time in daemon/daemon mode instead of
> hardware/kernel. The weird part is that it occurs only with source 1; the
> client kernels are stable with hw/kernel on the less accurate sources.

It's a race condition between receiving the TX timestamp of the client request and receiving the response from the server. If the response is so fast that it is received before the TX timestamp of the request, the late timestamp will be ignored, as there will no longer be a pending request to which it could be applied.

What NIC do the clients have? I've seen this with an Intel card. It happened only for a minority of requests and they were all dropped due to failing test C, so overall it worked well with HW timestamping.

I'm not sure if it should be treated as a driver/HW issue or if applications should really be expected to get TX timestamps so late. I asked about this on the Intel development list some time ago, but didn't get a response.

I think it could be addressed in chrony by introducing a new timeout for timestamps, but I'd rather avoid the extra complexity.

As a workaround, you could try to add another switch between the server and clients to increase the peer delay. You could also try to lower the priority of the chronyd process to give it more time to get the timestamp. Someone reported it happened only when chronyd was running with a high priority.

-- 
Miroslav Lichvar
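
For illustration, the race described above can be modelled with a few lines of Python. This is only a rough sketch and not chrony's actual code: the function name, the "hardware"/"daemon" labels, and all delay values are made up for the example. It assumes that a TX timestamp is applied only while the request is still pending, and that the client otherwise falls back to the timestamp taken by the daemon.

```python
def tx_timestamp_source(tx_ts_delay_us, response_delay_us):
    """Return which TX timestamp a simplified client ends up using.

    Hypothetical model, not chrony's implementation. Both arguments are
    delays in microseconds, measured from the moment the request is sent.
    """
    # Order the two events by the time at which the client sees them.
    events = sorted([
        (tx_ts_delay_us, "tx_timestamp"),
        (response_delay_us, "response"),
    ])

    request_pending = True
    source = "daemon"  # fallback: software timestamp taken by the daemon

    for _, kind in events:
        if kind == "tx_timestamp":
            if request_pending:
                # Timestamp arrived in time and is applied to the request.
                source = "hardware"
            # Otherwise the timestamp is late: there is no pending request
            # left to apply it to, so it is ignored.
        else:
            # The response completes the exchange.
            request_pending = False

    return source


# Illustrative numbers only: the TX timestamp takes 40 us to reach the client.
fast_path = tx_timestamp_source(tx_ts_delay_us=40.0, response_delay_us=25.0)
slow_path = tx_timestamp_source(tx_ts_delay_us=40.0, response_delay_us=25.0 + 20.0)
print(fast_path, slow_path)
```

In this model the second call stands in for the extra-switch workaround: adding delay to the path lets the TX timestamp arrive before the response, so it is no longer discarded.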
