Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-24 Thread Miroslav Lichvar

On Wed, Jan 24, 2018 at 05:49:01PM +0100, Rob Janssen wrote:
> Miroslav Lichvar wrote:
> > 
> > The bug in the interleaved mode is a bit more subtle. The state is
> > updated from received packet, but only when one of the timestamps is
> > zero (i.e. it's the first packet of the association). This means two
> > ntpd 4.2.8p10 can interoperate, but I suspect the association will not
> > recover if there is a mismatch between the receive timestamps.
> > 
> 
> I have seen problems like that, and stopped using symmetric peering.
> As far as I know, just declaring "server" in each direction works OK
> (there is loop-detection code) and appears a lot more stable. It is
> probably better tested and debugged.

Yes, the complexity of the symmetric mode is ridiculous when compared
to the client/server mode.

As far as I know the only good use case for the symmetric mode is that
it can be used to push time to a server if it supports ephemeral
associations (chrony does not). I have some stratum-1 servers which
are behind NAT and their address is dynamic, and also some public
servers that are synchronized to them. If the public servers accepted
ephemeral associations, they could be specified as peers on the
stratum-1 servers and it would work without forwarding ports on the
router and updating a DNS record with the dynamic IP.

-- 
Miroslav Lichvar

-- 
To unsubscribe email chrony-users-requ...@chrony.tuxfamily.org 
with "unsubscribe" in the subject.
For help email chrony-users-requ...@chrony.tuxfamily.org 
with "help" in the subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.

Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-24 Thread Rob Janssen


Miroslav Lichvar wrote:


The bug in the interleaved mode is a bit more subtle. The state is
updated from received packet, but only when one of the timestamps is
zero (i.e. it's the first packet of the association). This means two
ntpd 4.2.8p10 can interoperate, but I suspect the association will not
recover if there is a mismatch between the receive timestamps.



I have seen problems like that, and stopped using symmetric peering.
As far as I know, just declaring "server" in each direction works OK
(there is loop-detection code) and appears a lot more stable. It is
probably better tested and debugged.

Rob



Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-24 Thread FUSTE Emmanuel

On 24/01/2018 at 13:45, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 05:42:22PM +0100, FUSTE Emmanuel wrote:
>> On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
>>> A similar thing seems to happen when trying to use the interleaved mode
>>> between two 4.2.8p10 ntpds. You said it worked for you before, so I
>>> assume one of the ntpds was an older version which didn't have this
>>> bug?
>> I have a platform with three ntpds in interleaved mode.
>> It was on 4.2.8p8.
>> They were upgraded today to 4.2.8p10 and are still working properly.
> You are right. My test was bad (it hit the bug with unsynchronized
> source).
>
> The bug in the interleaved mode is a bit more subtle. The state is
> updated from received packet, but only when one of the timestamps is
> zero (i.e. it's the first packet of the association). This means two
> ntpd 4.2.8p10 can interoperate, but I suspect the association will not
> recover if there is a mismatch between the receive timestamps.
>
> I'll send a bug report to the ntp maintainers.
>
> In the meantime, if you are willing to patch ntp, this should fix it:
>
> diff -up ntp-4.2.8p10/ntpd/ntp_proto.c.orig ntp-4.2.8p10/ntpd/ntp_proto.c
> --- ntp-4.2.8p10/ntpd/ntp_proto.c.orig 2018-01-24 13:35:16.611488502 +0100
> +++ ntp-4.2.8p10/ntpd/ntp_proto.c 2018-01-24 13:35:24.113505866 +0100
> @@ -1774,7 +1774,6 @@ receive(
>   peer->bogusorg++;
>   peer->flags |= FLAG_XBOGUS;
>   peer->flash |= TEST2;   /* bogus */
> - return; /* Bogus packet, we are done */
>   }
>   
Yes, it works!

Thank you.
Emmanuel.

Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-24 Thread Miroslav Lichvar

On Tue, Jan 23, 2018 at 05:42:22PM +0100, FUSTE Emmanuel wrote:
> On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
> > A similar thing seems to happen when trying to use the interleaved mode
> > between two 4.2.8p10 ntpds. You said it worked for you before, so I
> > assume one of the ntpds was an older version which didn't have this
> > bug?
> I have a platform with three ntpds in interleaved mode.
> It was on 4.2.8p8.
> They were upgraded today to 4.2.8p10 and are still working properly.

You are right. My test was bad (it hit the bug with unsynchronized
source).

The bug in the interleaved mode is a bit more subtle. The state is
updated from received packet, but only when one of the timestamps is
zero (i.e. it's the first packet of the association). This means two
ntpd 4.2.8p10 can interoperate, but I suspect the association will not
recover if there is a mismatch between the receive timestamps.

I'll send a bug report to the ntp maintainers.

In the meantime, if you are willing to patch ntp, this should fix it:

diff -up ntp-4.2.8p10/ntpd/ntp_proto.c.orig ntp-4.2.8p10/ntpd/ntp_proto.c
--- ntp-4.2.8p10/ntpd/ntp_proto.c.orig  2018-01-24 13:35:16.611488502 +0100
+++ ntp-4.2.8p10/ntpd/ntp_proto.c   2018-01-24 13:35:24.113505866 +0100
@@ -1774,7 +1774,6 @@ receive(
  peer->bogusorg++;
  peer->flags |= FLAG_XBOGUS;
  peer->flash |= TEST2;   /* bogus */
- return; /* Bogus packet, we are done */
  }

-- 
Miroslav Lichvar


Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread FUSTE Emmanuel

On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 02:44:56PM +0100, FUSTE Emmanuel wrote:
>> On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
>>> With the current versions, if you can avoid the issue with
>>> unsynchronized sources, they should interoperate, at least when their
>>> polling intervals match. If it doesn't work for you, I'd like to see a
>>> tcpdump output.
>> OK. I fixed the min/max polling interval to 5 for testing purposes.
>> Then I first restarted chrony and waited for it to sync to an online source.
>> Then I restarted ntp and took a capture.
>> I will send you all the data.
>>
>> NTP is stuck in unreachable state
>> Chrony is stuck with only one valid RX.
> Ok. I can reproduce this problem. It seems ntpd doesn't update its
> state in the interleaved mode when it receives a packet with an
> unexpected origin timestamp. There was a similar issue fixed for the
> basic mode a few ntp releases ago:
> https://bugs.ntp.org/show_bug.cgi?id=2952
>
> As chronyd doesn't switch to the interleaved mode until it's receiving
> valid responses and ntpd doesn't accept responses in the basic mode,
> they are stuck waiting forever on each other.
>
> A similar thing seems to happen when trying to use the interleaved mode
> between two 4.2.8p10 ntpds. You said it worked for you before, so I
> assume one of the ntpds was an older version which didn't have this
> bug?
>
Here is the data from the working 4.2.8p10 platform, which is composed of
w.w.w.w, y.y.y.y, z.z.z.z

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 29450  f414   yes   yes    ok candidate   reachable   1
  2 29451  f414   yes   yes    ok candidate   reachable   1
  3 29452  f31f   yes   yes    ok   outlier               1
  4 29453  961a   yes   yes  none  sys.peer    sys_peer   1
  5 29454  931d   yes   yes  none   outlier               1
ntpq> lpe
     remote           refid      st t when poll reach    delay   offset  jitter
===============================================================================
+x.x.x.x         .MRS.            1 u    5    8   377    0.363    0.038   0.030
+y.y.y.y         .PTP0.           1 s   25   64   377    0.071    0.017   0.035
-z.z.z.z         .PTP0.           1 s   45   64   376    0.058    0.041   0.044
*SHM(0)          .PTP0.           0 l    2    8   377    0.000   -0.017   0.005
-ntp-gps-1.thale .GPS.            1 u    4    8   377    5.031   -0.435   0.020
ntpq> rv 29451
associd=29451 status=f414 conf, authenb, auth, reach, sel_candidate, 1 event, reachable,
srcadr=y.y.y.y, srcport=123, dstadr=w.w.w.w,
dstport=123, leap=00, stratum=1, precision=-23, rootdelay=0.000,
rootdisp=1.099, refid=PTP0,
reftime=de11e3d4.1850d73b  Tue, Jan 23 2018 17:39:48.094,
rec=de11e3db.18563cd1  Tue, Jan 23 2018 17:39:55.095, reach=376,
unreach=0, hmode=1, pmode=1, hpoll=6, ppoll=6, headway=51, flash=00 ok,
keyid=112, offset=0.017, delay=0.071, dispersion=1.719, jitter=0.035,
xleave=0.024,
filtdelay=     0.09    0.10    0.07    0.12    0.13    0.11    0.11    0.16,
filtoffset=   -0.01   -0.02    0.02    0.06    0.05   -0.01   -0.04    0.00,
filtdisp=      0.00    0.96    1.95    2.94    3.90    4.89    5.88    6.86
ntpq> rv 29452
associd=29452 status=f31f conf, authenb, auth, reach, sel_outlier, 1 event, interleave_error,
srcadr=z.z.z.z, srcport=123, dstadr=w.w.w.w,
dstport=123, leap=00, stratum=1, precision=-23, rootdelay=0.000,
rootdisp=1.099, refid=PTP0,
reftime=de11e4c0.a5c3751c  Tue, Jan 23 2018 17:43:44.647,
rec=de11e4c7.a5ca043a  Tue, Jan 23 2018 17:43:51.647, reach=377,
unreach=0, hmode=1, pmode=1, hpoll=6, ppoll=6, headway=13, flash=00 ok,
keyid=113, offset=0.041, delay=0.058, dispersion=5.542, jitter=0.062,
xleave=0.014,
filtdelay=     0.11    0.14    0.11    0.11    0.10    0.08    0.06    0.08,
filtoffset=    0.03   -0.05   -0.02   -0.02   -0.03   -0.02    0.04    0.09,
filtdisp=      0.00    0.98    1.92    2.87    3.84    4.83    5.78    6.75
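
For anyone reading the reach values in the output above: ntpq prints reach in octal, and it is an 8-bit shift register in which each bit records whether a reply arrived for one of the last eight polls. A minimal Python sketch of that decoding follows; the function name decode_reach is made up for illustration and is not part of ntpq.

```python
def decode_reach(reach_octal):
    """Return the outcome of the last eight polls, most recent first.

    reach_octal is the string shown by ntpq, e.g. "377" or "376".
    """
    value = int(reach_octal, 8) & 0xFF
    # Bit 0 is the most recent poll; the register shifts left on every poll.
    return [bool((value >> bit) & 1) for bit in range(8)]


if __name__ == "__main__":
    for reach in ("377", "376", "000"):
        print(reach, decode_reach(reach))
```

So 377 means all of the last eight polls were answered, while 376 means only the most recent poll has not been answered (yet).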

Emmanuel.

Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread FUSTE Emmanuel

On 23/01/2018 at 16:58, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 02:44:56PM +0100, FUSTE Emmanuel wrote:
>> On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
>>> With the current versions, if you can avoid the issue with
>>> unsynchronized sources, they should interoperate, at least when their
>>> polling intervals match. If it doesn't work for you, I'd like to see a
>>> tcpdump output.
>> OK. I fixed the min/max polling interval to 5 for testing purposes.
>> Then I first restarted chrony and waited for it to sync to an online source.
>> Then I restarted ntp and took a capture.
>> I will send you all the data.
>>
>> NTP is stuck in unreachable state
>> Chrony is stuck with only one valid RX.
> Ok. I can reproduce this problem. It seems ntpd doesn't update its
> state in the interleaved mode when it receives a packet with an
> unexpected origin timestamp. There was a similar issue fixed for the
> basic mode a few ntp releases ago:
> https://bugs.ntp.org/show_bug.cgi?id=2952
>
> As chronyd doesn't switch to the interleaved mode until it's receiving
> valid responses and ntpd doesn't accept responses in the basic mode,
> they are stuck waiting forever on each other.
OK !
> A similar thing seems to happen when trying to use the interleaved mode
> between two 4.2.8p10 ntpds. You said it worked for you before, so I
> assume one of the ntpds was an older version which didn't have this
> bug?
I have a platform with three ntpds in interleaved mode.
It was on 4.2.8p8.
They were upgraded today to 4.2.8p10 and are still working properly.
As I use authentication in that case, I added authentication to the test platform.
Mutual authentication validates, but the two still get stuck as before.

Leap status     : Not synchronised
Version         : 4
Mode            : Symmetric active
Stratum         : 0
Poll interval   : 5 (32 seconds)
Precision       : -24 (0.00060 seconds)
Root delay      : 0.00 seconds
Root dispersion : 0.000656 seconds
Reference ID    : 494E4954 (INIT)
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.0 seconds
Peer delay      : 0.0 seconds
Peer dispersion : 0.0 seconds
Response time   : 0.0 seconds
Jitter asymmetry: +0.00
NTP tests       : 111 101
Interleaved     : Yes
Authenticated   : Yes
TX timestamping : Hardware
RX timestamping : Hardware
Total TX        : 17
Total RX        : 18
Total valid RX  : 2

associd=3540 status=e011 conf, authenb, auth, sel_reject, 1 event, mobilize,
srcadr=y.y.y.y, srcport=123, dstadr=x.x.x.x,
dstport=123, leap=11, stratum=16, precision=-24, rootdelay=0.000,
rootdisp=0.000, refid=INIT,
reftime=.  Thu, Feb  7 2036  7:28:16.000,
rec=de11e02d.60d2f07f  Tue, Jan 23 2018 17:24:13.378, reach=000,
unreach=10, hmode=1, pmode=0, hpoll=5, ppoll=5, headway=17,
flash=1606 pkt_bogus, pkt_unsync, peer_stratum, peer_dist, peer_unreach,
keyid=1, offset=0.000, delay=0.000, dispersion=15937.500, jitter=0.000,
xleave=0.028,
filtdelay=     0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00,
filtoffset=    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00,
filtdisp=   16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0
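
The hexadecimal values such as rec=de11e02d.60d2f07f in the ntpq output above are 64-bit NTP timestamps: 32 bits of seconds since 1900-01-01 UTC and 32 bits of binary fraction. A minimal Python sketch of the conversion follows, assuming NTP era 0 (dates before February 2036); the helper name ntp_hex_to_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds from 1900-01-01 00:00:00 UTC (era 0 assumed).
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_hex_to_utc(stamp):
    """Convert 'ssssssss.ffffffff' (hex seconds, hex fraction) to UTC."""
    seconds_hex, fraction_hex = stamp.split(".")
    seconds = int(seconds_hex, 16)
    # The fraction field is in units of 2**-32 seconds.
    micros = round(int(fraction_hex, 16) / 2**32 * 1_000_000)
    return NTP_EPOCH + timedelta(seconds=seconds, microseconds=micros)


if __name__ == "__main__":
    print(ntp_hex_to_utc("de11e02d.60d2f07f"))
```

For the rec value above this gives 16:24:13.378 UTC on 2018-01-23, which agrees with the 17:24:13.378 that ntpq printed in the +0100 local time zone.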


Emmanuel.

Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread Miroslav Lichvar

On Tue, Jan 23, 2018 at 02:44:56PM +0100, FUSTE Emmanuel wrote:
> On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
> > With the current versions, if you can avoid the issue with
> > unsynchronized sources, they should interoperate, at least when their
> > polling intervals match. If it doesn't work for you, I'd like to see a
> > tcpdump output.
> OK. I fixed the min/max polling interval to 5 for testing purposes.
> Then I first restarted chrony and waited for it to sync to an online source.
> Then I restarted ntp and took a capture.
> I will send you all the data.
> 
> NTP is stuck in unreachable state
> Chrony is stuck with only one valid RX.

Ok. I can reproduce this problem. It seems ntpd doesn't update its
state in the interleaved mode when it receives a packet with an
unexpected origin timestamp. There was a similar issue fixed for the
basic mode a few ntp releases ago:
https://bugs.ntp.org/show_bug.cgi?id=2952

As chronyd doesn't switch to the interleaved mode until it's receiving
valid responses and ntpd doesn't accept responses in the basic mode,
they are stuck waiting forever on each other.
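
The circular wait described above can be illustrated with a deliberately simplified toy model in Python. It does not model real NTP timestamps or the actual ntpd or chronyd code; the function name, the flag names and the rule that ntpd's replies only look valid to chronyd once ntpd has refreshed its per-peer state are all assumptions made for the sake of the illustration.

```python
def rounds_until_interleaved(max_rounds, ntpd_updates_on_mismatch):
    """Return the round in which both sides work in interleaved mode,
    or None if they never get there within max_rounds.

    Toy rules (assumptions, not real protocol logic):
      * chronyd switches to interleaved mode only after a valid reply.
      * ntpd (xleave enabled) accepts only interleaved-mode packets.
      * ntpd's replies are valid only once it has refreshed its state.
    """
    chrony_interleaved = False  # chronyd starts in basic mode
    ntpd_state_fresh = False    # ntpd has not stored usable state yet

    for round_no in range(1, max_rounds + 1):
        accepted = chrony_interleaved
        # Without the flag, ntpd drops the packet before touching its state.
        if accepted or ntpd_updates_on_mismatch:
            ntpd_state_fresh = True
        valid_reply = ntpd_state_fresh
        if accepted and valid_reply:
            return round_no
        if valid_reply:
            chrony_interleaved = True
    return None


if __name__ == "__main__":
    print(rounds_until_interleaved(100, ntpd_updates_on_mismatch=False))
    print(rounds_until_interleaved(100, ntpd_updates_on_mismatch=True))
```

With the flag off, neither side ever makes the first move, so the function returns None. With the flag on, ntpd refreshes its state despite the mismatch and the toy association comes up in the second round.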

A similar thing seems to happen when trying to use the interleaved mode
between two 4.2.8p10 ntpds. You said it worked for you before, so I
assume one of the ntpds was an older version which didn't have this
bug?

-- 
Miroslav Lichvar


Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread FUSTE Emmanuel

On 23/01/2018 at 13:00, Miroslav Lichvar wrote:
> On Tue, Jan 23, 2018 at 11:31:38AM +0100, FUSTE Emmanuel wrote:
>> When I try to do the same with ntpd on one side and chrony on the other,
>> things go bad.
>> At best, chrony got a working association with interleave status with
>> very long response time.
> A long response time up to the polling interval of the peer is normal
> in symmetric associations.
>
>> On the ntpd side, the association never works. The chrony server never
>> gets the "reach" state and the reach counter is stuck at zero.
> Have you tried the same configuration and the timing of restarts,
> between two ntpd servers? I suspect you would see some of the issues
> in this case too.
>
> There are probably multiple issues involved, which make it difficult
> to see what's going on. I'm aware of the following:
>
> - ntpd doesn't accept packets from peers that are not synchronized
>   (yet), so peers have to be configured with other sources in order
>   for the symmetric association (in both basic and interleaved modes)
>   to start. See https://bugs.ntp.org/show_bug.cgi?id=3445.
> - interleaved mode in ntpd works only when the peers use the same
>   polling interval. If they have the same minpoll and maxpoll, but
>   minpoll != maxpoll, they should in theory both get to the maxpoll
>   if the association doesn't work, but there may be a bug that
>   prevents that.
> - chrony switches to the basic mode when the polling intervals don't
>   match, but ntpd doesn't accept responses in the basic mode if the
>   interleaved mode is enabled
>
>> chrony 3.2
>> ntp-4.2.8p8, ntp-4.2.8p10
>>
>> Could I normally expect xleave interoperability between chrony and ntpd,
>> or is it something too "implementation specific"?
> With the current versions, if you can avoid the issue with
> unsynchronized sources, they should interoperate, at least when their
> polling intervals match. If it doesn't work for you, I'd like to see a
> tcpdump output.
OK. I fixed the min/max polling interval to 5 for testing purposes.
Then I first restarted chrony and waited for it to sync to an online source.
Then I restarted ntp and took a capture.
I will send you all the data.

NTP is stuck in unreachable state
Chrony is stuck with only one valid RX.
>
> Please note that the symmetric mode has some security issues and it's
> generally recommended to use the client/server mode instead. Even if
> authentication is enabled, it is possible to break a symmetric
> association by replaying old packets. (chrony has a partial protection
> against this attack, but it works only in the basic mode when the
> polling intervals match and there are no packets with timestamps from
> the future that could be replayed. It's too fragile, don't rely on it!)
Yes, I know. It is only used on "trusted" LAN segments and/or to try to
interoperate with ntpd xleave.
>
> It is possible that support for symmetric associations will be dropped
> from chrony in future.
>
I am only using it to transition from ntpd to chrony, so it will not be missed.
I hope my clock vendor will someday move from ntpd to something
else (chrony) to get good xleave support (and much more).
In any case, I mainly use these clocks with PTP, so the NTP part only affects
fail-over scenarios.

Emmanuel.

Re: [chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread Miroslav Lichvar

On Tue, Jan 23, 2018 at 11:31:38AM +0100, FUSTE Emmanuel wrote:
> When I try to do the same with ntpd on one side and chrony on the other, 
> things go bad.
> At best, chrony got a working association with interleave status with 
> very long response time.

A long response time up to the polling interval of the peer is normal
in symmetric associations.

> On the ntpd side, the association never works. The chrony server never
> gets the "reach" state and the reach counter is stuck at zero.

Have you tried the same configuration and the timing of restarts,
between two ntpd servers? I suspect you would see some of the issues
in this case too.

There are probably multiple issues involved, which make it difficult
to see what's going on. I'm aware of the following:

- ntpd doesn't accept packets from peers that are not synchronized
  (yet), so peers have to be configured with other sources in order
  for the symmetric association (in both basic and interleaved modes)
  to start. See https://bugs.ntp.org/show_bug.cgi?id=3445.
- interleaved mode in ntpd works only when the peers use the same
  polling interval. If they have the same minpoll and maxpoll, but
  minpoll != maxpoll, they should in theory both get to the maxpoll
  if the association doesn't work, but there may be a bug that
  prevents that.
- chrony switches to the basic mode when the polling intervals don't
  match, but ntpd doesn't accept responses in the basic mode if the
  interleaved mode is enabled
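
Since the second and third points hinge on the polling intervals matching, here is a small Python sketch of the arithmetic. minpoll and maxpoll are exponents of two in seconds, so "minpoll 5 maxpoll 10" lets each side move anywhere between 32 and 1024 seconds. The helper names are made up for illustration and are not part of chrony or ntpd.

```python
def poll_seconds(exponent):
    """Polling interval in seconds for an NTP poll exponent."""
    return 2 ** exponent


def possible_intervals(minpoll, maxpoll):
    """All polling intervals (seconds) a peer may use in this range."""
    return [poll_seconds(e) for e in range(minpoll, maxpoll + 1)]


def intervals_match(local_poll, remote_poll):
    """True when both peers currently use the same poll exponent."""
    return local_poll == remote_poll


if __name__ == "__main__":
    print(poll_seconds(5))
    print(possible_intervals(5, 10))
    print(intervals_match(5, 6))
```

Setting minpoll equal to maxpoll on both sides leaves only one possible interval, which removes this particular source of mismatch while testing.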

> chrony 3.2
> ntp-4.2.8p8, ntp-4.2.8p10
> 
> Could I normally expect xleave interoperability between chrony and ntpd,
> or is it something too "implementation specific"?

With the current versions, if you can avoid the issue with
unsynchronized sources, they should interoperate, at least when their
polling intervals match. If it doesn't work for you, I'd like to see a
tcpdump output.

Please note that the symmetric mode has some security issues and it's
generally recommended to use the client/server mode instead. Even if
authentication is enabled, it is possible to break a symmetric
association by replaying old packets. (chrony has a partial protection
against this attack, but it works only in the basic mode when the
polling intervals match and there are no packets with timestamps from
the future that could be replayed. It's too fragile, don't rely on it!)

It is possible that support for symmetric associations will be dropped
from chrony in future.

-- 
Miroslav Lichvar


[chrony-users] chrony and ntpd xleave interoperability

2018-01-23 Thread FUSTE Emmanuel

Hello,

First, my apologies for the mix-up on chrony-dev when I tried
to subscribe to chrony-users...

I'm doing some tests to replace ntpd with chrony on some server groups.
These servers use a peer association with the interleave option.

When I try to do the same with ntpd on one side and chrony on the other,
things go bad.
At best, chrony gets a working association with interleave status and a
very long response time.
On the ntpd side, the association never works. The chrony server never
gets the "reach" state and the reach counter is stuck at zero.

As soon as I remove the xleave option on the ntpd side, everything
immediately starts to work as expected.

ntpd :
peer y.y.y.y minpoll 5 maxpoll 10 xleave
restrict y.y.y.y notrap nomodify noquery

chrony :
peer x.x.x.x xleave minpoll 5 maxpoll 10
allow x.x.x.0/24

Yesterday I removed the xleave option on the ntpd side.
All was good on both sides.
So I tried to reactivate the xleave option
-> Boom, it works!!!

I restarted chrony
-> ntpd logged "receive: KoD packet from 192.54.145.235 has a zero org
or rec timestamp. Ignoring."
and four minutes later "y.y.y.y 8613 83 unreachable"
The previously working association is now dead.
No working association from chrony.

So I restarted ntpd
-> chrony starts to see the other server (ntpdata) but never reaches a good
state.
-> ntpd does not reach the "reach" state.

Removed xleave from ntpd and restarted
-> all is still stuck
Restarted chrony
-> ntpd starts to see the chrony server, the reach counter increments, and it
reaches a "backup" condition. All is good on the chrony side.

Re-added the xleave option on the ntpd side.
The unreach counter increments, flash=1606 so pkt_bogus...
On the chrony side, "Total valid RX" no longer increments...
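
The flash value is a hexadecimal bit mask, one bit per sanity test that failed. A minimal Python sketch of the decoding follows. The bit names are the ones I recall from the ntpq documentation for the flash status word, so treat the table as an assumption to check against your ntp version; decode_flash is a made-up helper name.

```python
# Bit mask -> name, as recalled from the ntpq "flash status word" table.
FLASH_BITS = [
    (0x0001, "pkt_dup"),
    (0x0002, "pkt_bogus"),
    (0x0004, "pkt_unsync"),
    (0x0008, "pkt_denied"),
    (0x0010, "pkt_auth"),
    (0x0020, "pkt_stratum"),
    (0x0040, "pkt_header"),
    (0x0080, "pkt_autokey"),
    (0x0100, "pkt_crypto"),
    (0x0200, "peer_stratum"),
    (0x0400, "peer_dist"),
    (0x0800, "peer_loop"),
    (0x1000, "peer_unreach"),
]


def decode_flash(flash_hex):
    """Return the names of all tests flagged in a flash value like '1606'."""
    value = int(flash_hex, 16)
    return [name for mask, name in FLASH_BITS if value & mask]


if __name__ == "__main__":
    print(decode_flash("1606"))
```

For 1606 this yields pkt_bogus, pkt_unsync, peer_stratum, peer_dist and peer_unreach, so the bogus-packet test is only one of several that fail here.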

I'm lost.

chrony 3.2
ntp-4.2.8p8, ntp-4.2.8p10

Could I normally expect xleave interoperability between chrony and ntpd,
or is it something too "implementation specific"?

Emmanuel.
