On 2015-07-01 10:18, Nuno Pereira wrote:
Following last night's leap second, we had some issues with our NTP servers,
especially on clients with 4 servers configured, but not on clients with 1
source configured.

We have 2 types of client configuration (besides the one on the NTP server):

Config 1 (clients with access to the external network):
*       2 NTP servers in the LAN, configured with "iburst prefer";
*       2 external NTP servers, configured with "iburst".

Config 2 (clients without access to the external network):
*       1 NTP server in the LAN, configured with "iburst prefer" or "iburst"
(with a single source, "prefer" makes no difference).

The 2 external servers configured had problems with the leap second, showing
a one-second offset after it happened, while the LAN servers had no issues
(they had a leap file, and reported leap_armed within the 24 hours before the
event).

This led to something like this being reported by "ntpq -p" (I don't have
the actual output saved):

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
xlan_server_1    160.45.10.8      2 u 1013 1024  377    1.019   -0.483   0.687
xlan_server_2    160.45.10.8      2 u  922 1024  377    1.042   -0.499   0.665
xext_server_1    194.117.9.137    2 u  384 1024  377    3.360 1002.688   0.790
xext_server_2    194.117.9.139    2 u  388 1024  377    3.360 1001.582   0.833

That is, all 4 were marked as falsetickers (the leading "x" on each line).

Meanwhile, on the clients with no access to the external network, which have
only 1 server to sync to (lan_server_1), things worked with no problem.

From what I've read on this list and in the docs, the best configuration is to
have 4 servers, and that's what CentOS and Debian ship by default, but this
incident brings back the even-number-of-servers problem that can arise with
just 2.

How can 4 be worse than 1?

Do I have to go to a 5-server configuration in order to avoid this? Or go
for 4 servers in the LAN?

I'm having difficulty convincing my colleagues that we should configure 4
servers (they think that is excessive, and that the best is to have just
one), and now I have this issue.

See the select and prefer doc pages.
To get sync, you need a majority clique, with more truechimers than falsetickers,
so with two of each, you don't get a majority, and none are considered reliable.
That is why pool servers are recommended as backup where external access is
available, in case some local sources go down or turn falseticker.
At least three sources, internal or external, are preferable, to allow a majority
clique even if one source goes down or turns falseticker; use more if you need
to allow for possible network issues.
Also note that prefer means only that the marked source, if it is a survivor,
will be used for system offset and jitter stats, rather than the combine
algorithm output.
With more than one surviving preferred source, implementation details decide
which wins.
It is intended mainly for use with local device drivers, as well as to mark the
source that provides seconds numbering for PPS sources.
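
As a rough illustration of the majority rule above, here is a minimal Python
sketch. It is not ntpd's actual selection code: it treats each source as an
interval of offset plus or minus a half width, where the half width is an
assumed stand-in for the root distance (the 5 ms values are made up), and the
name majority_clique is hypothetical. The offsets are the ones from the
"ntpq -p" output quoted above.

```python
def majority_clique(sources):
    """Return the names in the largest group of overlapping intervals,
    or [] if that group is not a strict majority of all sources.

    sources maps name -> (offset_ms, half_width_ms); half_width_ms is an
    assumed stand-in for the root distance, not a value ntpq reports here.
    """
    events = []
    for name, (offset, half_width) in sources.items():
        events.append((offset - half_width, 0, name))  # interval opens
        events.append((offset + half_width, 1, name))  # interval closes
    events.sort()  # at equal points, opens (0) sort before closes (1)

    active, best = set(), set()
    for _, kind, name in events:
        if kind == 0:
            active.add(name)
            if len(active) > len(best):
                best = set(active)  # largest overlapping group so far
        else:
            active.discard(name)

    # Truechimers must strictly outnumber everything else.
    return sorted(best) if 2 * len(best) > len(sources) else []


# Offsets (ms) from the ntpq output above; 5.0 ms half widths are assumed.
four_sources = {
    "lan_server_1": (-0.483, 5.0),
    "lan_server_2": (-0.499, 5.0),
    "ext_server_1": (1002.688, 5.0),
    "ext_server_2": (1001.582, 5.0),
}
one_source = {"lan_server_1": (-0.483, 5.0)}
```

With four_sources the function returns an empty list, because two agree and
two agree elsewhere, so neither pair is a majority. With one_source it returns
that single server, which matches the single-server clients staying in sync.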

You may want to consider adding all LAN sources to all clients, adding enough
LAN sources to provide an odd number, adding pool servers as backup to the
external servers, and dropping prefer from the LAN sources to allow the combine
algorithm to compute stats.
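
A client ntp.conf along those lines might look roughly like the fragment
below. This is only a sketch: lan_server_3 is a hypothetical third LAN server
added to make the LAN count odd, the other host names are the placeholders
from the ntpq output above, and the "pool" directive assumes a reasonably
recent ntpd (on older versions, use "server" lines for individual pool hosts
instead).

```
# LAN sources: every LAN server on every client, without "prefer"
server lan_server_1 iburst
server lan_server_2 iburst
# hypothetical third LAN server, to give an odd number of LAN sources
server lan_server_3 iburst

# External servers (only on clients with external access)
server ext_server_1 iburst
server ext_server_2 iburst

# Pool servers as backup to the external servers
pool pool.ntp.org iburst
```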

--
Take care. Thanks, Brian Inglis
_______________________________________________
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions
