Joachim Schrod wrote:
Hello,
I would like to pose a few questions on architecture / best practice
on NTP setups for small and medium companies. I read the documentation
and the Wiki, I also googled, but didn't find a satisfying answer. I'm
willing to update the NTP Wiki with a HOW-TO text that results from
this discussion.
The situation:
-- Let's assume a company with 10 to max. 100 computer systems.
Furthermore, let's just talk about Unix systems, for now, and
let's assume that at least five of them are available.
-- There is only one site. All computers are on the same LAN.
-- The company has only one Internet connection, with a typical SLA of
97.5% availability over the year. (For example, that's the SLA of
T-Com for their basic CompanyConnect product here in Germany.) The
worst-case outage of 9 days can be ignored, but we have to cope
with outages of several hours' length.
-- The company has no requirements for extremely accurate
time-synchronization. They run the usual bunch of applications:
Databases, ERP systems, Office systems, and other applications
that access time with a granularity of one second.
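
The 9-day worst case quoted for the 97.5% SLA above can be checked with a short calculation. This is only a sketch: the function name `max_downtime_days` is made up for illustration, and it assumes the SLA is measured over a 365-day year with all permitted downtime falling in one stretch.

```python
def max_downtime_days(availability_percent: float, period_days: float = 365.0) -> float:
    """Total downtime (in days) that an availability SLA still permits."""
    return (1.0 - availability_percent / 100.0) * period_days


# Assumption: a 365-day measurement period, as "over the year" suggests.
downtime = max_downtime_days(97.5)
print(f"{downtime:.3f} days")        # 9.125 days
print(f"{downtime * 24:.0f} hours")  # 219 hours
```

So "9 days" is the rounded total that the SLA allows per year; outages of several hours fit into that budget many times over.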
I think this is a context that is common to many installations. (Well,
I have seen many such environments. :-) My personal experience is only
in big installations with 1000s of systems and redundant Internet
connections; but I have been asked about such situations a few times
in the past. (The last inquiry was a few days ago, and triggered this
posting.)
The first few questions are about selection of time servers:
How many, and what is their peer structure?
-- I assume that the company should use the NTP server pool, as it's
not a large company with 1000s of computers.
You can use pool servers but I would not use them exclusively.
-- How many timeservers on the LAN that are accessed by clients?
Looking at the available documentation, I would recommend four
servers. (This might mean that many of the Unix systems suddenly
are timeservers.) Or would three be sufficient? One server is
surely not sufficient, as an outage of that server would endanger
the whole time synchronization.
I.e., is peering between three servers sufficient to handle outage
of one server until the repair is done, or does one need four servers
to do that properly?
(An answer may depend on the connection of the timeservers to the
pool, as asked in the next question.) The Wiki recommends four
servers, but I have seen several places where three servers are
deemed sufficient. What's best practice?
Ideally, each client should configure four servers.
a. With a single server the client must follow it, right or wrong.
b. Two servers is the worst possible configuration.
c. Three servers degenerates too easily to the two server case.
d. Four servers protect you from the failure of any one server.
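
The reasoning in points a. to d. can be illustrated with a deliberately simplified majority-vote model. It is not ntpd's actual selection algorithm; it only assumes that a client can reject a bad server when the good servers form a strict majority of the ones it can reach. The function names are hypothetical.

```python
def tolerated_falsetickers(reachable: int) -> int:
    """Bad servers that a strict majority of good servers can still outvote."""
    if reachable < 1:
        return 0
    return (reachable - 1) // 2


def survives_one_outage(configured: int) -> bool:
    """With one server down, can the client still outvote one bad server?"""
    return tolerated_falsetickers(configured - 1) >= 1


for n in range(1, 5):
    print(f"{n} servers: tolerates {tolerated_falsetickers(n)} bad, "
          f"still protected after one outage: {survives_one_outage(n)}")
```

Under this model, three servers tolerate one bad server only while all three are reachable; four is the smallest number that keeps that protection after one server fails.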
-- Connection to the NTP pool:
-- Either all company timeservers access the pool,
-- or one of the timeservers accesses the pool, and the others
synchronize to it,
-- or there is an additional timeserver that accesses the pool
and the company timeservers synchronize to this special server.
The clients don't use this special server.
Since there is only one Internet connection, and since there are
no separate network paths to the pool servers, I have to ask if
it's still reasonable to have several timeservers synchronizing to
the pool. OTOH, if there is only one pool-connected system, what
to do in case of an outage of that system? (Probably promote one of
the other servers to be the Internet-facing system.)
I have no idea about further advantages or disadvantages of these
three design possibilities. I assume that this has to be answered
in conjunction with my next question, on peering. (I am setting aside
firewall and DMZ considerations for the moment; they might favor the
third solution.)
-- Peering: Which servers peer to each other?
-- If all company timeservers access the pool, I think they are
all peers.
Ideally, each server in a peer group should have at least one unique
source of time! Thus four peering servers would need a total of seven
upstream servers.
-- But if only one system accesses the pool, does this system
also peer with the others that synchronize to it? That hasn't
been clear from the documentation. On the Wiki, it says that
one should peer all timeservers; but also ones that are
different in the stratum hierarchy?
-- Internet connection outages: Just let them happen, or use
undisciplined local clock on stratum 10 as backup on the
timeservers?
AFAIK, undisciplined local clocks can cause havoc when the time
strays too far away from the reference time source. Googling that
question got several potential answers, therefore: is it best
practice to use 127.127.1.0 as a backup for the case that no outside
source of synchronized time is available?
Is there a design decision for the server setup that I missed?
I'm not certain that you need multiple peering servers. For short
outages a server can serve its local clock as "the clock of last
resort". It will drift but should be usable for several hours. Can you
tolerate an error of, say, 100 milliseconds? Everybody would be in
synch but could differ from UTC by up to 100 milliseconds.
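
To get a feeling for "usable for several hours", the time until a free-running clock exceeds a given error budget can be estimated as below. This is a rough sketch that assumes a constant drift rate; the ppm values are illustrative guesses, not measurements, and `hours_until_error` is a hypothetical helper.

```python
def hours_until_error(budget_ms: float, drift_ppm: float) -> float:
    """Hours a clock drifting at a constant rate needs to accumulate budget_ms."""
    seconds = (budget_ms / 1000.0) / (drift_ppm * 1e-6)
    return seconds / 3600.0


# Assumed residual drift rates in parts per million (illustrative only).
for ppm in (1.0, 5.0, 20.0):
    hours = hours_until_error(100.0, ppm)
    print(f"{ppm:5.1f} ppm -> {hours:6.2f} h until 100 ms error")
```

The real drift of a given machine depends on its oscillator and temperature, so the figures only indicate the order of magnitude.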
It's possible to have a configuration that will survive anything but a
direct hit with a nuclear weapon. Very few organizations need this
level of reliability. I'd be inclined to start small and learn from
experience.
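
As a starting point for the "start small" approach, a single timeserver with the local clock as "clock of last resort" might be configured roughly as the following sketch renders it. The `server` and `fudge` directives and the 127.127.1.0 local clock address are the ones mentioned in this thread; the choice of pool hostnames, the `iburst` option, and the helper name `render_server_conf` are assumptions for illustration, not a recommended production setup.

```python
def render_server_conf(upstreams, local_stratum=10):
    """Render a minimal ntp.conf for a timeserver with a local clock fallback."""
    lines = ["# upstream time sources (assumed: public pool servers)"]
    lines += [f"server {host} iburst" for host in upstreams]
    lines += [
        "# undisciplined local clock, used only when no upstream is reachable",
        "server 127.127.1.0",
        f"fudge 127.127.1.0 stratum {local_stratum}",
    ]
    return "\n".join(lines) + "\n"


conf = render_server_conf(
    ["0.pool.ntp.org", "1.pool.ntp.org", "2.pool.ntp.org", "3.pool.ntp.org"]
)
print(conf)
```

The high stratum value keeps the local clock from being preferred while any real upstream source is reachable.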
-- Client configuration: Specific servers, or multicast?
Now we have a bunch of timeservers in the company. What is best
practice: That clients are configured to use these servers
specifically, or that multicast mode is used?
Or should one try a manycast configuration?
If one uses multicast or manycast, does this imply that one needs
to establish key-based authentication between servers and clients?
Such small companies usually have no PKI in place, so would this
mean distributing shared secret keys during setup?
Or use Autokey, as explained in the Wiki?
I believe that authentication is not an absolute requirement. All
authentication does is to attempt to guarantee that the servers really
are who they claim to be. If traceability to some standard is a
requirement then you probably need it. If you see no need to guard
against somebody pretending to be someone he's not, you can probably
live without authentication.
Broadcast or multicast will cut down on some of the network chatter.
AFAIK that's the only advantage. The load on the server is not really
significant; a server can handle a thousand clients without working too
hard.
Sorry for the long post. I hope to get some answers, and maybe we can
add those answers to
http://ntp.isc.org/bin/view/Support/DesigningYourNTPNetwork. (Or make a
different page, with specific step-wise explanation for small/medium
company setups.) I think that page is already very good, but would be
further improved with such information.
Best,
Joachim