On Monday 15 October 2007 16.00:24 Tim Shoppa wrote:

> IIRC, just a couple of years ago with circa 100 servers in the pool ...

Yep, that was my original monitoring system.  The script basically did 
ntpq -pn | grep stuff (well, in Perl, not shell, but you get the idea) and 
interpreted the first character of this output.
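
For anyone who wants the idea spelled out, here is a rough sketch in Python (not the original Perl, which isn't reproduced here). The function names and sample addresses are made up for illustration. The tally characters are the ones ntpq prints in the first column of its peers billboard.

```python
import subprocess

# Meaning of the first character of each peer line in `ntpq -pn` output.
TALLY = {
    " ": "reject",
    "x": "falseticker",
    ".": "excess",
    "-": "outlier",
    "+": "candidate",
    "#": "backup",
    "*": "sys.peer",
    "o": "pps.peer",
}


def run_ntpq():
    """Return the raw text of `ntpq -pn` (requires a local ntpd)."""
    return subprocess.run(
        ["ntpq", "-pn"], capture_output=True, text=True, check=True
    ).stdout


def classify_peers(ntpq_output):
    """Map each remote address to the status implied by its tally character."""
    peers = {}
    for line in ntpq_output.splitlines():
        # Skip blank lines, the column header and the ===== separator.
        if not line.strip() or line.startswith("="):
            continue
        if line.lstrip().startswith("remote"):
            continue
        fields = line[1:].split()
        if fields:
            peers[fields[0]] = TALLY.get(line[0], "unknown")
    return peers


# Hypothetical sample output (documentation addresses, invented numbers).
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.1       .GPS.            1 u   33   64  377    0.512    0.021   0.012
+192.0.2.2       192.0.2.1        2 u   12   64  377   10.100   -0.500   0.300
x192.0.2.3       192.0.2.1        2 u   40   64  377   25.000  950.000   4.000
 192.0.2.4       .INIT.          16 u    -   64    0    0.000    0.000   0.000
"""

result = classify_peers(SAMPLE)
```

On a live box you would feed it `classify_peers(run_ntpq())` and flag whatever comes back as "falseticker" or "reject".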

> we 
> discovered that ntpd's selection algorithm doesn't work well with a large
> number of servers.

OTOH I didn't know about noselect then.  But I suspect with >1000 servers, 
the CPU overhead of ntpd might become noticeable even with minpoll 12 for 
the servers to be monitored.
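
A back-of-the-envelope sketch of the polling side of that: ntpd poll values are exponents of two in seconds, so minpoll 12 means at least 4096 s between polls of one server. The 1000 is just the figure from above, and this counts only outgoing polls, not whatever the selection algorithm costs per peer.

```python
MINPOLL_EXPONENT = 12   # "minpoll 12" from the text
SERVERS = 1000          # assumed pool size, for illustration only

# Shortest interval between two polls of the same server, in seconds.
interval_s = 2 ** MINPOLL_EXPONENT

# Upper bound on the aggregate poll rate across all monitored servers.
polls_per_second = SERVERS / interval_s

print(f"{interval_s} s per server, at most {polls_per_second:.3f} polls/s overall")
```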

We did think about a distributed system (I think we even had three or so 
monitoring ntpd instances in operation for a short time), but I soon switched 
to the sntp-based monitoring system that has more or less survived until today.

Here's my regular microsecond junkie alert: if the pool servers are good 
enough to sync a few million machines (by sntp or ntpd), then the project 
does what it should.  Everything else is icing on the cake.

Now the operator of 63.240.161.99 should be notified, but so far such 
servers have been rare enough to be manageable by this manual process - my 
guess is a true ntp based monitoring system is more effort than it's worth.

(Hmmm.  Servers that are often judged falsetickers by ntpd.  Rings a bell.  
Might be an sntp server instead of an ntp server; these are often running a 
piece of software proposed by a group of people making a to-remain-unnamed 
free software operating system with a very liberal license. :-)

Might be just a bad motherboard or network, too.

cheers
-- vbi



-- 
Linden Lab chose Debian Linux because the software is suited to
scaling  massively with a small IT staff.
        -- Linden Lab CTO Cory Ondrejka in InformationWeek

