Bug#336230: marked as done (NTPD not working after Debian upgrade (V3.0 -> V3.1))

Debian Bug Tracking System Tue, 10 Apr 2007 13:26:29 -0700

Your message dated Tue, 10 Apr 2007 22:15:59 +0200
with message-id <[EMAIL PROTECTED]>
and subject line Bug#336230: NTPD not working after Debian upgrade (V3.0 -> 
V3.1)
has caused the attached Bug report to be marked as done.


This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---

Package: ntp-server
Version: 1:4.2.0a+stable-2sarge1
architecture: i386

activy:~# uname -a
Linux activy 2.4.27-2-686 #1 Mon May 16 17:03:22 JST 2005 i686 GNU/Linux

activy:~# ls -l /lib/libc.so.6
lrwxrwxrwx  1 root root 13 Oct 10 07:24 /lib/libc.so.6 -> libc-2.3.2.so


Installed ntp packages: ntp, ntp-doc, ntp-server, ntp-simple, ntpdate.

PROBLEM: after the system upgrade, ntpd starts as 2 processes at boot
time, but both die within 1-2 minutes.

From "ps fax":

  907 ?        SLs    0:00 ntpd
  912 ?        S      0:00  \_ ntpd


The process numbers in the following log extract do not match, as
the two snapshots were taken on different occasions ...

Log example:

19 Oct 06:32:42 ntpd[425]: frequency initialized 79.130 PPM from /var/lib/ntp/ntp.drift
19 Oct 06:32:52 ntpd[458]: signal_no_reset: signal 17 had flags 4000000
19 Oct 06:32:54 ntpd[458]: signal_no_reset: signal 14 had flags 4000000
19 Oct 06:33:24 ntpd[458]: parent died before we finished, exiting
20 Oct 06:36:20 ntpd[425]: frequency initialized 79.130 PPM from /var/lib/ntp/ntp.drift
20 Oct 06:36:33 ntpd[458]: signal_no_reset: signal 17 had flags 4000000
20 Oct 06:36:35 ntpd[458]: signal_no_reset: signal 14 had flags 4000000
20 Oct 06:37:05 ntpd[458]: parent died before we finished, exiting


From the daemon log file:

Oct 20 19:40:54 activy ntpd[907]: ntpd [EMAIL PROTECTED]:4.2.0a+stable-2-r Fri Aug 26 10:30:12 UTC 2005 (1)
Oct 20 19:40:54 activy ntpd[907]: signal_no_reset: signal 13 had flags 4000000
Oct 20 19:40:54 activy ntpd[907]: precision = 2.000 usec
Oct 20 19:40:54 activy ntpd[907]: Listening on interface wildcard, 0.0.0.0#123
Oct 20 19:40:54 activy ntpd[907]: Listening on interface lo, 127.0.0.1#123
Oct 20 19:40:54 activy ntpd[907]: Listening on interface eth0, 192.168.192.77#123
Oct 20 19:40:54 activy ntpd[907]: kernel time sync status 0040


Applying "strace -f" on the startup script delivers (tail of output
only, as it was very, very long):

[pid   573] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   573] sigreturn()                 = ? (mask now [RTMIN])
[pid   573] gettimeofday({1130269798, 250610}, NULL) = 0
[pid   573] gettimeofday({1130269798, 251333}, NULL) = 0
[pid   573] gettimeofday({1130269798, 252103}, NULL) = 0
[pid   573] gettimeofday({1130269798, 253088}, NULL) = 0
[pid   573] time(NULL)                  = 1130269798
[pid   573] write(7, "53668 71398.253 127.127.1.0 9014"..., 81) = 81
[pid   573] select(7, [4 5 6], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
[pid   573] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   573] sigreturn()                 = ? (mask now [RTMIN])
[pid   573] select(7, [4 5 6], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
[pid   573] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   573] sigreturn()                 = ? (mask now [RTMIN])
[pid   573] gettimeofday({1130269800, 250588}, NULL) = 0
[pid   573] sendto(6, "\343\0\6\366\0\0\0\0\0\0\0\6INIT\0\0\0\0\0\0\0\0\0\0\0"..., 48, 0, {sa_family=AF_INET, sin_port=htons(123), sin_addr=inet_addr("161.53.30.3")}, 16) = 48
[pid   573] select(7, [4 5 6], NULL, NULL, NULL) = 1 (in [6])
[pid   573] gettimeofday({1130269800, 322973}, NULL) = 0
[pid   573] select(7, [4 5 6], NULL, NULL, {0, 0}) = 1 (in [6], left {0, 0})
[pid   573] recvfrom(6, "$\2\6\353\0\0\1\222\0\0\v:\2415\1\2\307\t\10\252\231Y\255"..., 1092, 0, {sa_family=AF_INET, sin_port=htons(123), sin_addr=inet_addr("161.53.30.3")}, [16]) = 48
[pid   573] select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid   573] gettimeofday({1130269800, 326593}, NULL) = 0
[pid   573] gettimeofday({1130269800, 327440}, NULL) = 0
[pid   573] gettimeofday({1130269800, 328233}, NULL) = 0
[pid   573] time(NULL)                  = 1130269800
[pid   573] write(7, "53668 71400.328 161.53.30.3 9014"..., 82) = 82
[pid   573] select(7, [4 5 6], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
[pid   573] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid   573] sigreturn()                 = ? (mask now [RTMIN])
[pid   573] gettimeofday({1130269801, 249125}, NULL) = 0
[pid   573] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 573 detached
--- SIGALRM (Alarm clock) @ 0 (0) ---
<... rt_sigsuspend resumed> )           = -1 EINTR (Interrupted system call)
alarm(30)                               = 0
sigreturn()                             = ? (mask now [RTMIN])
getppid()                               = 1
time(NULL)                              = 1130269825
getpid()                                = 574
write(8, "25 Oct 21:50:25 ntpd[574]: paren"..., 67) = 67
munmap(0x40019000, 4096)                = 0
exit_group(0)                           = ?

Process 574 detached


The first process [573] suffers a segmentation fault, which causes the
second [574] to die as well. Again, the process numbers do not match the
preceding examples.
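
A minimal sketch, in Python, of how the crashing process could be picked
out of a very long "strace -f" log without reading it by eye. It assumes
the "[pid N] --- SIGNAME (...) ---" line layout shown in the excerpt
above; the name fatal_signals and the choice of crash signals are
illustrative assumptions, not part of strace or ntpd.

```python
import re

# Matches strace -f signal-delivery lines such as:
#   [pid   573] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
SIGNAL_LINE = re.compile(r"^\[pid\s+(\d+)\]\s+---\s+(SIG[A-Z0-9]+)\b")

# Signals that normally indicate a crash rather than routine operation
# (an assumed, non-exhaustive selection).
CRASH_SIGNALS = {"SIGSEGV", "SIGBUS", "SIGILL", "SIGFPE", "SIGABRT"}


def fatal_signals(lines):
    """Return (pid, signal) pairs for crash signals seen in strace -f output."""
    hits = []
    for line in lines:
        m = SIGNAL_LINE.match(line)
        if m and m.group(2) in CRASH_SIGNALS:
            hits.append((int(m.group(1)), m.group(2)))
    return hits


# Lines copied from the strace excerpt above.
sample = [
    "[pid   573] --- SIGALRM (Alarm clock) @ 0 (0) ---",
    "[pid   573] gettimeofday({1130269801, 249125}, NULL) = 0",
    "[pid   573] --- SIGSEGV (Segmentation fault) @ 0 (0) ---",
    "Process 573 detached",
]
print(fatal_signals(sample))
```

The routine SIGALRM deliveries are skipped, so only the SIGSEGV in
process 573 is reported.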

   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +

During the system upgrade I had asked for the old configuration files to
be kept. I then suspected that they might not match the new version and
tried the minimal configuration file which aptitude had written as a
backup file - and it worked !!!

The next step was to isolate the erroneous config commands - but no
single command line seems to be wrong on its own. At first I suspected
'special commands' like peer, restrict, broadcast - but none of them failed.
   So the (long) list of server commands came under suspicion: I
suspected that one of the servers might be sending malicious data in
order to kill clients. Again, not a single one could be proven guilty.
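
The one-at-a-time test described above could be sketched roughly as
follows in Python: each suspect line is added on its own to a known-good
base configuration, giving one test file per suspect. The base lines here
are an assumption (the minimal file written by aptitude is not quoted in
this report), the suspect lines are taken from the ntp.conf quoted further
down, and the name single_directive_variants is hypothetical.

```python
def single_directive_variants(base_lines, suspects):
    """Yield (suspect, config_text): the base config plus exactly one suspect line."""
    for suspect in suspects:
        yield suspect, "\n".join(list(base_lines) + [suspect]) + "\n"


# Assumed stand-in for the minimal, working configuration.
base = ["driftfile /var/lib/ntp/ntp.drift", "server pool.ntp.org"]

# Suspect lines taken from the full ntp.conf in this report.
suspects = [
    "peer   192.168.192.9",
    "restrict default kod notrap nomodify nopeer noquery",
    "broadcast       192.168.192.255",
]

for suspect, text in single_directive_variants(base, suspects):
    print("--- testing:", suspect)
    print(text, end="")
```

Each generated text would be written to a scratch file and ntpd started
against it; a crash that only appears once several lines are combined
shows up as every single-line variant passing.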

   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +

It seems to be a QUANTITY PROBLEM, according to my tests up to now.

The following is my current /etc/ntp.conf in the state in which it works:

activy:~# cat /etc/ntp.conf
# /etc/ntp.conf, configuration for ntpd

# ntpd will use syslog() if logfile is not defined
logfile /var/log/ntpd

driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable


#       Time server list:

##      Stratum-11 (local)
peer   192.168.192.9            # P100, equal rank
server 192.168.192.34
server 192.168.192.88
server 192.168.192.7

##      Stratum-2-Server:
#server st.ntp.carnet.hr        # HR
#server time.ijs.si             # SI
server  biofiz.mf.uni-lj.si     # SI
server  ntp2.tuxfamily.net      # FR
#server ntp.univ-lyon1.fr       # FR
#server ntp1.pucpr.br           # BR
#server ntp2.contactel.cz       # CZ
server  ntp.karpo.cz            # CZ
#server ntp.doubleukay.com      # MY
#server fartein.ifi.uio.no      # NO
server  tock.keso.fi            # FI
#server sign.chg.ru             # RU
#server ntp.psn.ru              # RU
#server clock.cimat.ues.edu.sv  # SV
#server ntp.saard.net           # AU
#server timelord.uregina.ca     # CA
#server ntp3.cs.wisc.edu        # US
#server tock.nml.csir.co.za     # ZA
server  ntp4.uni-augsburg.de    # DE

##      Stratum-1-Server:
# server        ntps1-2.uni-erlangen.de
# server        ntp2.fau.de     # Uni Erlangen
# server        ntp3.fau.de
# server        ntp2.ptb.de     # PTB Braunschweig
# server        ntp1.ptb.de
# server        tick.usno.navy.mil

# pool.ntp.org maps to more than 100 low-stratum NTP servers.
# Your server will pick a different set every time it starts up.
#  *** Please consider joining the pool! ***
#  ***  <http://www.pool.ntp.org/#join>  ***
server          pool.ntp.org
#server pool.ntp.org
## uncomment for extra reliability

# ... and use the local system clock as a reference if all else fails
# NOTE: in a local network, set the local stratum of *one* stable server
# to 10; otherwise your clocks will drift apart if you lose connectivity.
server 127.127.1.0              # local clock (LCL)
fudge  127.127.1.0 stratum 13   # LCL is unsynchronized


##      Access rights:

# By default, exchange time with everybody, but don't allow configuration.
# See /usr/share/doc/ntp-doc/html/accopt.html for details.
restrict default kod notrap nomodify nopeer noquery

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1 nomodify

# Clients from this subnet have unlimited access,
# but only if cryptographically authenticated
#restrict 192.168.192.0  mask  255.255.255.0 notrust
# LAN hosts are served unencrypted but are not allowed to modify:
restrict 192.168.192.0  mask  255.255.255.0 kod notrap nomodify
# P450 is allowed to do everything:
restrict 192.168.192.7  mask  255.255.255.255


##      Broadcast:

# If you want to provide time to your local subnet, change the next line.
broadcast       192.168.192.255 # for the LAN

# If you want to listen to time broadcasts on your local subnet,
# de-comment the next lines. Please do this only if you trust everybody
# on the network!
#disable auth
#broadcastclient


The "ps fax" contains the following 2 lines for ntpd:

  428 ?        SLs    0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid
  453 ?        S      0:00  \_ /usr/sbin/ntpd -p /var/run/ntpd.pid


   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +   + + +

If I add just 1 additional server (by deleting the comment # at the
beginning of the line in the config file), the daemon crashes soon after
being started - no matter whether it is started at boot or manually.

I have no explanation for this behaviour. Is there a new limit on the
number of servers, which I may have overlooked in the documentation,
or is it a real bug?
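
To put a number on the suspected quantity limit, the active associations
in a configuration could be counted with a short Python sketch like the
one below. It assumes that only uncommented lines starting with server,
peer or broadcast matter; that selection and the name count_associations
are assumptions made for illustration, not something taken from the ntpd
documentation.

```python
# Keywords assumed to create an association in ntpd.
ASSOCIATION_KEYWORDS = {"server", "peer", "broadcast"}


def count_associations(config_text):
    """Count uncommented server/peer/broadcast lines in ntp.conf text."""
    count = 0
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing and full-line comments
        if line and line.split()[0] in ASSOCIATION_KEYWORDS:
            count += 1
    return count


# A few lines in the style of the configuration quoted above.
sample = """\
peer   192.168.192.9            # P100
server 192.168.192.34
#server st.ntp.carnet.hr        # HR
server 127.127.1.0              # local clock (LCL)
fudge  127.127.1.0 stratum 13
broadcast       192.168.192.255
#broadcastclient
"""
print(count_associations(sample))
```

Running it on the working file and again after enabling one more server
line gives the two counts between which the crash starts.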

There is a bug report under
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316242 showing the same
symptom. But the explanation does not fit my case, as I do not have a
single IPv6 address.


--

Regards,

 -----------------
 Eberhard Spittler
 [ http://spittler.name/ ]

--- End Message ---

--- Begin Message ---
Bug is obsolete.
--- End Message ---
