On Tue, Sep 01, 2020 at 07:58:44PM +0200, Otto Moerbeek wrote: > On Tue, Sep 01, 2020 at 06:33:34PM +0200, Christian Weisgerber wrote: > > > Even after otto@'s commit in -current... > > > > If no replies are received for a while due to connectivity issues > > go into unsynced mode. The existing code to check if we're unsycned > > is only done on receiving an ntp packet which does not happen if > > there are connectivity issues. > > > > ... ntpd can still fail to recognize that a server is unreachable. > > 36 hours after packets to the server have been blocked, ntpd still > > pretends that nothing is wrong: > > > > ----------------------------- > > 1/1 peers valid, clock synced, stratum 2 > > > > peer > > wt tl st next poll offset delay jitter > > fddd:28ee:243:2:213:95ff:fe06:1bb7 ntp > > * 1 10 1 2201s 3296s -0.068ms 1.346ms 0.438ms > > ----------------------------- > > > > > > ===> What works > > > > If I use pf(4) to block outgoing packets on the client (block return > > out proto udp to port ntp), ntpd quickly invalidates the peer. > > > > ktrace of the ntp engine process shows this pattern: > > > > 23799 ntpd CALL sendto(6,0x278c98a1784,0x30,0,0,0) > > 23799 ntpd RET sendto -1 errno 65 No route to host > > ... > > 23799 ntpd CALL recvmsg(6,0x7f7ffffddd60,0) > > 23799 ntpd RET recvmsg -1 errno 61 Connection refused > > > > > > ===> What doesn't work > > > > If I block incoming NTP packets on the gateway to the server (block > > return on $if proto udp to port ntp), ntpd does not notice that the > > peer is unavailable. tcpdump shows the expected ICMP error responses > > to the NTP queries. > > > > ktrace of the ntp engine process shows this pattern: > > > > 23799 ntpd CALL sendto(6,0x278c98a1784,0x30,0,0,0) > > 23799 ntpd RET sendto 48/0x30 > > ... > > 23799 ntpd CALL recvmsg(6,0x7f7ffffddd60,0) > > 23799 ntpd RET recvmsg -1 errno 61 Connection refused > > > > > > Tentative conclusion: ntpd notices when it can't send queries to > > the server, but not when it fails to receive replies. > > > > -- > > Christian "naddy" Weisgerber na...@mips.inka.de > > > > Yeah, this was my next step. client_dispatch() is a bit dumb in the > sense that it always returns 0, if it did something useful or not. Ot > should distinguis between packet processes ok, packet rejected for > some reason and error. > > Note that a complete absense of a reply (with no icmp) does work and > sets the state to unsynced. > > -Otto >
Currently testing this. -Otto Index: client.c =================================================================== RCS file: /cvs/src/usr.sbin/ntpd/client.c,v retrieving revision 1.113 diff -u -p -r1.113 client.c --- client.c 30 Jan 2020 15:55:41 -0000 1.113 +++ client.c 1 Sep 2020 19:15:22 -0000 @@ -456,7 +456,7 @@ client_dispatch(struct ntp_peer *p, u_in if (++p->shift >= OFFSET_ARRAY_SIZE) p->shift = 0; - return (0); + return (1); } int Index: ntp.c =================================================================== RCS file: /cvs/src/usr.sbin/ntpd/ntp.c,v retrieving revision 1.166 diff -u -p -r1.166 ntp.c --- ntp.c 30 Aug 2020 16:21:29 -0000 1.166 +++ ntp.c 1 Sep 2020 19:15:22 -0000 @@ -403,12 +403,9 @@ ntp_main(struct ntpd_conf *nconf, struct for (; nfds > 0 && j < idx_clients; j++) { if (pfd[j].revents & (POLLIN|POLLERR)) { nfds--; - last_action = now; if (client_dispatch(idx2peer[j - idx_peers], - conf->settime, conf->automatic) == -1) { - log_warn("pipe write error (settime)"); - ntp_quit = 1; - } + conf->settime, conf->automatic) == 1) + last_action = now; } }