> So, probably not a failure "caused by GPS", rather one caused by poor design (only two clock sources) combined with unsupported and buggy devices.

100% correct. From the PDF:

> 4.31 JT summarised its findings in relation to the ‘Panic Timer’ on the Cisco IOS XR NTP Client, namely that:
>
> JT’s efforts in understanding the root cause, and mitigation steps to take to avoid future incidents have focused on the Cisco NTP Client behaviour, and notably Cisco’s decision to not implement the ‘Panic Timer’ on their IOS XR operating system. Arguably, whilst the NTP server injected an invalid time into the network, it is the NTP Clients filtering and selection algorithms which are responsible for detecting and disregarding falsetickers, and it was the Cisco NTP Clients failure to appropriately handle this which triggered the network incident.
>
> […] Further detailed soak testing, log analysis and debug analysis corroborated that the Cisco IOS XR NTP Client did not implement the ‘Panic Timer’ that would normally cause an NTP Client to ignore an NTP Server exceeding 1000 seconds variance.

On Wed, Aug 16, 2023 at 10:50 AM Mel Beckman <m...@beckman.org> wrote:

> So, probably not a failure "caused by GPS", rather one caused by poor design (only two clock sources) combined with unsupported and buggy devices.
>
> -mel beckman
>
> On Aug 16, 2023, at 3:51 AM, Matthew Richardson <matthe...@itconsult.co.uk> wrote:
>
> Mel Beckman wrote:-
>
> Do you have a citation for your Jersey event? I doubt GPS caused the problem, but I'd like to see the documentation.
>
> The event took place on the evening of Sunday 12 July 2020, and seems NOT to have been due to an issue caused directly by GPS, but rather to misbehaviour of a GPS NTP server relating to week numbers. Our regulator subsequently issued the following comprehensive document:-
>
> https://www.jcra.je/media/598397/t-027-jt-july-2020-outage-decision-directions.pdf
>
> By way of summary, JT operated two GPS derived NTP servers, with all of their routers pointing to both.
> On the evening in question, one of the two reset its clock back to 27 November 2000.
>
> Their interior routing protocol used amongst their mesh of routers was IS-IS which was using authentication. The authentication [section 4.19] was described as having a "password validity start date" of 01 July 2012. Thus, any routers which had picked up the time from the faulty source no longer had valid IS-IS authentication and were thus isolated.
>
> Whilst only 15% of their routers were affected, this was enough to cause an almost total failure in their network, affecting telephony (fixed & mobile) and Internet. For foreign readers (this is NANOG!) "999" calls refer to the emergency services in these parts, where any failures attract the attention of our regulator.
>
> The details of why the clock "failed" start at section 4.23, and seem to relate to a GPS week number rollover.
>
> So, probably not a failure "caused by GPS", rather one caused by poor design (only two clock sources) combined with unsupported and buggy devices.
>
> One curious aspect is that some routers followed the "bad" time, which is alluded to in section 4.31.
>
> Something not discussed in that report is that JT's email failed during the incident despite its being hosted on Office365. The reason was that the two authoritative DNS servers for jtglobal.com were hosted in Jersey inside their network. As that network was wholly disconnected, there was no DNS and hence no email. Despite my having raised this since with their senior management, their DNS remains hosted in this way:-
>
> matthew@m88:~$ dig +norec +noedns +nocmd +nostats -t ns jtglobal.com @ns1.jtibs.net
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20462
> ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 4
>
> ;; QUESTION SECTION:
> ;jtglobal.com.          IN      NS
>
> ;; ANSWER SECTION:
> jtglobal.com.   60      IN      NS      ns2.jtibs.net.
> jtglobal.com.   60      IN      NS      ns1.jtibs.net.
>
> ;; ADDITIONAL SECTION:
> ns1.jtibs.net.  60      IN      A       212.9.0.135
> ns2.jtibs.net.  60      IN      A       212.9.0.136
> ns1.jtibs.net.  60      IN      AAAA    2a02:c28::d1
> ns2.jtibs.net.  60      IN      AAAA    2a02:c28::d2
>
> Ridiculously (and again despite my agitation to their management) our government domain gov.je has similar DNS fragility:-
>
> matthew@m88:~$ dig +norec +noedns +nocmd +nostats -t ns gov.je @ns1.gov.je
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4249
> ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
>
> ;; QUESTION SECTION:
> ;gov.je.                IN      NS
>
> ;; ANSWER SECTION:
> gov.je.         3600    IN      NS      ns2.gov.je.
> gov.je.         3600    IN      NS      ns1.gov.je.
>
> ;; ADDITIONAL SECTION:
> ns2.gov.je.     3600    IN      A       212.9.21.137
> ns1.gov.je.     3600    IN      A       212.9.21.9
>
> --
> Best wishes,
> Matthew
>
> ------
> From: Mel Beckman <m...@beckman.org>
> To: Matthew Richardson <matthe...@itconsult.co.uk>
> Cc: Nanog <nanog@nanog.org>
> Date: Tue, 8 Aug 2023 15:12:29 +0000
> Subject: Re: NTP Sync Issue Across Tata (Europe)
>
> Until the Internet NTP network can be made secure, no. Do you have a citation for your Jersey event? I doubt GPS caused the problem, but I'd like to see the documentation.
>
> Using GPS for time sync is simple risk management: the risk of Internet NTP with known, well documented vulnerabilities and many security incidents, versus the risk of some theoretical GPS-based vulnerability, for which mitigations such as geographic diversity are readily available. Sure, you could use Internet NTP as a last resort should GPS fail globally (perhaps due to a theoretical - but conceivable - meteor storm). But that would be a fall-back. I would not mix the systems.
>
> -mel
>
> On Aug 8, 2023, at 1:36 AM, Matthew Richardson <matthe...@itconsult.co.uk> wrote:
>
> > Mel Beckman wrote:-
> >
> > It's a problem that has received a lot of attention in both NTP and aviation navigation circles.
> > What is hard to defend against is total signal suppression via high powered jamming. But that you can do with a geographically diverse GPS NTP network.
> >
> > Whilst looking forward to being corrected, GPS (even across multiple locations) seems to be a SINGLE source of time. You seem (have I misunderstood?) to be a proponent of using GPS exclusively as the external clock source.
> >
> > Might it be preferable to have a mixture of GPS (perhaps with another GNSS) together with carefully selected Internet-based NTP servers?
> >
> > I recall an incident over here in Jersey (the one they named New Jersey after!) where our primary telco had a substantial time shift on one of their two GPS synced servers. This managed to adjust the clock on enough of their routers that the certificate-based OSPF authentication considered the certificates invalid, and caused a failure of almost their whole network.
> >
> > This is, of course, not to say that GPS is not a very good clock source, but rather to wonder whether more diversity would be preferable than using it as a single source.
> >
> > --
> > Best wishes,
> > Matthew
>
> ------
> From: Mel Beckman <m...@beckman.org>
> To: "Forrest Christian (List Account)" <li...@packetflux.com>
> Cc: Nanog <nanog@nanog.org>
> Date: Mon, 7 Aug 2023 14:03:30 +0000
> Subject: Re: NTP Sync Issue Across Tata (Europe)
>
> Forrest,
>
> GPS spoofing may work with a primitive Raspberry Pi-based NTP server, but commercial industrial NTP servers have specific anti-spoofing mitigations. There are also antenna diversity strategies that vendors support to ensure the signal being relied upon is coming from the right direction. It's a problem that has received a lot of attention in both NTP and aviation navigation circles. What is hard to defend against is total signal suppression via high powered jamming. But that you can do with a geographically diverse GPS NTP network.
>
> -mel
>
> On Aug 7, 2023, at 1:39 AM, Forrest Christian (List Account) <li...@packetflux.com> wrote:
>
> The problem with relying exclusively on GPS to do time distribution is the ease with which one can spoof the GPS signals.
>
> With a budget of around $1K, not including a laptop, anyone with decent technical skills could convince a typical GPS receiver it was at any position and was at any time in the world. All it takes is a decent directional antenna, some SDR hardware, and depending on the location and directivity of your antenna maybe a smallish amplifier. There is much discussion right now in the PNT (Position, Navigation and Timing) community as to how best to secure the GNSS network, but right now one should consider the data from GPS to be no more trustworthy than some random NTP server on the internet.
>
> In order to build a resilient NTP server infrastructure you need multiple sources of time distributed by multiple methods - typically both via satellite (GPS) and by terrestrial (NTP) methods. NTP does a pretty good job of sorting out multiple time servers and discarding sources that are lying. But to do this you need multiple time sources. A common recommendation is to run a couple/few NTP servers which only get time from a GPS receiver and only serve time to a second tier of servers that pull from both those in-house GPS-timed-NTP servers and other trusted NTP servers. I'd recommend selecting the time servers to gain geographic diversity, i.e. poll NIST servers in Maryland and Colorado, and possibly both.
>
> Note that NIST will exchange (via mail) a set of keys with you to talk encrypted NTP with you. See
> https://www.nist.gov/pml/time-and-frequency-division/time-services/nist-authenticated-ntp-service
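
Forrest's point that NTP can only discard lying sources when it has enough sources to compare is easy to see with a toy majority vote. The sketch below is not the real NTP selection algorithm (RFC 5905 intersects correctness intervals); it is a simplified stand-in, and the function name, source names, offsets and tolerance are all made up for illustration.

```python
from typing import Dict, List, Optional


def select_truechimers(offsets: Dict[str, float],
                       tolerance: float = 0.128) -> Optional[List[str]]:
    """Return the sources that form a strict majority around one anchor.

    offsets maps a (hypothetical) source name to its measured clock
    offset in seconds. A source joins an anchor's group when its offset
    is within `tolerance` of that anchor. If no group is a strict
    majority of all sources, return None: there is nothing to outvote
    a falseticker with.
    """
    names = sorted(offsets)
    best: List[str] = []
    for anchor in names:
        group = [n for n in names
                 if abs(offsets[n] - offsets[anchor]) <= tolerance]
        if len(group) > len(best):
            best = group
    if len(best) * 2 > len(names):
        return best
    return None


# Two sources that disagree: no majority, so the bad one cannot be identified.
two = select_truechimers({"gps1": 0.0, "gps2": -6.2e8})

# Add one independent source and the bad clock is outvoted.
three = select_truechimers({"gps1": 0.0, "gps2": -6.2e8, "ext": 0.01})
```

With only `gps1` and `gps2` the function returns None, which is roughly the position a client is in with two clock sources; adding the third source lets the two agreeing clocks win.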
>
> On Sun, Aug 6, 2023 at 8:36 PM Mel Beckman <m...@beckman.org> wrote:
>
> GPS Selective Availability did not disrupt the timing chain of GPS, only the ephemeris (position information). But a government-disrupted timebase scenario has never occurred, while hackers are a documented threat.
>
> DNS has DNSSec, which while not deployed as broadly as we might like, at least lets us know which servers we can trust.
>
> Your own atomic clocks still have to be synced to a common standard to be useful. To what are they sync'd? GPS, I'll wager.
>
> I sense hand-waving :)
>
> -mel via cell
>
> On Aug 6, 2023, at 7:04 PM, Rubens Kuhl <rube...@gmail.com> wrote:
>
> On Sun, Aug 6, 2023 at 8:20 PM Mel Beckman <m...@beckman.org> wrote:
>
> Or one can read recent research papers that thoroughly document the incredible fragility of the existing NTP hierarchy and soberly consider their recommendations for remediation:
>
> The paper suggests the compromise of critical infrastructure. So, besides not using NTP, why not stop using DNS ? Just populate a hosts file with all you need.
>
> BTW, the stratum-0 source you suggested is known to have been manipulated in the past (https://www.gps.gov/systems/gps/modernization/sa/), so you need to bet on that specific state actor not returning to old habits.
>
> OTOH, 4 of the 5 servers I suggested have their own atomic clock, and you can keep using GPS as well. If GPS goes bananas on timing, that source will just be disregarded (one of the features of the NTP architecture that has been pointed out over and over in this thread and you keep ignoring it).
>
> Rubens
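
For anyone wanting to see what the 'Panic Timer' check quoted at the top of this thread amounts to, here is a minimal sketch. It assumes the 1000-second figure from the report (which, as far as I know, is also ntpd's default panic threshold) and that the week-number fault is a plain 1024-week step back, since the GPS week counter is 10 bits wide. The function name and the time of day on the incident date are my own assumptions.

```python
from datetime import datetime, timedelta, timezone

# Figure quoted from the report: ignore a server more than 1000 s away.
PANIC_THRESHOLD = timedelta(seconds=1000)

# A 10-bit week counter wraps every 1024 weeks; a receiver that
# mishandles the wrap reports a time exactly this far in the past.
GPS_WEEK_ROLLOVER = timedelta(weeks=1024)


def within_panic_threshold(local_time: datetime,
                           server_time: datetime,
                           threshold: timedelta = PANIC_THRESHOLD) -> bool:
    """True if the server's time is close enough to be worth considering."""
    return abs(server_time - local_time) <= threshold


# Evening of Sunday 12 July 2020; the exact hour is an assumption.
incident = datetime(2020, 7, 12, 22, 0, tzinfo=timezone.utc)
rolled_back = incident - GPS_WEEK_ROLLOVER

print(rolled_back.date())                                    # 2000-11-26
print(within_panic_threshold(incident, rolled_back))         # False
print(within_panic_threshold(incident,
                             incident + timedelta(seconds=5)))  # True
```

The rolled-back date lands within a day of the 27 November 2000 mentioned in the report, which fits the week-number explanation without proving it. A client applying this check would have refused the faulty server's time outright, whatever the second server said.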