[ntp:questions] Windows Won't Synchronize to NTP
I can't get a Windows Client to sync to my NTP server. All Linux clients work fine.

Here is the info from my ntp.conf:

    driftfile /var/lib/ntp/drift
    restrict 127.0.0.1
    restrict mask 255.255.224.0 nomodify notrap
    server 0.rhel.pool.ntp.org iburst
    server 1.rhel.pool.ntp.org iburst
    server 2.rhel.pool.ntp.org iburst

Here is the tcpdump and ntpd -d info:

TCP DUMP OUTPUT:

    13:02:19.440054 IP .ntp > .ntp: NTPv3, symmetric active, length 48

ntpd -d output:

    receive: at 447 cent.ntp<-winclient mode 1 code 5 auth 0
    transmit: at 447 cent.ntp->winclient mode 1

IPTables is off for testing.

Any suggestions?

Thanks
___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions
[ntp:questions] high precision tracking: trying to understand sudden jumps
Hello,

I'm trying to configure a small network for high precision time. I recently acquired an Endrun CDMA time server that runs like a dream, tracking CDMA time to about +/- 5 microseconds.

The clients are a rag-tag assembly of diverse systems including a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.

All are configured to prefer the Endrun clock and poll it on a 16 second interval. All are attached to a single SMC gigabit Ethernet switch, with only the Endrun and two Sun systems running at a lower speed of 100 Mbps. Network traffic and system loads are close to zero.

All systems are running 'ntpd' 4.2.4p4. I compiled NTP as a native 64-bit binary for the Windows X64 system. [An #ifdef tweak to 'intptr_t' and 'uintptr_t' is required; will provide a patch if desired.]

It generally is working well, with the systems tracking anywhere from +/- 100 microseconds to +/- 500 microseconds most of the time.

However, once or twice a day, all the systems experience a random, uncorrelated time shift of one to several milliseconds. I had an issue where a UPS voltage correction shift and a cheap power supply on the Windows X64 box appeared to be a problem, but that was fixed by configuring the UPS to consider 110V nominal instead of 120V.

Does anyone have any ideas about what could be causing these random time jumps and what might be done to eliminate them?

Something I'm planning to try is to make sure that 'mlock' is configured in the daemons; presently 'autoconf' has left it disabled for some reason. However, I don't believe page faults are the culprit. All the daemons are running at the highest real-time priority in the respective systems.

The above configuration is a controlled lab setup. The next target is a stack of eight Dell 1950 servers in a production data center running Windows 2003 R2 and slaved to a newer Endrun time server. I don't have useful data from these systems yet because the network jitter is outrageous. I'm working with the network admin to hopefully have the NTP traffic to and from the Endrun clock bypass level 3 switch/router rule checking. They have large, complex router ACL rulesets that I suspect are the cause of the jitter.

Attached are fairly representative graphs of the offset and frequency for two of the lab servers.

Thanks

P.S. Resent without graphs as the list mailer says they're not allowed. Happy to send them or the raw 'loopstats' to anyone interested.
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
[EMAIL PROTECTED] wrote: > Hello, > > I'm trying to configure a small network for high precision time. > Recently acquired an Endrun CDMA time server that runs like > a dream, tracking CDMA time to about +/- 5 microseconds. > > The clients are a rag-tag assembly of diverse systems including > a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, > IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop. > > All are configured to prefer the Endrun clock and poll it on a > 16 second interval. All are attached to a single SMC gigabit > Ethernet switch with only the Endrun and two Sun systems running > at a lower speed of 100 MBPS. Close to zero network traffic > and system loads. > > All systems are running 'ntpd' 4.2.4p4. Compiled NTP native > 64-bit for the Windows X64 system. [A #ifdef tweak to > 'intptr_t' and 'uintptr_t' is required, will provide patch if > desired]. > > It generally is working well, with the systems tracking anywhere > from +/- 100 microseconds to +/- 500 microseconds most of the > time. > > However once or twice a day, all the systems experience a > random, uncorrelated time shift of from one to several > milliseconds. Forcing the poll interval to 16 seconds is not always a good idea! Ntpd will select a poll interval, generally starting at 64 seconds, and ramping up to as long as 1024 seconds as the clock is beaten into submission! Directly connected refclocks are frequently polled at shorter intervals but I don't think your refclock is "directly connected" in the same sense that a clock working through a serial or parallel port is directly connected! A clock connected via ethernet with all the latencies and jitter thereunto appertaining is no different than any other network server and should be polled in the same manner! The very short poll intervals correct large errors quickly and the very long intervals correct small errors very accurately! 
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
[EMAIL PROTECTED] writes: >Hello, >I'm trying to configure a small network for high precision time. >Recently acquired an Endrun CDMA time server that runs like >a dream, tracking CDMA time to about +/- 5 microseconds. No idea what CDMa time is, but that does not matter. Do you have peerstats running on the various machines so you can look at the raw offset and particularly the round trip times? It may be that your network one way is suddenly delaying things for mseconds one way for half an hour say. >The clients are a rag-tag assembly of diverse systems including >a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, >IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop. >All are configured to prefer the Endrun clock and poll it on a >16 second interval. All are attached to a single SMC gigabit >Ethernet switch with only the Endrun and two Sun systems running >at a lower speed of 100 MBPS. Close to zero network traffic >and system loads. Maybe that ethernet switch suffers a nervous breakdown (too little to do?) once a day. >All systems are running 'ntpd' 4.2.4p4. Compiled NTP native >64-bit for the Windows X64 system. [A #ifdef tweak to >'intptr_t' and 'uintptr_t' is required, will provide patch if >desired]. >It generally is working well, with the systems tracking anywhere >from +/- 100 microseconds to +/- 500 microseconds most of the >time. Should be within 10s of usec, not hundreds. >However once or twice a day, all the systems experience a >random, uncorrelated time shift of from one to several >milliseconds. Had an issue where a UPS voltage correction shift >and cheap power supply on the Windows X64 box appeared to be a >problem, but that was fixed by configuring the UPS to consider >110V nominal instead of 120V. >Does anyone have any ideas about what could be causing these >random time jumps and what might be done to eliminate them? 
>Something I'm planning to try is to make sure that 'mlock' is >configured in the daemons--presently 'autoconf' has left it >disabled for some reason. However I don't belive page >faults are the culprit. All the daemons are running at >the highest real-time priority in the respective systems. >The above configuration is a controlled lab setup. The next >target is a stack eight of DELL 1950 servers in a production >data center running Windows 2003 R2 and slaved to a newer Endrun >time server. Don't have useful data from these systems yet I would have just used a cheap GPS receiver, not pay $700 for one of these, but it's your money. Ah, just looked at their web page. Would I really believe that the CDMA cell phone network would care if their time signal were accurate to usec? There is no time path correction. But you should see that on your server connected to the device. Anyway, look at the peerstats file, esp the roundtrip times and the offsets. The ntp clock-filter tries to compensate for vast variations in these but can only do so much. >because the network jitter is outrageous. Working with the >network admin to hopefully have the NTP traffic to and from the >Endrun clock bypass level 3 switch/router rule checking. They >have large, complex router ACL rulesets I suspect as the cause >of the jitter. Sounds a bit weird. On an ADSL link from home through the telco to the university, I get better than 1ms time accuracy. >Attached are fairly representative graphs of the offset and >frequency for two of the lab servers. Netnews is text only. Post the info on a web page where anyone can look at it. >Thanks >P.S. Resent without graphs as the list mailer says >they're not allowed. Happy to send them or the raw >'loopstats' to anyone interested. Just post them. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
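The advice above to look at the round-trip times and offsets in peerstats can be partly automated. What follows is only a rough Python sketch, assuming the usual peerstats record layout (MJD, seconds past midnight, peer address, status word, offset, delay, dispersion, jitter, with times in seconds). The function names, the 3x-median threshold, and the sample records are invented for illustration and are not taken from any poster's data.

```python
from statistics import median


def parse_peerstats(lines):
    """Yield (mjd, seconds, peer, offset_s, delay_s) for each peerstats record."""
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        yield (int(fields[0]), float(fields[1]), fields[2],
               float(fields[4]), float(fields[5]))


def flag_delay_spikes(samples, factor=3.0):
    """Return the samples whose round-trip delay exceeds factor * median delay."""
    samples = list(samples)
    if not samples:
        return []
    typical = median(s[4] for s in samples)
    return [s for s in samples if s[4] > factor * typical]


# Made-up records, for illustration only (assumed peerstats layout).
SAMPLE = """\
54555 100.000 172.29.87.3 9614 -0.000004 0.000683 0.000010 0.000009
54555 116.000 172.29.87.3 9614 0.000012 0.000701 0.000010 0.000011
54555 132.000 172.29.87.3 9614 0.002950 0.006400 0.000010 0.001200
54555 148.000 172.29.87.3 9614 -0.000008 0.000690 0.000010 0.000010
""".splitlines()

spikes = flag_delay_spikes(parse_peerstats(SAMPLE))
for mjd, sec, peer, offset, delay in spikes:
    print(f"{mjd} {sec:9.3f} {peer} "
          f"offset={offset * 1e3:+.3f} ms delay={delay * 1e3:.3f} ms")
```

If an offset jump lines up with a delay spike like the third record here, the network path (or the server) is the likelier suspect; a jump with no change in delay points back at the client clock.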
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
[EMAIL PROTECTED] wrote:

> The clients are a rag-tag assembly of diverse systems including
> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.

How are you interpolating the 16ms ticks on the Windows system? How are you disabling power management on the laptop?

> It generally is working well, with the systems tracking anywhere
> from +/- 100 microseconds to +/- 500 microseconds most of the
> time.

How are you measuring the difference from true time? In principle, if ntpd can measure it, it will correct it.

> However once or twice a day, all the systems experience a
> random, uncorrelated time shift of from one to several
> milliseconds. Had an issue where a UPS voltage correction shift

In which direction is the slip? Backward-only slips against true time (these might appear as forward slips if the real error is in the server) are typically due to lost clock interrupts. If that is the case, it implies you are using a tick rate other than 100Hz. Please note that the Linux kernel code is broken for clock frequencies other than 100Hz, and the use of 1000Hz significantly increases the likelihood of a lost interrupt.

The normal source of lost interrupts is disk drivers using programmed transfers.

> and cheap power supply on the Windows X64 box appeared to be a
> problem, but that was fixed by configuring the UPS to consider
> 110V nominal instead of 120V.
Re: [ntp:questions] Windows Won't Synchronize to NTP
"Matthew Lind" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > I can't get a Windows Client to sync to my NTP server. > All Linux clients work fine. [...] > TCP DUMP OUTPUT: > > 13:02:19.440054 IP .ntp > .ntp: NTPv3, > symmetric active, length 48 Don't do that. W32Time is asking to be a peer, which it has absolutely no business to. There have been recent posts (last two weeks or so) about how to add a byte at the end of the name to request a less abusive relationship with the NTP server. Groetjes, Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
"Richard B. Gilbert" <[EMAIL PROTECTED]> writes: >[EMAIL PROTECTED] wrote: >> Hello, >> >> I'm trying to configure a small network for high precision time. >> Recently acquired an Endrun CDMA time server that runs like >> a dream, tracking CDMA time to about +/- 5 microseconds. >> >> The clients are a rag-tag assembly of diverse systems including >> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, >> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop. >> >> All are configured to prefer the Endrun clock and poll it on a >> 16 second interval. All are attached to a single SMC gigabit >> Ethernet switch with only the Endrun and two Sun systems running >> at a lower speed of 100 MBPS. Close to zero network traffic >> and system loads. >> >> All systems are running 'ntpd' 4.2.4p4. Compiled NTP native >> 64-bit for the Windows X64 system. [A #ifdef tweak to >> 'intptr_t' and 'uintptr_t' is required, will provide patch if >> desired]. >> >> It generally is working well, with the systems tracking anywhere >> from +/- 100 microseconds to +/- 500 microseconds most of the >> time. >> >> However once or twice a day, all the systems experience a >> random, uncorrelated time shift of from one to several >> milliseconds. > >Forcing the poll interval to 16 seconds is not always a good idea! >Ntpd will select a poll interval, generally starting at 64 seconds, and >ramping up to as long as 1024 seconds as the clock is beaten into >submission! It is his network, he is not going to overload it. So, if he wants a 16 sec poll interval that is up to him. I agree it is not a good idea for remote servers, but on his own system it is fine. >Directly connected refclocks are frequently polled at shorter intervals >but I don't think your refclock is "directly connected" in the same >sense that a clock working through a serial or parallel port is directly >connected! 
>A clock connected via ethernet with all the latencies and jitter
>thereunto appertaining is no different than any other network server and
>should be polled in the same manner!

??? The longer polls are there so as not to swamp the remote server with 1 people all polling every 16 sec (or 1 sec). There is nothing in ntp itself that mandates a longer poll interval. In fact, a shorter poll interval makes ntp much more responsive to changes (clock drifts, etc.).

>The very short poll intervals correct large errors quickly and the very
>long intervals correct small errors very accurately!

No, for a properly designed system both should be corrected.
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
"Unruh" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > [...] Would I really believe that the CDMA cell phone network > would care if their time signal were accurate to usec? I would. Because IIUC, this is the basis on which they divide timeslots between stations. Groetjes, Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
David Woolley <[EMAIL PROTECTED]> writes: >[EMAIL PROTECTED] wrote: >> The clients are a rag-tag assembly of diverse systems including >> a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, >> IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop. >How are you interpolating the 16ms ticks on the Windows system? How are >you disabling power management on the lap top? >> >> It generally is working well, with the systems tracking anywhere >> from +/- 100 microseconds to +/- 500 microseconds most of the >> time. >How are you measuring the difference from true time? In principle, if >ntpd can measure it, it will correct it. I expect that he means the offsets that ntp measures. NTP does NOT correct random offsets. Ie, if there is noise source which makes the offsets vary by 500usec ntp will not get rid of them. You will see them in the offsets as measured by ntp. Now, the time keeping might (or might not) be more accurate than that, but those offsets are what I suspect he means. >> >> However once or twice a day, all the systems experience a >> random, uncorrelated time shift of from one to several >> milliseconds. Had an issue where a UPS voltage correction shift >In which direction is the slip? Backward only slips against true time >(these might appear as forward slips if the real error is in the server) >are typically due to lost clock interrupts. If that is the case it >implies you are using a tick rate of other than 100Hz. Please note that >the Linux kernel code is broken for clock frequencies other than 100Hz >and the use of 1000Hz significantly increases the likelihood of a lost >interrupt. He claims on all the systems. >The normal source of lsot interrupts is disk drivers using programmed >transfers. Almost all disk drives on Linux now use dma. >> and cheap power supply on the Windows X64 box appeared to be a >> problem, but that was fixed by configuring the UPS to consider >> 110V nominal instead of 120V. 
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
"Unruh" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > "Richard B. Gilbert" <[EMAIL PROTECTED]> writes: >> Forcing the poll interval to 16 seconds is not always a good idea! >> Ntpd will select a poll interval, generally starting at 64 seconds, >> and ramping up to as long as 1024 seconds as the clock is beaten >> into submission! > > It is his network, he is not going to overload it. So, if he wants a > 16 sec poll interval that is up to him. > I agree it is not a good idea for remote servers, but on his own system > it is fine. [...] > ??? The longer polls are in order not to swamp the remote server whith > 1 people all polling every 16 sec ( or 1 sec) There is nothing in > ntp itself that mandates a longer poll interval. In fact a shorter poll > interval makes ntp much more responsive to changes ( clock drifts, etc) >> The very short poll intervals correct large errors quickly and the >> very long intervals correct small errors very accurately! > > No for a properly designed system both should be corrected. You seem to be missing the point. Once the large errors have been corrected, NTP goes on to the small errors. For that, it _needs_ a longer poll interval. That this gives the server more air is a happy coincidence, but not why it does it. Given the measurement error, you need to let the small error accumulate over a longer period. Otherwise it would simply be lost in the noise. Do the math: assume the (constant!) measurement error to be +/- 1 ms, the frequency error in my local host to be 1000 PPM (1/1000). With a 1 s polling interval, the real value is 1 ms and the measurement will be between 0 and 2 ms. Not very good. With a 1000 s polling interval, the real value is 1 s and the measurement will be between 0.999 and 1.001 s. Now that's useful to correct your clock with. Now use more realistic numbers, like 50 PPM to start with, a polling interval of 64 s and I'm not exactly sure what for the measuring jitter. 
But the gist should be clear: that 50 PPM will go down, the SNR will worsen, and the polling interval should go up to improve it again. Starting with a short interval is good to correct large errors quickly. Backing off once you've done so is good to avoid pestering the server, but it's also good to correct small errors accurately, and _that_ is why it's done. And of course, once a larger than expected offset is measured, the polling interval is shortened again. Groetjes, Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
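The arithmetic in this message can be checked with a few lines of Python. This is only a sketch of the reasoning as stated, under the message's own simplifying assumption of a constant measurement error; the function names are invented, and the 1 ms error used for the "more realistic" 50 PPM case is an assumption, since the message leaves that figure open.

```python
def measured_range(freq_error_ppm, poll_s, meas_error_s):
    """Return (low, true, high) offset in seconds accumulated over one poll interval."""
    true_offset = freq_error_ppm * 1e-6 * poll_s
    return true_offset - meas_error_s, true_offset, true_offset + meas_error_s


def relative_uncertainty(freq_error_ppm, poll_s, meas_error_s):
    """Measurement error as a fraction of the offset being measured."""
    true_offset = freq_error_ppm * 1e-6 * poll_s
    return meas_error_s / true_offset


# The two cases from the message: 1000 PPM with +/- 1 ms measurement error.
for poll in (1, 1000):
    low, true, high = measured_range(1000, poll, 0.001)
    print(f"poll={poll:5d} s  true={true:.3f} s  range={low:.3f}..{high:.3f} s")

# "More realistic" case: 50 PPM at a 64 s poll, still assuming +/- 1 ms.
print(f"relative uncertainty: {relative_uncertainty(50, 64, 0.001):.2%}")
```

The ratio returned by `relative_uncertainty` is the inverse of the signal-to-noise ratio the message refers to: it shrinks as the poll interval grows and grows as the residual frequency error shrinks.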
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
Unruh wrote: > "Richard B. Gilbert" <[EMAIL PROTECTED]> writes: > > >>[EMAIL PROTECTED] wrote: >> >>>Hello, >>> >>>I'm trying to configure a small network for high precision time. >>>Recently acquired an Endrun CDMA time server that runs like >>>a dream, tracking CDMA time to about +/- 5 microseconds. >>> >>>The clients are a rag-tag assembly of diverse systems including >>>a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80, >>>IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop. >>> >>>All are configured to prefer the Endrun clock and poll it on a >>>16 second interval. All are attached to a single SMC gigabit >>>Ethernet switch with only the Endrun and two Sun systems running >>>at a lower speed of 100 MBPS. Close to zero network traffic >>>and system loads. >>> >>>All systems are running 'ntpd' 4.2.4p4. Compiled NTP native >>>64-bit for the Windows X64 system. [A #ifdef tweak to >>>'intptr_t' and 'uintptr_t' is required, will provide patch if >>>desired]. >>> >>>It generally is working well, with the systems tracking anywhere >>>from +/- 100 microseconds to +/- 500 microseconds most of the >>>time. >>> >>>However once or twice a day, all the systems experience a >>>random, uncorrelated time shift of from one to several >>>milliseconds. >> >> > > >>Forcing the poll interval to 16 seconds is not always a good idea! >>Ntpd will select a poll interval, generally starting at 64 seconds, and >>ramping up to as long as 1024 seconds as the clock is beaten into >>submission! > > > It is his network, he is not going to overload it. So, if he wants a 16 sec > poll interval that is up to him. > I agree it is not a good idea for remote servers, but on his own system it > is fine. > > > >>Directly connected refclocks are frequently polled at shorter intervals >>but I don't think your refclock is "directly connected" in the same >>sense that a clock working through a serial or parallel port is directly >>connected! 
> > >>A clock connected via ethernet with all the latencies and jitter
>>thereunto appertaining is no different than any other network server and
>>should be polled in the same manner!
>
>
> ??? The longer polls are in order not to swamp the remote server with
> 1 people all polling every 16 sec ( or 1 sec) There is nothing in ntp
> itself that mandates a longer poll interval. In fact a shorter poll
> interval makes ntp much more responsive to changes ( clock drifts, etc)
>
>
>>The very short poll intervals correct large errors quickly and the very
>>long intervals correct small errors very accurately!
>
>
> No for a properly designed system both should be corrected.
>

If you don't measure across a long interval, you will never see some of those small errors. When you measure across 1024 seconds you overwhelm the network jitter. The long interval is part of the design for just that reason.

Suppose your frequency error is 5 PPM, or 0.43 seconds per day. Do you think you can measure that error accurately with a 64 second poll interval? If you are working over the internet, an error that small is going to disappear in the jitter. It will be sixteen times more obvious at the longer interval.

You can poll a hardware reference clock at 16 second intervals because the network is not involved! The latency and jitter of a PPS signal over a serial port are an order or two of magnitude less than what you get over a busy network.
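The figures quoted in this message (5 PPM is about 0.43 seconds per day, and the error is sixteen times larger over 1024 s than over 64 s) follow from simple multiplication. A small Python check, with an invented helper name and no assumptions beyond a constant frequency error:

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(freq_error_ppm, interval_s):
    """Time error accumulated over interval_s at a constant frequency error."""
    return freq_error_ppm * 1e-6 * interval_s


per_day = drift_seconds(5, SECONDS_PER_DAY)   # drift over one day
short_poll = drift_seconds(5, 64)             # drift over a 64 s poll
long_poll = drift_seconds(5, 1024)            # drift over a 1024 s poll
ratio = long_poll / short_poll

print(f"per day:     {per_day:.3f} s")
print(f"over 64 s:   {short_poll * 1e3:.2f} ms")
print(f"over 1024 s: {long_poll * 1e3:.2f} ms")
print(f"ratio:       {ratio:.0f}x")
```

The point of comparison is the network jitter: a few hundred microseconds of accumulated drift is easy to lose in millisecond-level jitter, while several milliseconds is not.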
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
Unruh wrote: > 1 people all polling every 16 sec ( or 1 sec) There is nothing in ntp > itself that mandates a longer poll interval. In fact a shorter poll > interval makes ntp much more responsive to changes ( clock drifts, etc) As I understand it, locking maxpoll low only slightly improves responsiveness. The main effect is simply to oversample, as the time constants still adjust to values appropriate to a poll interval of 1024s. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
>However once or twice a day, all the systems experience a
>random, uncorrelated time shift of from one to several
>milliseconds.

What does that mean? I'm guessing that "uncorrelated" means the glitches don't happen at the same time.

Are all clients seeing occasional problems? Do they match cron jobs or some activity burst on the system?

Can you try another network switch? Or maybe even run without any switches? (Plug the CDMA box directly into a second ethernet port.)

Can you try another NTP server? How about setting up a PC, letting it run for a day to establish a good drift file, and then making it run on the local clock only? That will drift, slowly, but there won't be any jumps.

How about adding another client that doesn't do anything? (Turn off cron too.)

--
These are my opinions, not necessarily my employer's. I hate spam.
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
Unruh wrote: >> > I expect that he means the offsets that ntp measures. NTP does NOT correct I suspect that too. > random offsets. Ie, if there is noise source which makes the offsets vary It averages them so as to reduce their effective size. > by 500usec ntp will not get rid of them. You will see them in the offsets > as measured by ntp. Now, the time keeping might (or might not) be more > accurate than that, but those offsets are what I suspect he means. The question is about "measured errors" that significantly exceed the random offsets. In any case the systematic error can also greatly exceed the measured offset - that represents an error that ntpd cannot measure. > > > Almost all disk drives on Linux now use dma. They need to do both and the drivers that caused this problem were capable of using DMA. The problem was, I believe, that certain chipsets were unsafe with DMA, so the default, at least used to be, the unconditional one of doing programmed transfers; you could enable DMA at your own risk. My impression is that there are still enough systems with lost disk interrupts that someone reporting one tick backward steps can reasonably be assumed to have that problem, and it is a reasonable probability for someone who doesn't report the direction of the step. The other common cause of steps, which are balanced in both directions, is not applicable here. ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
Maarten Wiltink wrote:

> You seem to be missing the point. Once the large errors have been
> corrected, NTP goes on to the small errors. For that, it _needs_ a
> longer poll interval. That this gives the server more air is a
> happy coincidence, but not why it does it.

I don't believe it *needs* longer poll intervals; I think they are simply wasteful in that the offsets are low pass filtered in such a way that clamping maxpoll makes very little difference to the result, when the time constant goes high.

I'm not sure that there is any user configurable option that actually does what people think they are doing by locking down maxpoll, in terms of keeping the loop time constant low.

A clamped maxpoll may improve the responsiveness to faults causing time steps of more than 128ms, but one should be attacking the problem, not the symptom.
Re: [ntp:questions] Windows won't Sync to NTP server
David Woolley wrote: > [EMAIL PROTECTED] wrote: >> I can't get a Windows Client to sync to my NTP server. All Linux >> clients work fine. > > You didn't say that you were running a non-NTP compliant version of > w32time on the Windows system (it's illegally using symmetric active). > > It is possible that your version of ntpd does not have the workaround > for the w32time bug that was extensively discussed last week. You > should try setting the options on w32time that causes it to generate > proper client associations, upgrading to Windows 2003 (which is reported > to be compliant). Alternatively, you could run the reference ntpd on the > Windows systems. ntp 4.2.4p4 does not include that fix nor do any of the tarballs for ntp-dev yet. That fix is coming. Martin Burnicki or Ryan Malayter provided instructions on how to get w32time to send client packet instead of symmetric active packets. The clients are getting synched because the restrict statement is denying peers. Danny ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
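The difference between the symmetric active packets and the client packets discussed here comes down to the low three bits of the first byte of the NTP header. A minimal Python sketch, assuming the standard header layout (2-bit leap indicator, 3-bit version, 3-bit mode); the function names are invented for illustration, and this is not code from ntpd or w32time:

```python
MODE_NAMES = {
    0: "reserved",
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",
    4: "server",
    5: "broadcast",
    6: "control",
    7: "private",
}


def decode_first_byte(first_byte):
    """Split the first NTP header byte into (leap, version, mode, mode name)."""
    leap = (first_byte >> 6) & 0x3
    version = (first_byte >> 3) & 0x7
    mode = first_byte & 0x7
    return leap, version, mode, MODE_NAMES[mode]


def build_client_request(version=3):
    """Build a bare 48-byte request with mode 3 (client) and all other fields zero."""
    packet = bytearray(48)
    packet[0] = (version << 3) | 3
    return bytes(packet)


# 0x19 is what an NTPv3 symmetric active packet starts with (LI=0, VN=3, mode=1).
print(decode_first_byte(0x19))
# A client request of the same version starts with 0x1b instead.
print(decode_first_byte(build_client_request()[0]))
```

This matches the "NTPv3, symmetric active, length 48" line in the tcpdump output and the "mode 1" in the ntpd -d output from the original post: the Windows machine is sending mode 1 where a plain client would send mode 3.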
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
On Sun, 30 Mar 2008, [EMAIL PROTECTED] wrote: > At 04:51 PM 3/30/2008 -0700, Bill Unruh wrote: >> Are those on the same day? > > Yes, same day. Uncorrelated to anything I can identify > or each other. Same story on all the boxes. Running > a hefty multi-system compile with heavy NFS and Samba > traffic does not produce these events, though it disturbs > the Windows boxes slightly when CPU goes to 100%. > >> Which "linux" and which "windows" are those graphs since you >> have 2 linux and 2 windows clients. > > That's the dual-core AMD 2.4GHz Athlon Tyan mobo whitebox > runing Centos 4.5 SMP kernel. Similar results on the > Dell Dimension 2400 2.4GHz Intel P4 running Centos 4.5 > mono-processor kernel. > > Windows is a dual-core 3.4GHz Pentium D Tyan mobo whitebox > running 2003 R2 SP2 standard server. > >> As I said, seeing the >> peerstats files would be helpful (offset and roundtrip) > > Might try them later, but I can't belive a high-quality > SMC switch is causing multi-millisecond delays. Just not > possible. Pings are all about 400 microseconds, consistent > but slightly different on each system. Round trip is > 800 microseconds. Attaching the output from a bulk 'ntpq -p' > 'ntptrace' script I have below. Note that's 'ntptrace' > version 4.1 since the 4.2 script has useless offset info. I have had weird latencies on some switches here. And since all your machines are experiencing this, that switch is the only commonality (or the ntp server). Do you have the peerstats on the server as well to make sure that there are not some weird delays there. > >> Also these graphs seem to have cut off the spikes. Are the >> spikes actaully higher or is that an illusion? > > Higher. Sometimes 1ms, sometimes 5-6ms. > >> (Note the spikes are hundreds of usec, not many msec) > > That would be the ~1ms example, check out the other one. > I am also really really really disturbed that you have so many servers. You are trying to test out one specific server. 
The others are simply liable to confuse everything. For example, ntp could,
for some bizarre reason, suddenly decide to use one of those other sites as
the preferred server and give a glitch. And what are all those CDMA servers?
Set your system up with one single source, the one you want to test.

>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> Endrun CDMA
>  LOCAL(0)        LOCAL(0)        10 l   18   64  377    0.000    0.000   0.015
> *HOPF_S(0)       .CDMA.           0 l    6   16  377    0.000    0.000   0.015
> Centos 32
> *eachna          .CDMA.           1 u    3   16  377    0.683   -0.004   0.009
> -tock.usno.navy. .USNO.           1 u  452 1024  377   20.678    1.432   2.822
> +navobs1.wustl.e .GPS.            1 u  479 1024  377   50.136   -1.513   0.164
> +time.nist.gov   .ACTS.           1 u  471 1024  377   66.528   -1.708   0.156
> -tick.ucla.edu   .GPS.            1 u  432 1024  377   87.372    3.296   0.085
> Ultra 10
> *172.29.87.3     .CDMA.           1 u   11   16  377    0.869   -0.016   0.042
> 172.29.87.15: stratum 2, offset -0.07, synch distance 0.00783
> 172.29.87.3: stratum 1, offset -0.18, synch distance 0.00038, refid 'CDMA'
> Ultra 80
> *172.29.87.3     .CDMA.           1 u    4   16  377    0.942   -0.012   0.012
> 172.29.87.17: stratum 2, offset -0.38, synch distance 0.00685
> 172.29.87.3: stratum 1, offset -0.17, synch distance 0.00038, refid 'CDMA'
> 44p
> *172.29.87.3     .CDMA.           1 u   13   16  377    0.809   -0.001   0.016
> 172.29.87.13: stratum 2, offset -0.14, synch distance 0.00627
> 172.29.87.3: stratum 1, offset -0.18, synch distance 0.00038, refid 'CDMA'
> Centos 64
> *172.29.87.3     .CDMA.           1 u   12   16  377    0.664    0.003   0.487
> 172.29.87.19: stratum 2, offset -0.09, synch distance 0.00720
> 172.29.87.3: stratum 1, offset -0.18, synch distance 0.00038, refid 'CDMA'
> W2K3 64
> *172.29.87.3     .CDMA.           1 u    4   16  377    0.734    0.053   0.014
> 172.29.87.20: stratum 2, offset -0.60, synch distance 0.00650
> 172.29.87.3: stratum 1, offset -0.19, synch distance 0.00038, refid 'CDMA'
> XP 32 laptop
> *172.29.87.3     .CDMA.           1 u    7   16  377    0.819    0.468   0.256
> 172.29.87.12: stratum 2, offset -0.000173, synch distance 0.00655
> 172.29.87.3: stratum 1, offset -0.17, synch distance 0.00038, refid 'CDMA'

--
William G. Unruh   |  Canadian Institute for |  Tel: +1(604)822-3273
Physics&Astronomy  |  Advanced Research      |  Fax: +1(604)822-5324
UBC, Vancouver,BC  |  Program in Cosmology   |  [EMAIL PROTECTED]
Canada V6T 1Z1     |  and Gravity            |  www.theory.physics.ubc.ca/
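A minimal sketch of the peerstats check suggested in this message. It assumes the usual ntpd peerstats record layout (MJD day, seconds past midnight, peer address, status word, offset, delay, dispersion, jitter, with times in seconds). The function name, the 1 ms threshold and the sample records are made up for illustration and are not taken from the poster's systems. The idea: if the roundtrip delay jumps in the same sample as the offset, the switch or the server is the more likely culprit; if the delay stays flat, look at the client.

```python
def find_spikes(lines, peer, offset_limit=0.001):
    """Return (seconds, offset, delay) for samples of `peer` whose
    absolute offset exceeds offset_limit (all values in seconds)."""
    spikes = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or fields[2] != peer:
            continue  # skip malformed records and other peers
        second = float(fields[1])   # seconds past midnight
        offset = float(fields[4])   # clock offset
        delay = float(fields[5])    # roundtrip delay
        if abs(offset) > offset_limit:
            spikes.append((second, offset, delay))
    return spikes


# Hypothetical records for illustration only.
sample = [
    "54555 100.000 172.29.87.3 9614 -0.000004 0.000683 0.000938 0.000009",
    "54555 116.000 172.29.87.3 9614 0.005200 0.011100 0.000938 0.003700",
    "54555 132.000 172.29.87.3 9614 0.000012 0.000690 0.000938 0.000010",
]

for second, offset, delay in find_spikes(sample, "172.29.87.3"):
    print(f"t={second:.0f}s offset={offset * 1e3:.3f} ms "
          f"delay={delay * 1e3:.3f} ms")
```

Running the same check on the server's own peerstats and on each client would show whether the spikes line up in time across machines.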
Re: [ntp:questions] high precision tracking: trying to understand sudden jumps
"Maarten Wiltink" <[EMAIL PROTECTED]> writes: >"Unruh" <[EMAIL PROTECTED]> wrote in message >news:[EMAIL PROTECTED] >> "Richard B. Gilbert" <[EMAIL PROTECTED]> writes: >>> Forcing the poll interval to 16 seconds is not always a good idea! >>> Ntpd will select a poll interval, generally starting at 64 seconds, >>> and ramping up to as long as 1024 seconds as the clock is beaten >>> into submission! >> >> It is his network, he is not going to overload it. So, if he wants a >> 16 sec poll interval that is up to him. >> I agree it is not a good idea for remote servers, but on his own system >> it is fine. >[...] >> ??? The longer polls are in order not to swamp the remote server whith >> 1 people all polling every 16 sec ( or 1 sec) There is nothing in >> ntp itself that mandates a longer poll interval. In fact a shorter poll >> interval makes ntp much more responsive to changes ( clock drifts, etc) >>> The very short poll intervals correct large errors quickly and the >>> very long intervals correct small errors very accurately! >> >> No for a properly designed system both should be corrected. >You seem to be missing the point. Once the large errors have been >corrected, NTP goes on to the small errors. For that, it _needs_ a >longer poll interval. That this gives the server more air is a >happy coincidence, but not why it does it. I have no idea what this means. ntp simply runs a second order feedback network It does not do anything for "large and small" errors. >Given the measurement error, you need to let the small error >accumulate over a longer period. Otherwise it would simply be >lost in the noise. No idea what you mean. >Do the math: assume the (constant!) measurement error to be +/- 1 ms, >the frequency error in my local host to be 1000 PPM (1/1000). With a >1 s polling interval, the real value is 1 ms and the measurement >will be between 0 and 2 ms. Not very good. 
With a 1000 s polling >interval, the real value is 1 s and the measurement will be between >0.999 and 1.001 s. Now that's useful to correct your clock with. You are not talking about large and small errors, you aree talking about phase and frequency errors. And no computer has fixed eitehr phase of frequency errors. They keep changing. Thus integrating for a longer time does not help if the frequency errors ( drift) keeps changing. >Now use more realistic numbers, like 50 PPM to start with, a polling >interval of 64 s and I'm not exactly sure what for the measuring >jitter. But the gist should be clear: that 50 PPM will go down, the >SNR will worsen, and the polling interval should go up to improve it >again. ??? What you are descibing in one of the key problems with the ntp algorithm. >Starting with a short interval is good to correct large errors >quickly. Backing off once you've done so is good to avoid pestering >the server, but it's also good to correct small errors accurately, >and _that_ is why it's done. And of course, once a larger than >expected offset is measured, the polling interval is shortened >again. Anyway, that is not his problem. He is getting ms spikes in the loopfilter. Those wipe out anything else he does. It destroys all attempts by ntp to discipline the clock. >Groetjes, >Maarten Wiltink ___ questions mailing list questions@lists.ntp.org https://lists.ntp.org/mailman/listinfo/questions
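The arithmetic in the quoted "do the math" paragraph can be sketched as below. It takes the quoted assumptions at face value: a constant +/- 1 ms measurement error and a frequency error that stays constant over the whole poll interval. The function name is made up for illustration, and the 1024 s row simply reuses the maximum poll interval mentioned earlier in the thread. The objection in the reply still applies: real clocks drift, so the constant-frequency assumption weakens as the interval grows.

```python
def accumulated_offset(freq_error_ppm, poll_interval_s):
    """Offset in seconds built up over one poll interval by a constant
    frequency error given in parts per million."""
    return freq_error_ppm * 1e-6 * poll_interval_s


MEASUREMENT_ERROR_S = 0.001  # +/- 1 ms, as in the quoted example

# (frequency error in PPM, poll interval in seconds)
cases = [(1000, 1), (1000, 1000), (50, 64), (50, 1024)]

for ppm, poll in cases:
    offset = accumulated_offset(ppm, poll)
    # How large the accumulated offset is relative to the measurement error.
    ratio = offset / MEASUREMENT_ERROR_S
    print(f"{ppm:5d} PPM, poll {poll:5d} s: "
          f"offset {offset * 1e3:9.3f} ms, "
          f"offset / measurement error = {ratio:8.2f}")
```

With these assumptions the ratio grows linearly with the poll interval, which is the point the quoted text is making; whether that helps in practice depends on how stable the frequency error is over that interval.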