Re: [blfs-dev] NTP
Qrux wrote:
> It also wasn't the question I was asking. I run ntpd in daemon mode,
> because I want it to keep correcting my time after boot, and that's
> where the slewing/stepping behavior is relevant.

Yes, daemon mode is the script default.

> * So, I propose turning -x off.

OK, I won't make a special commit for it, but it will be that way with the next bootscript commit.

-- Bruce

--
http://linuxfromscratch.org/mailman/listinfo/blfs-dev
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page
Re: [blfs-dev] NTP
On Thu, 2012-02-16 at 14:13 -0800, Qrux wrote:
> On Feb 16, 2012, at 4:38 AM, Matthew Burgess wrote:
>
> > On Thu, 16 Feb 2012 11:16:12 +, Andrew Benton wrote:
> >> On Wed, 15 Feb 2012 18:47:37 -0800
> >> Qrux wrote:
> >>
> >>> * So, I propose turning -x off.
> >>
> >> I agree, I run ntpd -g
> >> However, I also think the ntpd bootscript will work fine for most
> >> people and for those (like me) who think it should be done differently
> >> it's trivial to edit the bootscript; your distro, your rules and all
> >> that ;)
> >
> > It probably doesn't affect many LFSers, but Oracle's RAC installation/
> > configuration wizard explicitly checks for '-x' in the ntpd options.
> >
> > It does this because you really don't want your database server's time
> > from jumping backwards, and '-x' (or 'tinker step 0' in /etc/ntp.conf)
> > is the only way to guarantee that won't happen.
>
> In case anyone has forgotten, NTP gives slewing by default. The question is
> not whether monotonically increasing time is good. You get that with OR
> WITHOUT -x. The issue is, -x doesn't guarantee anything.

Good, I'm glad you said that, because that was my understanding from reading the man page as well, which made me wonder why Oracle demands '-x'. Sometimes though, in order to get past their 1st/2nd line support, it's easier to just give in and do what they want rather than what is technically correct :-)

> So, getting back to your RAC system...Sure, it can check for it. But let's
> hope your database app doesn't stop operating when you can't find a
> timesource.

In the particular environment that I was directly involved in, our time source was the RAC server's default gateway (or more accurately, the NTP service running on the Cisco switch, which the RAC servers were directly connected to). If they lost their time source, we'd have much bigger issues than their notion of what the correct time was :-)

Regards,

Matt.
Re: [blfs-dev] NTP
On Feb 16, 2012, at 4:38 AM, Matthew Burgess wrote:

> On Thu, 16 Feb 2012 11:16:12 +, Andrew Benton wrote:
>> On Wed, 15 Feb 2012 18:47:37 -0800
>> Qrux wrote:
>>
>>> * So, I propose turning -x off.
>>
>> I agree, I run ntpd -g
>> However, I also think the ntpd bootscript will work fine for most
>> people and for those (like me) who think it should be done differently
>> it's trivial to edit the bootscript; your distro, your rules and all
>> that ;)
>
> It probably doesn't affect many LFSers, but Oracle's RAC installation/
> configuration wizard explicitly checks for '-x' in the ntpd options.
>
> It does this because you really don't want your database server's time
> from jumping backwards, and '-x' (or 'tinker step 0' in /etc/ntp.conf)
> is the only way to guarantee that won't happen.

Interesting! Sounds like Oracle...

As for the issue--I still stand by my original position that defaults should be sensible, and obey "least surprise". Running NTP by default with -x is surprising. I'll leave the 'why' below the fold.

	Q

* * *

Technical details follow, for those who are jumping up and down saying: "My app cares about time! So, running NTP with -x protects me!"

In case anyone has forgotten, NTP gives slewing by default. The question is not whether monotonically increasing time is good. You get that with OR WITHOUT -x. The issue is, -x doesn't guarantee anything.

Man page: "-x, --slew: Slew up to 600 seconds. Normally, the time is slewed...and stepped if above the threshold. This option sets the threshold to 600 s, which is well within the accuracy window to set the clock manually."

It simply raises the threshold to 600s, from 128ms. And, in cases where your clock is drifting by more than 10 minutes in the polling interval (and you're saying your app cares about time?) then it wants YOU TO MANUALLY ADJUST THE TIME, before running NTP again. I want to see you do that by hand, and keep things monotonically increasing, especially if you drifted forward.
I know...you'll shut down your production machine until those 600 s have elapsed, right?

And, in that same situation where you've drifted beyond 600s, if you combine -x with -g, you simply get a big step that doesn't shut down ntpd--but, the point is, you get a STEP. Lacking the -g, ntpd simply stops itself.

I, too, care about time in my apps. So, I've looked into it. And, in the little I know, -x protects nothing. People spend all kinds of time worrying about various other minutiae (MTBF of hard drives, vibration in their systems causing bad feedback on platters, dual-redundant power supplies, etc, etc, etc) and they want absolutely order-dependent mission-critical applications to depend on the same technology that powers their Timex from 1982?

No. Real apps that *really* care about time go out of their way to make sure their time hardware is as good as anything else. They get crystal clocks enclosed inside a temperature-controlled, vibration-dampened enclosure with electronic conditioning. And, if they're careful, they use the CO as a *counter*, not as a *clock*. Monotonicity is about counting ticks on a counter, not getting time from a clock.

So, -x is not a guarantee. It's a stop-gap, for when your clock (or the environment around your clock) is failing miserably. If you're in a situation where you're drifting for more than 600 seconds in a single polling interval, NTP is going to step you anyway, forward or back. Or, it will simply quit. And let you do it. At which point...What happens in your situation? You shut down your high-volume production machine because you lost access to your timesource?

Plus, this is completely missing the point. It's not about whether or not slewing is good. It's about choosing between:

* (A) slew beyond 128ms drift
* (B) using a kernel discipline

The issue is, if you care about timekeeping (Oracle default installs don't give a flying crap), you don't let your clock drift more than (and I'm averaging here), 43 seconds/day. Why 43?
NTP already keeps monotonically increasing time by slewing single deltas less than 128ms--and that all happens without -x. 43 seconds is simply the aggregate of the total number of 128ms drifts that NTP can correct BY DEFAULT (i.e., without -x) in a given day.

The arithmetic--if you accept the fact that the "typical Unix slew rate is limited to 0.5 ms/s", a 128ms drift will take 256 seconds to amortize. So, if you lose less than 128ms every 256 seconds, that's fine, because THE DEFAULT SLEWING WILL TAKE CARE OF YOU. And, 128ms every 256 seconds totals to 43 seconds per day. And, up to that amount of drift, the default slew will take care of it.

There is an exception, which is where you get single drifts in a polling interval past 128ms. The default maximum polling interval is 1024 s. Which means your clock would have to have a stability of less than 1 part in
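
The slew budget described in this message can be checked with a few lines of arithmetic. This is only a minimal sketch, assuming the 0.5 ms/s slew-rate limit and the 128 ms step threshold quoted from the man page in this thread; the constant and function names are invented for illustration and are not part of ntpd.

```python
# Figures quoted in the thread (from the ntpd man page); treated as assumptions.
SLEW_RATE_S_PER_S = 0.0005   # typical Unix kernel slew limit: 0.5 ms per second
STEP_THRESHOLD_S = 0.128     # default step threshold: 128 ms
SECONDS_PER_DAY = 86_400


def amortization_time(offset_s: float) -> float:
    """Seconds of elapsed time needed to slew away `offset_s` seconds of error."""
    return offset_s / SLEW_RATE_S_PER_S


def max_daily_slew() -> float:
    """Largest total offset (in seconds) that slewing alone can absorb in one day."""
    return SLEW_RATE_S_PER_S * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"128 ms offset takes {amortization_time(STEP_THRESHOLD_S):.0f} s to slew")
    print(f"slew budget per day: {max_daily_slew():.1f} s")
```

Run as a script, this reports 256 s for a 128 ms offset and a daily budget of 43.2 s, which matches the "43 seconds per day" figure above.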
Re: [blfs-dev] NTP
On Thu, 16 Feb 2012 11:16:12 +, Andrew Benton wrote:
> On Wed, 15 Feb 2012 18:47:37 -0800
> Qrux wrote:
>
>> * So, I propose turning -x off.
>
> I agree, I run ntpd -g
> However, I also think the ntpd bootscript will work fine for most
> people and for those (like me) who think it should be done differently
> it's trivial to edit the bootscript; your distro, your rules and all
> that ;)

It probably doesn't affect many LFSers, but Oracle's RAC installation/configuration wizard explicitly checks for '-x' in the ntpd options.

It does this because you really don't want your database server's time from jumping backwards, and '-x' (or 'tinker step 0' in /etc/ntp.conf) is the only way to guarantee that won't happen. Interestingly, apparently Dovecot doesn't like time going backwards either; I'm sure there are other servers that prefer a uni-directional arrow of time too.

For more 'normal' setups, I'd agree that calling 'ntpd -g -q' to do an initial time sync at bootup, followed by ntpd without any other options would be sufficient; the odds that the ntp pool servers most people use are going to jump backwards are so small, I don't think it's worth guarding against by using the '-x' option by default.

Regards,

Matt.
Re: [blfs-dev] NTP
On Wed, 15 Feb 2012 18:47:37 -0800
Qrux wrote:

> * So, I propose turning -x off.

I agree, I run ntpd -g

However, I also think the ntpd bootscript will work fine for most people and for those (like me) who think it should be done differently it's trivial to edit the bootscript; your distro, your rules and all that ;)

Andy
Re: [blfs-dev] NTP
On Feb 15, 2012, at 5:00 PM, Bruce Dubbs wrote:

> Qrux wrote:
>> Is there a reason ntpd is run with -x?
>>
>> The big slew is nice, but is there a reason it's preferred over the kernel
>> discipline?
>
> When you are booting, there is probably nothing else really depending on
> timestamps.

Whether or not things run at boot time are sensitive to timestamps is irrelevant. Because, if nothing cares, then it doesn't matter whether you step or slew.

It also wasn't the question I was asking. I run ntpd in daemon mode, because I want it to keep correcting my time after boot, and that's where the slewing/stepping behavior is relevant. From the man page for ntpd, about -x:

"Slew up to 600 seconds.

"Normally, the time is slewed if the offset is less than the step threshold, which is 128 ms by default, and stepped if above the threshold...Since the slew rate of typical Unix kernels is limited to 0.5 ms/s, each second of adjustment requires an amortization interval of 2000s."

===

If the kernel slew rate is limited to 0.5 ms/s, then your clock had better not drift by more than ~43 seconds/day, because no amount of slew will correct this.

So, to me, this is kind of silly. Turning slew up to 600 s is kinda meaningless, unless you can also adjust the slew rate (and I don't see any mention of a kconfig parameter to change that). I would bet that a 43 s/d drift is rare on "reasonably current hardware", and that if you're seeing it, you're doing something silly like chaining UPSs or keeping your PCs in a bad thermal environment (clock oscillators are very sensitive to temperature).

* Most people probably don't drift by more than 43 s/d.
* If they did, -x isn't their solution; it just hides a bigger issue.
* Kernel discipline (which -x disables) handles leap-seconds better.
* So, I propose turning -x off.

In addition, the BLFS ntpd is also run with -g.
Long story short, it's better to step "while you can" (i.e., before anything time-sensitive starts, like your application stack with database servers or network authentication servers like LDAP or Kerberos). In fact, the kernel does it anyway, when it loads the "reference time" from the CMOS RTC.

Last leap-second was in 2008. The next leap-second was originally scheduled for June 2012. I heard back in Jan that might be postponed.

Either way, I think '-x' should not be the default.

	Q

* * * Additional Info * * *

Getting back to -x...I guess slewing is fine if you really need a slew of up to 600 seconds, and you have the kernel support to do it. But, why choose that as the default over kernel discipline? The situation where -x would benefit you would be the most rare of situations where you either have a one-time error and could afford a 14d slew, or you see this kind of drift often enough and could adjust the kernel slew rate to deal with it. In fact, if your system needs -x, you probably don't care about good time anyway--or shouldn't be depending on it. If you need to run ntpd with -x, you probably have bigger fish to fry, first.

Time discipline is about who gets to discipline the clock, and how. NTP can do it through adjtime()--with microsecond resolution--or through adjtimex() which allows much higher precision (at the cost of portability, since, AFAIK, that system call is only available on Linux & FreeBSD). The latter (adjtimex) requires kernel support. In addition to precision (though, at the cost of slightly lower accuracy due to less algorithmic sophistication), there is another benefit to kernel discipline...

There is a time when kernel discipline is better than "always slewing by the default slew rate limit"...Which is during leap seconds. During a leap-second, using a non-kernel discipline and a slow monotonic slew, time will go forward a little bit faster, but very slowly.
Which means, using -x, the full extent of that leap second won't be registered until 2000 seconds later. Practically speaking, it comes down to: when the next leap second hits, do you want to be off by over half-a-second over a period of 1000 s, or would you rather have each timestamp be off by a few dozen microseconds (arguably the difference between the higher-accuracy-NTP discipline, or the kernel's own slightly-less-than-fanatically-accurate discipline)? Using the kernel discipline, which can overcome the default slew rate, it will be registered very-near immediately.

I would think leap-second-correctness > possibly-absurdly-high-accuracy-that-may-not-matter. You could reframe my original question as: "Why is the BLFS default choice to opt for a possibly-more-accurate time in place of a more-correct-time?"

The NTP slewing is maybe more sophisticated than the kernel's. But, it won't handle leap-seconds as well.
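
To put the "14d slew" and the leap-second figures from this message into numbers, here is a small sketch under the same assumption of a fixed 0.5 ms/s slew rate. The helper names are hypothetical, and the model deliberately ignores everything else ntpd does (filtering, polling, frequency correction); it only illustrates the arithmetic.

```python
SLEW_RATE_S_PER_S = 0.0005  # 0.5 ms/s, as quoted from the ntpd man page (assumption)
SECONDS_PER_DAY = 86_400


def slew_duration(offset_s: float) -> float:
    """Seconds needed to remove `offset_s` of error at the fixed slew rate."""
    return offset_s / SLEW_RATE_S_PER_S


def residual_error(offset_s: float, elapsed_s: float) -> float:
    """Error still remaining `elapsed_s` seconds after slewing began."""
    return max(0.0, offset_s - SLEW_RATE_S_PER_S * elapsed_s)


if __name__ == "__main__":
    print(f"1 s offset:   {slew_duration(1.0):.0f} s")
    print(f"600 s offset: {slew_duration(600.0) / SECONDS_PER_DAY:.1f} days")
    print(f"leap second, error after 1000 s: {residual_error(1.0, 1000.0):.2f} s")
```

Under these assumptions a one-second correction takes 2000 s, a full 600 s slew takes about 13.9 days, and half a second of leap-second error is still outstanding after 1000 s.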
Re: [blfs-dev] NTP
Qrux wrote:
> Is there a reason ntpd is run with -x?
>
> The big slew is nice, but is there a reason it's preferred over the kernel
> discipline?

When you are booting, there is probably nothing else really depending on timestamps. We might as well just slew the time to be correct. In most cases, the time offset should be small.

-- Bruce
[blfs-dev] NTP
Is there a reason ntpd is run with -x?

The big slew is nice, but is there a reason it's preferred over the kernel discipline?

	Q