Re: [ntp:questions] Q: Disabling 11 minute mode
Serge Bets wrote:

On Tuesday, January 22, 2008 at 19:02:23 +, Dean S. Messing wrote:

Is it possible to disable 11 minute mode from ntp.conf?

No. You have to tweak the kernel. If you have the PPSkit:

| $ echo 0 > /proc/sys/kernel/time/rtc_update

Otherwise you have to patch time.c in the kernel. Dead easy, just a matter of commenting out a line or two. I patch all my kernels this way, read and write the RTC exclusively with hwclock 2.31, and get far better accuracy. The main purpose of an RTC is to initialise the system time at powerup, isn't it? Most people start up in the morning within around half a second of the true time, and later ntpd has to step this to UTC. I routinely start up within a few milliseconds of the true time, and the offset is quickly slewed. My last step event was years ago.

Thanks Serge. I looked up PPSkit. Looks good, but I'm going to have to learn how to patch the Fedora kernel to install PPSkit.

But I'm discovering that I have rather deeper problems on my machine (a Dell 490 Precision). Using adjtimex --compare to track the drift between system and cmos clock (ntpd not running), I see that the RTC is behaving _very_ strangely. It will begin to return screwy values after several hours of doing adjtimex --compare and then get to the point where hwclock --show hangs. So my desire to turn off 11 minute mode when ntp is running is moot.
For your amusement, here's a snippet of the output of adjtimex --compare with an interval of 60 seconds:

1200982902      0.001784      -2.0  10001 3929312  10001 4060301
1200982962      0.001792       0.1  10001 3929312  10001 3920719
1200983022      0.002051       4.3  10001 3929312  10001 3646240
1200983082      0.001828      -3.7  10001 3929312  10001 4173062
1200983142      0.001756      -1.2  10001 3929312  10001 4007957
1200983202      0.002025       4.5  10001 3926656  10001 3632906
1200983261      0.500370    8305.8  10001 3926288   9918 3549307
1200983281     40.001689  658355.3  10001 3926288   3418  301130
1200983341     40.001931       4.0  10001 3926288  10001 3661966
1200983407     34.001894     -10.6  10001 3926288  11001 3966652
1200983461     40.001646       5.9  10001 3926288   9001 4197121
1200983521     40.001890       4.1  10001 3926288  10001 3659882
1200983609     12.001763     -48.8  10001 3924640  14668 1878649
1200983641     40.001606      44.0  10001 3924640   5334 6280787
1200983741      0.001726     -64.7  10001 3924640  16668 1609118
1200983761     40.001911      69.8  10001 3924640   3334 5907090
1200983821     40.001553      -6.0  10001 3924640  10001 4315525
1200983921      0.001748     -63.4  10001 3924640  16668 1527086
1200983941     40.001894      69.1  10001 3924640   3334 5949798
1200984001     40.001554      -5.7  10001 3924640  10001 4295994
1200984101      0.001700     -64.2  10001 3921488  16668 1577580
1200984161      0.001291      -6.8  10001 3921104  10001 4367718
1200984221      0.001532       4.0  10001 3921104  10001 3657823
1200984275      6.001806      14.6  10001 3921104   9001 3621886
1200984301     40.001722      55.3  10001 3920368   4334 6196568
1200984361     40.001974       4.2  10001 3920368  10001 3645108
1200984427     34.001868     -11.8  10001 3920368  11001 4036253
1200984481     40.001679       6.8  10001 3920368   9001 4126878

Things got so bad that the output eventually became:

199345540  1001658696.064552   1592732.9  10001 3879376  -5926 1725431
199345717  1001658600.500585  -1592732.8  10001 3879376  25928 6027853
199345718  1001658696.023830   1592054.1  10001 3879376  -5919  335126
199345896  1001658600.500586  -1592054.1  10001 3879376  25922  868985
199345897  1001658696.045414   1592413.8  10001 3879376  -5923 2975047

Before it went crazy, it had run smoothly for 5 or 6
hours. When I rebooted into the BIOS and looked at the RTC, it was off by several years. This has now happened thrice, but only when adjtimex is running in compare mode for long periods. I have no idea what this means. The CMOS battery does not appear to be the problem since, after a reboot, the RTC keeps proper time indefinitely (modulo drift), unless and until I run adjtimex --compare for several hours.

Anyway, thanks for the info on 11 minute mode. Wish I could fix my RTC problem.

Dean
___
questions mailing list
questions@lists.ntp.org
https://lists.ntp.org/mailman/listinfo/questions
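One way to read the 60-second sample in this message: adding the two leftmost columns (cmos time and system-cmos) gives the system time at each sample. The following is a minimal sketch of that arithmetic, not anything adjtimex itself does. It assumes that fused fields in the archived output such as "120098328140.001689" split as 1200983281 and 40.001689; the helper name `steps` is made up for illustration.

```python
# Rows copied from the adjtimex --compare output: (cmos time, system-cmos).
# Assumption: fused fields such as "120098328140.001689" split as shown here.
SAMPLES = [
    (1200983202, 0.002025),
    (1200983261, 0.500370),
    (1200983281, 40.001689),
    (1200983341, 40.001931),
    (1200983407, 34.001894),
    (1200983461, 40.001646),
]


def steps(samples):
    """Return (system_step, rtc_step) in seconds between consecutive samples."""
    out = []
    for (rtc0, off0), (rtc1, off1) in zip(samples, samples[1:]):
        # System time at a sample is the RTC reading plus the reported offset.
        sys_step = (rtc1 + off1) - (rtc0 + off0)
        out.append((round(sys_step, 3), rtc1 - rtc0))
    return out


for sys_step, rtc_step in steps(SAMPLES):
    print(f"system advanced {sys_step:7.3f} s, RTC advanced {rtc_step:3d} s")
```

Under that reading, the reconstructed system time keeps advancing by roughly 60 s per sample while the RTC column advances by 59, 20, 60, 66 and 54 s, which points at the RTC readings rather than the system clock.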
[ntp:questions] Q: Disabling 11 minute mode
Is it possible to disable 11 minute mode from ntp.conf?

I've tried the command disable kernel, but that appears only to change the way time discipline is maintained and does nothing for 11 minute mode.

If using ntp.conf is not the way, what is?

Thanks for your help.

Dean
Re: [ntp:questions] Q: Disabling 11 minute mode
Richard B. Gilbert wrote:

Dean S. Messing wrote:

Is it possible to disable 11 minute mode from ntp.conf? I've tried using the command disable kernel but that appears to change the way time discipline is maintained, but does nothing for 11 minute mode. If using ntp.conf is not the way, what is? Thanks for your help. Dean

What IS 11 minute mode??

Oops. Sorry! I thought everyone reading this list (who could answer my question :-) would know. David Woolley gave you a good answer already, so I'll only add that if you want to see whether you are in 11 minute mode, do adjtimex -p and look at the status: value. If it's odd (LSB==1), then your kernel is in 11 minute mode.

Now, if someone would tell me how to disable it (short of hacking time.c) I'd be most thankful. I tried turning it off with adjtimex -S 64, but ntp changes it back again in a few minutes. I'd like to disable it, but keep ntp kernel discipline so I can do some analysis of my RTC.

Dean
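The status: value printed by adjtimex -p is a bit mask, so it can help to decode it rather than eyeball it. Below is a minimal sketch that picks out the two bits that come up in this message; the constants are the values from the Linux <linux/timex.h> header, and the function name `describe_status` is made up here. As far as I understand, Linux kernels of this vintage only perform the periodic RTC write while STA_UNSYNC is clear, which would be consistent with adjtimex -S 64 switching it off and ntpd switching it back on when it clears that bit. Treat that as an assumption to check against your own kernel source.

```python
# Bit values from the Linux <linux/timex.h> header.
STA_PLL = 0x0001     # kernel PLL updates enabled
STA_UNSYNC = 0x0040  # clock marked unsynchronized (the 64 in "adjtimex -S 64")


def describe_status(status):
    """Decode the two status bits discussed in this thread."""
    return {
        "pll_enabled": bool(status & STA_PLL),
        "unsynchronized": bool(status & STA_UNSYNC),
    }


# Example status words: PLL only, UNSYNC only, both.
for value in (1, 64, 65):
    print(value, describe_status(value))
```

Feeding it the number shown on the status: line makes it easy to see which of the two bits ntpd has changed a few minutes after a manual adjtimex -S.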
Re: [ntp:questions] quirky adjtimex behaviour [SOLVED]
hal-usenet wrote:

Dean Messing wrote:

I am seeing strange behaviour on my _x86_64 Fedora 7 desktop workstation with regard to the system-cmos time that `adjtimex' reports.

[snip]

It seems that leaves two other possibilities: a bug in adjtimex or a bug in the kernel. That's where I am right now.

My guess is that the system/kernel is working correctly and that the adjtimex utility is printing out misleading stuff.

The CMOS/hardware clock only returns the time to the nearest second. I think that would cause quirks like this. If the code has a loop that does a bit of work and sleeps for N seconds, and the bit of work takes 0.1 second, then the time when the CMOS clock is read will drift by 0.1 second each time around the loop.

If you want to play and you can find the source, try changing the code that reads the CMOS clock to spin in a loop reading it until it changes. That will give you the time early in the second.

Your guess is right, Hal.

It's been nearly three weeks since I've had a few minutes to further pursue this. I just replaced version 1.23 of adjtimex with an old version 1.20 and the quirky behaviour disappeared.

I first noticed it on my new Fedora 7 with version 1.21. When I looked on the adjtimex site I saw it was up to 1.23, so I thought that surely this problem had been detected and fixed. When it didn't go away in 1.23 I looked elsewhere: 64 bit machine, new kernel, etc.

I'll write the author and report the bug. I'm really surprised nobody has reported it already.

Dean
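Hal's explanation and suggested fix can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the adjtimex source: time is modelled in integer milliseconds, the simulated RTC simply truncates to whole seconds, and all function names are invented for the example.

```python
def rtc_read_ms(true_ms):
    """Simulated RTC: reports whole seconds only, like the CMOS clock."""
    return (true_ms // 1000) * 1000


def naive_compare(start_ms, work_ms, sleep_ms, count):
    """Read the RTC whenever the loop gets to it.

    The loop period is work + sleep, so the apparent offset walks by
    `work_ms` on every pass and wraps when it reaches one second.
    """
    t, offsets = start_ms, []
    for _ in range(count):
        offsets.append((t - rtc_read_ms(t)) / 1000)
        t += work_ms + sleep_ms
    return offsets


def edge_aligned_compare(start_ms, work_ms, sleep_ms, count):
    """Wait for the RTC second to change before comparing."""
    t, offsets = start_ms, []
    for _ in range(count):
        # Stand-in for spinning on the RTC until its value ticks over.
        t = rtc_read_ms(t) + 1000
        offsets.append((t - rtc_read_ms(t)) / 1000)
        t += work_ms + sleep_ms
    return offsets


print(naive_compare(400, 100, 10000, 8))
print(edge_aligned_compare(400, 100, 10000, 8))
```

In the naive loop the reported offset climbs by 0.1 s per sample and then drops back by about a second, whatever sleep interval is chosen, which is the sawtooth seen in the quirky output. Reading on the second boundary removes the walk because every comparison is made at the same phase of the RTC second.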
Re: [ntp:questions] quirky adjtimex behaviour
Hi Jan, all.

Jan Ceuleers wrote:

Dean S. Messing wrote:

I am seeing strange behaviour on my _x86_64 Fedora 7 desktop workstation with regard to the system-cmos time that `adjtimex' reports.

I've not read your whole post; it's clear that you've been wrestling with this problem for a while and have done quite a bit of work already.

Well, I've done what I can but I'm really no expert on this stuff. That's why I wrote to this list, which seems to be populated by _many_ very knowledgeable people.

Can I however suggest that you first try and eliminate CPU frequency scaling as a cause of the symptoms you're seeing: use cpufreq-set -g to select a policy that results in a constant CPU frequency and then check if this changes the behaviour (or renders it more predictable).

I installed the cpufreq-utils package. The result of `cpufreq-info' is:

[EMAIL PROTECTED] ~]# cpufreq-info
cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to [EMAIL PROTECTED], please.
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
analyzing CPU 1:
  no or unknown cpufreq driver is active on this CPU
analyzing CPU 2:
  no or unknown cpufreq driver is active on this CPU
analyzing CPU 3:
  no or unknown cpufreq driver is active on this CPU

Also, /sys/devices/system/cpu/cpu{0,1,2,3}/cpufreq/ does not exist on this system.

I don't know much about cpufreq adjustments. Should I be looking elsewhere? Note that this is a desktop workstation. Will the CPU frequency (actually there are four CPUs in two dual-core units) change on such a machine?

If you or others wouldn't mind reading my whole original post (it's not _that_ long :-) maybe some other ideas might occur.

Thanks.
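The sysfs check mentioned in this message can be scripted. The sketch below only tests whether each cpuN directory has a cpufreq subdirectory; the function name and the `root` parameter are made up for illustration, and the default path is the one quoted in the message.

```python
from pathlib import Path


def cpus_with_cpufreq(root="/sys/devices/system/cpu"):
    """Return the cpuN directory names under `root` that contain cpufreq/."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p.name for p in base.glob("cpu[0-9]*") if (p / "cpufreq").is_dir()
    )


print(cpus_with_cpufreq() or "no cpufreq directories found")
```

An empty result agrees with what cpufreq-info printed: no cpufreq driver is active on any CPU, so frequency scaling through that interface is presumably not in play on this machine.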
[ntp:questions] quirky adjtimex behaviour
First, apologies if this is the wrong list for this. Please direct me to the right place if it is.

I am seeing strange behaviour on my _x86_64 Fedora 7 desktop workstation with regard to the system-cmos time that `adjtimex' reports. Below is an illustration:

[EMAIL PROTECTED] ~]# adjtimex --utc --compare=20 --interval=10
                                      --- current ---   -- suggested --
cmos time     system-cmos  error_ppm   tick      freq    tick      freq
1199380337       0.438974
1199380346       0.540422    10144.8  10001       395
1199380355       0.642382    10196.0  10001       395    9899   4212512
1199380364       0.744336    10195.4  10001       395    9899   4251575
1199380373       0.845293    10095.7  10001       395    9900   4230787
1199380382       0.946259    10096.6  10001       395    9900   4172975
1199380392       0.047206   -89905.3  10001       395   10900   4297975
1199380401       0.149166    10196.0  10001       395    9899   4210950
1199380410       0.251134    10196.8  10001       395    9899   4160950
1199380419       0.353096    10196.2  10001       395    9899   4198450
1199380428       0.455055    10195.9  10001       395    9899   4218762
1199380437       0.557004    10194.9  10001       395    9899   4284387
1199380446       0.658973    10196.9  10001       395    9899   4153137
1199380455       0.759925    10095.2  10001       395    9900   4265162
1199380464       0.860889    10096.4  10001       395    9900   4185475
1199380473       0.961848    10095.9  10001       395    9900   4218287
1199380483       0.063806   -89804.2  10001       395   10899   4225012
1199380492       0.165759    10195.3  10001       395    9899   4257825
1199380501       0.267719    10196.0  10001       395    9899   4212512
1199380510       0.369682    10196.3  10001       395    9899   4192200
[EMAIL PROTECTED] ~]#

As you can see, the system time appears to advance by almost exactly 0.1 seconds every 10 seconds relative to the RTC. Then, just as they are about to get out of phase by 1 second, something causes either the system clock to jump back by ~1 second or the RTC to jump forward, or so it appears.
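The error_ppm column in the output above appears to be nothing more than the change in system-cmos between two samples divided by the requested interval. The sketch below reproduces two of the printed values on that assumption; the function name is invented, and the inputs assume that fused fields in the archived output such as "0.54042210144.8" split as 0.540422 and 10144.8.

```python
def error_ppm(prev_offset, offset, interval):
    """Change in system-cmos offset over the nominal interval, in ppm."""
    return (offset - prev_offset) / interval * 1e6


# system-cmos values copied from the --interval=10 output above.
ordinary_step = error_ppm(0.438974, 0.540422, 10)
wrap_step = error_ppm(0.946259, 0.047206, 10)

print(round(ordinary_step, 1))
print(round(wrap_step, 1))
```

On that reading, a steady 0.1 s step per sample shows up as roughly 10,000 ppm at a 10-second interval, and the large negative entries are simply the samples where the offset wraps back by about 0.9 s.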
Furthermore, if I change --interval to something odd (like 17), the delta from line to line remains about the same at 0.1 sec, which it should not if there were a real slew occurring:

[EMAIL PROTECTED] ~]# adjtimex --utc --compare=10 --interval=17
                                      --- current ---   -- suggested --
cmos time     system-cmos  error_ppm   tick      freq    tick      freq
1199380633       0.540237
1199380649       0.642055     5989.3  10001       395
1199380665       0.743996     5996.5  10001       395    9941   4177948
1199380681       0.845918     5995.4  10001       395    9941   4250558
1199380697       0.947846     5995.8  10001       395    9941   4227580
1199380714       0.048774   -52886.6  10001       395   10530   3070784
1199380730       0.150698     5995.5  10001       395    9941   4243205
1199380746       0.252637     5996.4  10001       395    9941   4185301
1199380762       0.354556     5995.2  10001       395    9941   4261588
1199380778       0.456482     5995.6  10001       395    9941   4235852

From my investigations, the system time is _not_ advancing faster than UTC. In fact:

[EMAIL PROTECTED] ~]# ntpdate -q montpelier.ilan.caltech.edu
server 192.12.19.20, stratum 1, offset -0.001267, delay 0.05643
 3 Jan 09:24:28 ntpdate[18831]: adjust time server 192.12.19.20 offset -0.001267 sec

so my system is currently about 1 ms ahead of the Stratum 1 server, montpelier.ilan.caltech.edu. This offset is nearly constant over several minutes. Also, if I execute the command at several random times over the period of a minute, the offset only fluctuates by 1/2 a ms or so.

Conclusion: my system clock is not jumping around. That leaves the RTC doing the jumping. But having an RTC that runs nearly 1% slower than my system clock and jumps ahead by a second every 100 seconds or so seems absurd. In fact two results seem to prove that the RTC is running smoothly:

-- If I look at the seconds digit changing within the desktop BIOS (pre-boot), it does not change its relative phase w.r.t. the displayed system clock on my laptop screen as I hold the latter next to the BIOS clock. (A 10% second retard would be quite visible, as would the jump.)
-- The delta from line to line in the `adjtimex' output remains at ~0.1 no matter the --interval used. This should not be the case if there were a true slew between the system clock and RTC.

It seems that leaves two other possibilities: a bug in adjtimex or a bug in the kernel. That's where I am right now.

For reference, here are a few lines of output from the laptop (running Fedora 6, kernel 2.6.20, and being an i386 32-bit machine):

[EMAIL PROTECTED] ~]# adjtimex --utc --compare=20 --interval=17
--- current --- -- suggested --
cmos time system-cmos error_ppm tick