Re: [ntp:questions] Support for tickless systems
On Thu, Nov 20, 2014 at 07:27:47AM +, David Taylor wrote:
> On 19/11/2014 11:56, Miroslav Lichvar wrote:
> > Can you try 3.17 or later and see if it's fixed? Also, it would be
> > interesting to know if adding nohz=off to the kernel command line
> > instead of recompiling works as a workaround too.
>
> I found the right file (thanks, Rob, yes there are more options as you
> say) and tried setting nohz=off but it made no difference - jitter
> still reported as zero.

Interesting. When you tested the kernel compiled without CONFIG_NO_HZ,
where ntpd reported non-zero jitter, was that the only difference
compared to the original kernel which reported zero jitter?

> How would I tell whether the nohz=off was actually accepted or not,
> i.e. how to determine whether the kernel is tickless or not?

I'm not sure if there is any reliable way to tell that from user-space,
besides parsing the kernel command line.

> pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
>   3: 4351879 ARMCTRL BCM2708 Timer Tick
> pi@raspberrypi ~ $ sleep 10
> pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
>   3: 4353699 ARMCTRL BCM2708 Timer Tick
> pi@raspberrypi ~ $
>
> I don't know how to interpret the difference of 1820 in those two
> numbers. The first two commands were typed by hand, by the way, the
> third with an up-arrow recall.

That's between 100 and 250 Hz, so the kernel could be compiled with
CONFIG_HZ=100. Do you see that in the kernel config file? Does the
interrupt rate change significantly when you load the CPU, e.g. by
running cat /dev/urandom > /dev/null ?

--
Miroslav Lichvar
___
questions mailing list
questions@lists.ntp.org
http://lists.ntp.org/listinfo/questions
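Editorial note: a minimal sketch of the "parse the kernel command line" check mentioned above. The function name has_nohz_off is made up for illustration and is not part of any tool discussed in the thread. It only shows that nohz=off was passed at boot. It does not prove that the kernel honoured it, which is the limitation Miroslav points out.

```shell
# Hypothetical helper: succeed (exit status 0) if the kernel command
# line given as $1 contains the exact parameter "nohz=off".
# The command line is passed in as an argument so the function can be
# tried on sample strings as well as on the live system.
has_nohz_off() {
    # Pad with spaces so the parameter matches only as a whole word.
    case " $1 " in
        *" nohz=off "*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Usage on a running Linux system (assumes /proc is mounted):
#   if has_nohz_off "$(cat /proc/cmdline)"; then
#       echo "nohz=off was passed at boot"
#   else
#       echo "nohz=off not found on the command line"
#   fi
```

Matching on the whole word avoids false positives from longer parameters that merely contain the same text.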
Re: [ntp:questions] Support for tickless systems
On 20/11/2014 09:10, Miroslav Lichvar wrote:
> On Thu, Nov 20, 2014 at 07:27:47AM +, David Taylor wrote:
> > On 19/11/2014 11:56, Miroslav Lichvar wrote:
> > > Can you try 3.17 or later and see if it's fixed? Also, it would be
> > > interesting to know if adding nohz=off to the kernel command line
> > > instead of recompiling works as a workaround too.
> >
> > I found the right file (thanks, Rob, yes there are more options as
> > you say) and tried setting nohz=off but it made no difference -
> > jitter still reported as zero.
>
> Interesting. When you tested the kernel compiled without CONFIG_NO_HZ,
> where ntpd reported non-zero jitter, was that the only difference
> compared to the original kernel which reported zero jitter?
>
> > How would I tell whether the nohz=off was actually accepted or not,
> > i.e. how to determine whether the kernel is tickless or not?
>
> I'm not sure if there is any reliable way to tell that from
> user-space, beside parsing the kernel command line.
>
> > pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
> >   3: 4351879 ARMCTRL BCM2708 Timer Tick
> > pi@raspberrypi ~ $ sleep 10
> > pi@raspberrypi ~ $ cat /proc/interrupts | grep -i time
> >   3: 4353699 ARMCTRL BCM2708 Timer Tick
> > pi@raspberrypi ~ $
> >
> > I don't know how to interpret the difference of 1820 in those two
> > numbers. The first two commands were typed by hand, by the way, the
> > third with an up-arrow recall.
>
> That's between 100 and 250 Hz, so the kernel could be compiled with
> CONFIG_HZ=100. Do you see that in the kernel config file? Does the
> interrupt rate change significantly when you load the CPU, e.g. by
> running cat /dev/urandom > /dev/null ?

Miroslav,

I have not been able to compile the current kernel. With the previous
tests, removing the Tickless system option restored non-zero jitter
values.

http://bugs.ntp.org/show_bug.cgi?id=2314

I don't have a copy of the kernel config file.

I guess that /proc/interrupts is a count of interrupts? I didn't
realise that. Let me check.

Running the sleep 10 sequence from a command procedure gives a
difference of 1055, so I guess that's 105.5 interrupts per second. Does
sound like 100 Hz, yes.

Running the command while another terminal was running
cat /dev/urandom > /dev/null resulted in 1063 interrupts, so 106.3 Hz.

Does that mean I'm tickless or not?

--
Cheers,
David
Web: http://www.satsignal.eu
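Editorial note: the arithmetic above (counter difference divided by elapsed seconds) can be sketched as two small shell functions. The names timer_count and interrupt_rate are invented for this illustration. The sketch assumes a single-CPU /proc/interrupts layout with a "Timer Tick" line like the one David posted, and it assumes the interval really is the number of seconds given. David's hand-typed run would have covered more than 10 seconds, which would explain 1820 versus the scripted 1055.

```shell
# Hypothetical helper: read /proc/interrupts-style text on stdin and
# print the counter from the first line mentioning "Timer Tick".
# Assumes one CPU, i.e. a single counter column after the IRQ number.
timer_count() {
    awk '/Timer Tick/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop the leading "  3:" IRQ label
        print $1                   # first remaining field is the count
        exit
    }'
}

# Hypothetical helper: print the average rate in Hz, to one decimal,
# given the earlier count, the later count, and the seconds between.
interrupt_rate() {
    awk -v before="$1" -v after="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", (after - before) / secs }'
}

# Usage on a live system (assumes the same "Timer Tick" label):
#   before=$(timer_count < /proc/interrupts)
#   sleep 10
#   after=$(timer_count < /proc/interrupts)
#   interrupt_rate "$before" "$after" 10
```

Running the sampling from a script, as David did, keeps the interval close to the sleep duration. A rate near the configured HZ under both idle and load is what the thread goes on to discuss.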
Re: [ntp:questions] Support for tickless systems
On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
> Running the sleep 10 sequence from a command procedure gives a
> difference of 1055, so I guess that's 105.5 interrupts per second.
> Does sound like 100 Hz, yes.
>
> Running the command while another terminal was running
> cat /dev/urandom > /dev/null resulted in 1063 interrupts, so 106.3 Hz.
>
> Does that mean I'm tickless or not?

It seems it's not running in the tickless mode and the problem with
zero jitter is caused by something else.

Do you have PPS kernel discipline enabled in your ntpd config (flag3)
and which driver do you use? The PPS discipline is always disabled when
the Linux kernel is compiled with NO_HZ, so I think that could explain
what you are seeing. I'm not sure if that would be an ntpd bug or
kernel bug, but I can look into it.

--
Miroslav Lichvar
Re: [ntp:questions] Support for tickless systems
On Thu, Nov 20, 2014 at 12:02:06PM +0100, Miroslav Lichvar wrote:
> On Thu, Nov 20, 2014 at 10:16:13AM +, David Taylor wrote:
> > Running the sleep 10 sequence from a command procedure gives a
> > difference of 1055, so I guess that's 105.5 interrupts per second.
> > Does sound like 100 Hz, yes.
> >
> > Running the command while another terminal was running
> > cat /dev/urandom > /dev/null resulted in 1063 interrupts, so
> > 106.3 Hz.
> >
> > Does that mean I'm tickless or not?
>
> It seems it's not running in the tickless mode and the problem with
> zero jitter is caused by something else.
>
> Do you have PPS kernel discipline enabled in your ntpd config (flag3)
> and which driver do you use? The PPS discipline is always disabled
> when the Linux kernel is compiled with NO_HZ, so I think that could
> explain what you are seeing. I'm not sure if that would be an ntpd bug
> or kernel bug, but I can look into it.

After some debugging it seems the problem is that ntpd, when configured
to use the PPS kernel discipline, enables it even when the kernel
consumer binding failed with the ENOTSUPP error (as would happen with a
kernel compiled with NO_HZ). ntpd thinks PPS is running and is using
the PPS stats for the clock jitter.

This was broken somewhere between ntp-4.2.4 and ntp-4.2.6. I've
attached a patch to the ntp bug #2314.

--
Miroslav Lichvar
Re: [ntp:questions] Support for tickless systems
On 20/11/2014 11:02, Miroslav Lichvar wrote:
[...]
> It seems it's not running in the tickless mode and the problem with
> zero jitter is caused by something else.
>
> Do you have PPS kernel discipline enabled in your ntpd config (flag3)
> and which driver do you use? The PPS discipline is always disabled
> when the Linux kernel is compiled with NO_HZ, so I think that could
> explain what you are seeing. I'm not sure if that would be an ntpd bug
> or kernel bug, but I can look into it.

OK, well, this is a start if that's not tickless.

The relevant parts of the NTP configuration are:

server 127.127.22.0 minpoll 4 maxpoll 4
fudge 127.127.22.0 flag3 1 refid KPPS
server 127.127.28.0 minpoll 4 maxpoll 4
fudge 127.127.28.0 time1 0.138 refid GPSD

This is using the latest Raspberry Pi Linux kernel, which has PPS
support for the GPIO pins, and this is confirmed working with the
ppstest command:

pi@raspberrypi ~ $ sudo ppstest /dev/pps0
trying PPS source /dev/pps0
found PPS source /dev/pps0
ok, found 1 source(s), now start fetching data...
source 0 - assert 1416494463.02094, sequence: 67962 - clear 0.0, sequence: 0
source 0 - assert 1416494464.02311, sequence: 67963 - clear 0.0, sequence: 0
source 0 - assert 1416494465.07529, sequence: 67964 - clear 0.0, sequence: 0
source 0 - assert 1416494466.02747, sequence: 67965 - clear 0.0, sequence: 0
^Cpi@raspberrypi ~ $

Thanks for your comments on the bug report.

--
Cheers,
David
Web: http://www.satsignal.eu
Re: [ntp:questions] Support for tickless systems
On Thu, Nov 20, 2014 at 5:16 AM, David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
> I don't have a copy of the kernel config file.

Back in the day you could say:

# zcat /proc/config.gz |grep HZ
CONFIG_NO_HZ=y
CONFIG_HZ=100

Does that work on your system?

-- Paul
Re: [ntp:questions] Support for tickless systems
On 20/11/2014 15:25, Paul wrote:
> On Thu, Nov 20, 2014 at 5:16 AM, David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
> > I don't have a copy of the kernel config file.
>
> Back in the day you could say:
>
> # zcat /proc/config.gz |grep HZ
> CONFIG_NO_HZ=y
> CONFIG_HZ=100
>
> Does that work on your system?
>
> -- Paul

Yes, it does, Paul:

pi@raspi-2 ~ $ sudo zcat /proc/config.gz | grep HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
CONFIG_HZ_FIXED=0
CONFIG_HZ_100=y
# CONFIG_HZ_200 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_500 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100

But I am confused by the simultaneous appearance of a CONFIG_HZ value
(100) and the apparent NO_HZ=y!

--
Cheers,
David
Web: http://www.satsignal.eu
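Editorial note: a rough sketch of how the grep output above could be boiled down to a one-line summary. The function name summarise_tick_config is invented here. The mapping from options to a "mode" label is this note's own simplification, not something stated in the thread: CONFIG_HZ is read as the tick frequency used whenever the tick is running, and CONFIG_NO_HZ_IDLE as permission to stop the tick on an idle CPU. CONFIG_NO_HZ_FULL does not appear in David's output and is included only as an assumption about other kernels.

```shell
# Hypothetical helper: read kernel config lines on stdin and print the
# configured tick frequency plus a label for the tick mode.
# Commented-out lines such as "# CONFIG_HZ_PERIODIC is not set" contain
# no "=" and therefore never match an option name below.
summarise_tick_config() {
    awk -F= '
        $1 == "CONFIG_HZ"                       { hz = $2 }
        $1 == "CONFIG_HZ_PERIODIC" && $2 == "y" { mode = "periodic" }
        $1 == "CONFIG_NO_HZ_IDLE"  && $2 == "y" { mode = "tickless when idle" }
        $1 == "CONFIG_NO_HZ_FULL"  && $2 == "y" { mode = "full tickless" }
        END {
            printf "HZ=%s mode=%s\n",
                (hz == "" ? "unknown" : hz),
                (mode == "" ? "unknown" : mode)
        }
    '
}

# Usage on a system that exposes its config (assumes /proc/config.gz):
#   zcat /proc/config.gz | summarise_tick_config
```

Under that reading the two settings do not conflict: one gives the tick rate, the other says when the tick may be skipped.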
Re: [ntp:questions] Support for tickless systems
On Thu, Nov 20, 2014 at 11:46 AM, David Taylor david-tay...@blueyonder.co.uk.invalid wrote:
> But I am confused by the simultaneous appearance of a CONFIG_HZ value
> (100) and the apparent NO_HZ=y!

Because the HZ in NO_HZ leads you to the wrong conclusion. NO_HZ
means/meant don't deliver timer/preemption interrupts to idle or single
process cpus when there are multiple cpus. The boot cpu always uses
HZ_NNN, the other cpus (if there are any -- there aren't on my
quasi-i486 box) may or may not depending on process contention.
NO_HZ_IDLE is more suggestive.
Re: [ntp:questions] Support for tickless systems
On 20/11/2014 18:19, Paul wrote:
[...]
> Because the HZ in NO_HZ leads you to the wrong conclusion. NO_HZ
> means/meant don't deliver timer/preemption interrupts to idle or
> single process cpus when there are multiple cpus. The boot cpu always
> uses HZ_NNN, the other cpus (if there are any -- there aren't on my
> quasi-i486 box) may or may not depending on process contention.
> NO_HZ_IDLE is more suggestive.

OK, sort of. As the Raspberry Pi is a single-core CPU, does that mean
that it can never be tickless?

--
Cheers,
David
Web: http://www.satsignal.eu