Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 12, 2017 at 10:26:12AM -0700, John Stultz wrote: > On Fri, May 12, 2017 at 8:14 AM, Miroslav Lichvarwrote: > > I see this with real PHCs and PTP/NTP synchronization too. It's very > > confusing when the timekeeping changes so much for no apparent reason. > > If we can't remove the old vsyscalls yet, I was thinking maybe a new > > flag could be added to adjtimex to report the error, so applications > > can at least detect this problem and consider stepping the clock in > > order to reset the error? > > > > Thoughts? > > I'd rather not have short-term hacks that applications have to adapt. > So I think we should drop the old vsyscall method in the near term. > Sorry this sort of fell off my radar. Ok. Sounds good. > Do you have an updated set of patches you want to get ready to address > the issue? We can get those reviewed while we increase the pressure on > dropping the OLD_VSYSCALL implementations. I'll send an RFC series shortly. Thanks, -- Miroslav Lichvar
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 12, 2017 at 10:26:12AM -0700, John Stultz wrote: > On Fri, May 12, 2017 at 8:14 AM, Miroslav Lichvar wrote: > > I see this with real PHCs and PTP/NTP synchronization too. It's very > > confusing when the timekeeping changes so much for no apparent reason. > > If we can't remove the old vsyscalls yet, I was thinking maybe a new > > flag could be added to adjtimex to report the error, so applications > > can at least detect this problem and consider stepping the clock in > > order to reset the error? > > > > Thoughts? > > I'd rather not have short-term hacks that applications have to adapt. > So I think we should drop the old vsyscall method in the near term. > Sorry this sort of fell off my radar. Ok. Sounds good. > Do you have an updated set of patches you want to get ready to address > the issue? We can get those reviewed while we increase the pressure on > dropping the OLD_VSYSCALL implementations. I'll send an RFC series shortly. Thanks, -- Miroslav Lichvar
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 12, 2017 at 8:14 AM, Miroslav Lichvarwrote: > On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: >> On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: >> > I spent some time trying to figure out a workaround for the nanosecond >> > rounding, but I didn't find anything that wouldn't complicate the mult >> > adjustment logic and bring back the problems which the direct division >> > approach is supposed to solve. >> > >> > It seems it may be a while before the old vsyscalls are fixed. How >> > about including only the first two patches from this set for now? > >> So thanks for the ping here. If you're happy with the first two as an >> initial step, I'd be willing to try to push those in. The only trouble >> is there's a whole lot of timekeeping churn headed for 3.17 that Thomas >> has cooked up. While there isn't likely to be direct conflicts in the >> changes, I get nervous about mixing too many changes in subtle code at once. > > I'm sorry for digging up this skeleton. Are we any closer to being > able to remove CONFIG_GENERIC_TIME_VSYSCALL_OLD, which prevented > simplifying the steering logic of the internal clock? Yea. I think we've waited for a few years w/o action on this from the ppc and ia64 folks. Probably time to put a compile warning in making it clear its going to be removed in the next release or two to force the issue. > With the new PTP KVM clock the problem can be easily observed. Here is > a graph of the offset and frequency as measured by chronyd when > configured to synchronize the guest's clock to the host using the > virtual PHC. In the middle is the point when the NTP error reached > zero. The apparent frequency jumped by about 50 ppb and the offset > improved by an order of magnitude. > > https://mlichvar.fedorapeople.org/tmp/kvm_phc.png > > I see this with real PHCs and PTP/NTP synchronization too. It's very > confusing when the timekeeping changes so much for no apparent reason. > If we can't remove the old vsyscalls yet, I was thinking maybe a new > flag could be added to adjtimex to report the error, so applications > can at least detect this problem and consider stepping the clock in > order to reset the error? > > Thoughts? I'd rather not have short-term hacks that applications have to adapt. So I think we should drop the old vsyscall method in the near term. Sorry this sort of fell off my radar. Do you have an updated set of patches you want to get ready to address the issue? We can get those reviewed while we increase the pressure on dropping the OLD_VSYSCALL implementations. thanks -john
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 12, 2017 at 8:14 AM, Miroslav Lichvar wrote: > On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: >> On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: >> > I spent some time trying to figure out a workaround for the nanosecond >> > rounding, but I didn't find anything that wouldn't complicate the mult >> > adjustment logic and bring back the problems which the direct division >> > approach is supposed to solve. >> > >> > It seems it may be a while before the old vsyscalls are fixed. How >> > about including only the first two patches from this set for now? > >> So thanks for the ping here. If you're happy with the first two as an >> initial step, I'd be willing to try to push those in. The only trouble >> is there's a whole lot of timekeeping churn headed for 3.17 that Thomas >> has cooked up. While there isn't likely to be direct conflicts in the >> changes, I get nervous about mixing too many changes in subtle code at once. > > I'm sorry for digging up this skeleton. Are we any closer to being > able to remove CONFIG_GENERIC_TIME_VSYSCALL_OLD, which prevented > simplifying the steering logic of the internal clock? Yea. I think we've waited for a few years w/o action on this from the ppc and ia64 folks. Probably time to put a compile warning in making it clear its going to be removed in the next release or two to force the issue. > With the new PTP KVM clock the problem can be easily observed. Here is > a graph of the offset and frequency as measured by chronyd when > configured to synchronize the guest's clock to the host using the > virtual PHC. In the middle is the point when the NTP error reached > zero. The apparent frequency jumped by about 50 ppb and the offset > improved by an order of magnitude. > > https://mlichvar.fedorapeople.org/tmp/kvm_phc.png > > I see this with real PHCs and PTP/NTP synchronization too. It's very > confusing when the timekeeping changes so much for no apparent reason. > If we can't remove the old vsyscalls yet, I was thinking maybe a new > flag could be added to adjtimex to report the error, so applications > can at least detect this problem and consider stepping the clock in > order to reset the error? > > Thoughts? I'd rather not have short-term hacks that applications have to adapt. So I think we should drop the old vsyscall method in the near term. Sorry this sort of fell off my radar. Do you have an updated set of patches you want to get ready to address the issue? We can get those reviewed while we increase the pressure on dropping the OLD_VSYSCALL implementations. thanks -john
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: > On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: > > I spent some time trying to figure out a workaround for the nanosecond > > rounding, but I didn't find anything that wouldn't complicate the mult > > adjustment logic and bring back the problems which the direct division > > approach is supposed to solve. > > > > It seems it may be a while before the old vsyscalls are fixed. How > > about including only the first two patches from this set for now? > So thanks for the ping here. If you're happy with the first two as an > initial step, I'd be willing to try to push those in. The only trouble > is there's a whole lot of timekeeping churn headed for 3.17 that Thomas > has cooked up. While there isn't likely to be direct conflicts in the > changes, I get nervous about mixing too many changes in subtle code at once. I'm sorry for digging up this skeleton. Are we any closer to being able to remove CONFIG_GENERIC_TIME_VSYSCALL_OLD, which prevented simplifying the steering logic of the internal clock? The two patches added in 3.17 helped, but there is still the problem that it takes too long for the kernel clock to converge to the internal NTP time when a large error was accumulated, e.g. a large frequency adjustment was made to correct the initial offset of the clock. It seems the error can reach milliseconds and take hours or even days to be corrected. As the rate at which the error is decreasing is not constant (due to random clock updates), it doesn't seem to be possible to synchronize the clock with better stability than few tens or hundreds of nanoseconds. With the new PTP KVM clock the problem can be easily observed. Here is a graph of the offset and frequency as measured by chronyd when configured to synchronize the guest's clock to the host using the virtual PHC. In the middle is the point when the NTP error reached zero. The apparent frequency jumped by about 50 ppb and the offset improved by an order of magnitude. https://mlichvar.fedorapeople.org/tmp/kvm_phc.png I see this with real PHCs and PTP/NTP synchronization too. It's very confusing when the timekeeping changes so much for no apparent reason. If we can't remove the old vsyscalls yet, I was thinking maybe a new flag could be added to adjtimex to report the error, so applications can at least detect this problem and consider stepping the clock in order to reset the error? Thoughts? -- Miroslav Lichvar
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: > On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: > > I spent some time trying to figure out a workaround for the nanosecond > > rounding, but I didn't find anything that wouldn't complicate the mult > > adjustment logic and bring back the problems which the direct division > > approach is supposed to solve. > > > > It seems it may be a while before the old vsyscalls are fixed. How > > about including only the first two patches from this set for now? > So thanks for the ping here. If you're happy with the first two as an > initial step, I'd be willing to try to push those in. The only trouble > is there's a whole lot of timekeeping churn headed for 3.17 that Thomas > has cooked up. While there isn't likely to be direct conflicts in the > changes, I get nervous about mixing too many changes in subtle code at once. I'm sorry for digging up this skeleton. Are we any closer to being able to remove CONFIG_GENERIC_TIME_VSYSCALL_OLD, which prevented simplifying the steering logic of the internal clock? The two patches added in 3.17 helped, but there is still the problem that it takes too long for the kernel clock to converge to the internal NTP time when a large error was accumulated, e.g. a large frequency adjustment was made to correct the initial offset of the clock. It seems the error can reach milliseconds and take hours or even days to be corrected. As the rate at which the error is decreasing is not constant (due to random clock updates), it doesn't seem to be possible to synchronize the clock with better stability than few tens or hundreds of nanoseconds. With the new PTP KVM clock the problem can be easily observed. Here is a graph of the offset and frequency as measured by chronyd when configured to synchronize the guest's clock to the host using the virtual PHC. In the middle is the point when the NTP error reached zero. The apparent frequency jumped by about 50 ppb and the offset improved by an order of magnitude. https://mlichvar.fedorapeople.org/tmp/kvm_phc.png I see this with real PHCs and PTP/NTP synchronization too. It's very confusing when the timekeeping changes so much for no apparent reason. If we can't remove the old vsyscalls yet, I was thinking maybe a new flag could be added to adjtimex to report the error, so applications can at least detect this problem and consider stepping the clock in order to reset the error? Thoughts? -- Miroslav Lichvar
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: > On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: > > It seems it may be a while before the old vsyscalls are fixed. How > > about including only the first two patches from this set for now? > > Hey! Sorry for the slow response here, I was on the road the last two weeks. > > So thanks for the ping here. If you're happy with the first two as an > initial step, I'd be willing to try to push those in. Ok, thanks. > Thomas: Would you be ok if I push you some patches to improve ntp/ptp > freq steering w/ NOHZ for this merge window as well? So far the patches > have only shown nice improvements and no regressions, but its clearly > subtle code, and there's a lot going on in this merge window. For the record, here are test results from the simulator. There is a regression in the performance with disabled nohz, it takes longer for the clock to converge, but I think it's acceptable. When compared on the same time scale, disabled nohz still seems to be significantly better than nohz enabled. 3.16-rc5: $ ./test1.sh freq10 freq100 dev max nohz on 38.383682.72579 1468940.9 9846788.2 nohz off3.89181 0.10436 0.2 0.6 3.16-rc5 + two patches from this set: $ ./test1.sh freq10 freq100 dev max nohz on 1.13791 0.05155 85.0313.7 nohz off1.02364 0.05149 0.5 1.7 Here are results from a new script I added to linux-tktest, which I think shows better how the clock is converging. 3.16-rc5: $ ./test2.sh nohz on [1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 32.08360 16039.4 23435.5 39.61782 11077.4 30938.0 30 13.84304 35033.8 80321.5 8.52682 21495.6 52639.6 100 0.68945 31395.7 86405.6 1.26818 26426.9 98448.1 300 0.18118 33906.1 104062.0 0.11419 21147.4 102418.1 1000 0.02699 32254.1 120619.4 0.02234 21222.3 120107.6 3000 0.00515 31210.4 131688.9 0.00525 23067.8 140999.4 10.00077 31904.1 149537.0 0.00079 21688.2 144957.5 30.00015 31422.9 158176.5 0.00015 21867.9 159288.5 10 0.3 31301.3 170099.5 0.3 22265.5 168985.6 nohz off[1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 6.64956 14.0 18.8 7.10428 2.4 5.1 30 2.87697 13.1 33.0 0.02541 0.1 0.4 100 0.38990 9.9 35.1 0.00502 0.2 0.5 300 0.04733 6.5 42.9 0.00096 0.2 0.5 1000 0.00437 3.7 46.2 0.00023 0.2 0.6 3000 0.00049 2.2 47.1 0.3 0.2 0.6 10.4 1.2 47.5 0.0 0.2 0.6 30.1 0.8 47.6 0.0 0.2 0.6 10 0.0 0.5 47.6 0.0 0.2 0.6 3.16-rc5 + two patches from this set: $ ./test2.sh nohz on [1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 1.472221341.32217.8 0.06322 0.2 0.5 30 0.20799 849.52448.7 0.06311 0.2 0.6 100 0.04101 492.12895.2 0.06311 0.2 0.5 300 0.05660 295.53026.1 0.02064 28.3 108.9 1000 0.01994 409.82732.1 0.00355 13.7 52.2 3000 0.00477 469.13238.9 0.00070 11.0 40.9 10.00081 377.33791.6 0.00013 9.4 36.2 30.00016 259.94055.7 0.4 8.9 34.1 10 0.3 159.04177.2 0.0 13.7 58.4 nohz off[1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 3.55062 6.2 10.8 0.05730 0.0 0.0 30 0.44672 4.5 14.1 0.05724 0.2 0.5 100 0.03649 2.7 17.4 0.05711 0.2 0.5 300 0.05815 1.7 18.7 0.06313 0.2 0.5 1000 0.06270 1.0 19.1 0.06315 0.2 0.5 3000 0.05720 1.9 19.9 0.02065 1.1 4.1 10.01947 13.5 41.0 0.00339 0.5 1.7 30.00448 17.5 75.9 0.00065 0.3 1.0 10 0.00078 14.2 101.7 0.00012 0.2 0.7 -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, 15 Jul 2014, John Stultz wrote: > On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: > > On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: > >> Another area we have to be careful with is there are still > >> architectures (powerpc and ia64) which haven't switched from the old > >> vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these > >> cases we add up to 1ns of error each tick/update as we round up the > >> sub-nanoseconds to ensure we don't see inconsistencies. If the > >> adjustment logic can't handle this, I don't want to regress those > >> arches. > > I spent some time trying to figure out a workaround for the nanosecond > > rounding, but I didn't find anything that wouldn't complicate the mult > > adjustment logic and bring back the problems which the direct division > > approach is supposed to solve. > > > > It seems it may be a while before the old vsyscalls are fixed. How > > about including only the first two patches from this set for now? > > Hey! Sorry for the slow response here, I was on the road the last two weeks. > > So thanks for the ping here. If you're happy with the first two as an > initial step, I'd be willing to try to push those in. The only trouble > is there's a whole lot of timekeeping churn headed for 3.17 that Thomas > has cooked up. While there isn't likely to be direct conflicts in the > changes, I get nervous about mixing too many changes in subtle code at once. > > But maybe I'm being overly cautious? > > Thomas: Would you be ok if I push you some patches to improve ntp/ptp > freq steering w/ NOHZ for this merge window as well? So far the patches > have only shown nice improvements and no regressions, but its clearly > subtle code, and there's a lot going on in this merge window. I think it's ok. The steering is orthogonal to the other stuff cooking. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, 15 Jul 2014, John Stultz wrote: On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. I spent some time trying to figure out a workaround for the nanosecond rounding, but I didn't find anything that wouldn't complicate the mult adjustment logic and bring back the problems which the direct division approach is supposed to solve. It seems it may be a while before the old vsyscalls are fixed. How about including only the first two patches from this set for now? Hey! Sorry for the slow response here, I was on the road the last two weeks. So thanks for the ping here. If you're happy with the first two as an initial step, I'd be willing to try to push those in. The only trouble is there's a whole lot of timekeeping churn headed for 3.17 that Thomas has cooked up. While there isn't likely to be direct conflicts in the changes, I get nervous about mixing too many changes in subtle code at once. But maybe I'm being overly cautious? Thomas: Would you be ok if I push you some patches to improve ntp/ptp freq steering w/ NOHZ for this merge window as well? So far the patches have only shown nice improvements and no regressions, but its clearly subtle code, and there's a lot going on in this merge window. I think it's ok. The steering is orthogonal to the other stuff cooking. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Tue, Jul 15, 2014 at 09:02:38PM -0700, John Stultz wrote: On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: It seems it may be a while before the old vsyscalls are fixed. How about including only the first two patches from this set for now? Hey! Sorry for the slow response here, I was on the road the last two weeks. So thanks for the ping here. If you're happy with the first two as an initial step, I'd be willing to try to push those in. Ok, thanks. Thomas: Would you be ok if I push you some patches to improve ntp/ptp freq steering w/ NOHZ for this merge window as well? So far the patches have only shown nice improvements and no regressions, but its clearly subtle code, and there's a lot going on in this merge window. For the record, here are test results from the simulator. There is a regression in the performance with disabled nohz, it takes longer for the clock to converge, but I think it's acceptable. When compared on the same time scale, disabled nohz still seems to be significantly better than nohz enabled. 3.16-rc5: $ ./test1.sh freq10 freq100 dev max nohz on 38.383682.72579 1468940.9 9846788.2 nohz off3.89181 0.10436 0.2 0.6 3.16-rc5 + two patches from this set: $ ./test1.sh freq10 freq100 dev max nohz on 1.13791 0.05155 85.0313.7 nohz off1.02364 0.05149 0.5 1.7 Here are results from a new script I added to linux-tktest, which I think shows better how the clock is converging. 3.16-rc5: $ ./test2.sh nohz on [1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 32.08360 16039.4 23435.5 39.61782 11077.4 30938.0 30 13.84304 35033.8 80321.5 8.52682 21495.6 52639.6 100 0.68945 31395.7 86405.6 1.26818 26426.9 98448.1 300 0.18118 33906.1 104062.0 0.11419 21147.4 102418.1 1000 0.02699 32254.1 120619.4 0.02234 21222.3 120107.6 3000 0.00515 31210.4 131688.9 0.00525 23067.8 140999.4 10.00077 31904.1 149537.0 0.00079 21688.2 144957.5 30.00015 31422.9 158176.5 0.00015 21867.9 159288.5 10 0.3 31301.3 170099.5 0.3 22265.5 168985.6 nohz off[1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 6.64956 14.0 18.8 7.10428 2.4 5.1 30 2.87697 13.1 33.0 0.02541 0.1 0.4 100 0.38990 9.9 35.1 0.00502 0.2 0.5 300 0.04733 6.5 42.9 0.00096 0.2 0.5 1000 0.00437 3.7 46.2 0.00023 0.2 0.6 3000 0.00049 2.2 47.1 0.3 0.2 0.6 10.4 1.2 47.5 0.0 0.2 0.6 30.1 0.8 47.6 0.0 0.2 0.6 10 0.0 0.5 47.6 0.0 0.2 0.6 3.16-rc5 + two patches from this set: $ ./test2.sh nohz on [1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 1.472221341.32217.8 0.06322 0.2 0.5 30 0.20799 849.52448.7 0.06311 0.2 0.6 100 0.04101 492.12895.2 0.06311 0.2 0.5 300 0.05660 295.53026.1 0.02064 28.3 108.9 1000 0.01994 409.82732.1 0.00355 13.7 52.2 3000 0.00477 469.13238.9 0.00070 11.0 40.9 10.00081 377.33791.6 0.00013 9.4 36.2 30.00016 259.94055.7 0.4 8.9 34.1 10 0.3 159.04177.2 0.0 13.7 58.4 nohz off[1, samples/2] [samples/2 + 1, samples] samples freq dev max freq dev max 10 3.55062 6.2 10.8 0.05730 0.0 0.0 30 0.44672 4.5 14.1 0.05724 0.2 0.5 100 0.03649 2.7 17.4 0.05711 0.2 0.5 300 0.05815 1.7 18.7 0.06313 0.2 0.5 1000 0.06270 1.0 19.1 0.06315 0.2 0.5 3000 0.05720 1.9 19.9 0.02065 1.1 4.1 10.01947 13.5 41.0 0.00339 0.5 1.7 30.00448 17.5 75.9 0.00065 0.3 1.0 10 0.00078 14.2 101.7 0.00012 0.2 0.7 -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: > On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: >> Another area we have to be careful with is there are still >> architectures (powerpc and ia64) which haven't switched from the old >> vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these >> cases we add up to 1ns of error each tick/update as we round up the >> sub-nanoseconds to ensure we don't see inconsistencies. If the >> adjustment logic can't handle this, I don't want to regress those >> arches. > I spent some time trying to figure out a workaround for the nanosecond > rounding, but I didn't find anything that wouldn't complicate the mult > adjustment logic and bring back the problems which the direct division > approach is supposed to solve. > > It seems it may be a while before the old vsyscalls are fixed. How > about including only the first two patches from this set for now? Hey! Sorry for the slow response here, I was on the road the last two weeks. So thanks for the ping here. If you're happy with the first two as an initial step, I'd be willing to try to push those in. The only trouble is there's a whole lot of timekeeping churn headed for 3.17 that Thomas has cooked up. While there isn't likely to be direct conflicts in the changes, I get nervous about mixing too many changes in subtle code at once. But maybe I'm being overly cautious? Thomas: Would you be ok if I push you some patches to improve ntp/ptp freq steering w/ NOHZ for this merge window as well? So far the patches have only shown nice improvements and no regressions, but its clearly subtle code, and there's a lot going on in this merge window. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On 07/08/2014 04:08 AM, Miroslav Lichvar wrote: On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. I spent some time trying to figure out a workaround for the nanosecond rounding, but I didn't find anything that wouldn't complicate the mult adjustment logic and bring back the problems which the direct division approach is supposed to solve. It seems it may be a while before the old vsyscalls are fixed. How about including only the first two patches from this set for now? Hey! Sorry for the slow response here, I was on the road the last two weeks. So thanks for the ping here. If you're happy with the first two as an initial step, I'd be willing to try to push those in. The only trouble is there's a whole lot of timekeeping churn headed for 3.17 that Thomas has cooked up. While there isn't likely to be direct conflicts in the changes, I get nervous about mixing too many changes in subtle code at once. But maybe I'm being overly cautious? Thomas: Would you be ok if I push you some patches to improve ntp/ptp freq steering w/ NOHZ for this merge window as well? So far the patches have only shown nice improvements and no regressions, but its clearly subtle code, and there's a lot going on in this merge window. thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: > Another area we have to be careful with is there are still > architectures (powerpc and ia64) which haven't switched from the old > vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these > cases we add up to 1ns of error each tick/update as we round up the > sub-nanoseconds to ensure we don't see inconsistencies. If the > adjustment logic can't handle this, I don't want to regress those > arches. I spent some time trying to figure out a workaround for the nanosecond rounding, but I didn't find anything that wouldn't complicate the mult adjustment logic and bring back the problems which the direct division approach is supposed to solve. It seems it may be a while before the old vsyscalls are fixed. How about including only the first two patches from this set for now? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. I spent some time trying to figure out a workaround for the nanosecond rounding, but I didn't find anything that wouldn't complicate the mult adjustment logic and bring back the problems which the direct division approach is supposed to solve. It seems it may be a while before the old vsyscalls are fixed. How about including only the first two patches from this set for now? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: > On Mon, May 19, 2014 at 3:14 AM, Miroslav Lichvar wrote: > > Ok, so it seems to be almost identical to my patch now. The only two > > differences seem to be the removal of the ntp_error correction to > > change the effective clock frequency at the current time instead of > > aligning it to the start of the tick and the flag to handle the xtime > > underflow. > So as for the ntp_error correction when we adjust the multiplier, I > believe that is still needed, since when we adjust the freq we move > the xtime_nsec value, which means we need to adjust the error from > that xtime_nsec value to the ideal value at that point. But if you > can provide a better rational as to why it can be safely dropped, I'd > be happy to listen. Ok, I tried to make a drawing to explain it better http://mlichvar.fedorapeople.org/tmp/mult_update.png My understanding of the current code is this. The update of mult value doesn't happen exactly on tick boundary, so xtime needs to be corrected to not break the continuity of the observed time (i.e. time with old mult/xtime is equal to the time with new mult/xtime at the time of the update), that's the start of the blue line in the drawing. To make it appear as if the frequency was adjusted on tick boundary (green line), the xtime correction is added to ntp_error and the time is slowly corrected in the following updates (red line). I think that ntp_error correction is not necessary and we can stick to the blue line. The frequency of the clock was changed at the current time instead of the start of the current tick. > So one part of my concern is I'm not as comfortable bending over > backwards all the time to avoid ntp_error. Sure, if we can avoid it in > the common case, that makes things better, but having occasional > bursts for good reason (like the underflow issue) I think is important > so we can ensure that the logic to converge does actually function > with larger errors. It will converge but very slowly, I'm not sure if that's acceptable even if it happens rarely. If it was able to correct larger errors quickly, we would have the other problems we were trying to fix back again. > If we constrain all the cases where error can be > added, I worry it will make the correction logic fragile in the end. > But I'm not totally settled on this. The correction logic is simple and robust, but slow. It can be used only if all these sources of the error are removed. > Another area we have to be careful with is there are still > architectures (powerpc and ia64) which haven't switched from the old > vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these > cases we add up to 1ns of error each tick/update as we round up the > sub-nanoseconds to ensure we don't see inconsistencies. If the > adjustment logic can't handle this, I don't want to regress those > arches. Yes, this would probably need to be fixed too. If converting the remaining vsyscalls is difficult, maybe a short-term fix could be adding a new register for the rounding error and use it to select which direction should be xtime rounded instead of always rounding up? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 10:57:29AM -0700, John Stultz wrote: On Mon, May 19, 2014 at 3:14 AM, Miroslav Lichvar mlich...@redhat.com wrote: Ok, so it seems to be almost identical to my patch now. The only two differences seem to be the removal of the ntp_error correction to change the effective clock frequency at the current time instead of aligning it to the start of the tick and the flag to handle the xtime underflow. So as for the ntp_error correction when we adjust the multiplier, I believe that is still needed, since when we adjust the freq we move the xtime_nsec value, which means we need to adjust the error from that xtime_nsec value to the ideal value at that point. But if you can provide a better rational as to why it can be safely dropped, I'd be happy to listen. Ok, I tried to make a drawing to explain it better http://mlichvar.fedorapeople.org/tmp/mult_update.png My understanding of the current code is this. The update of mult value doesn't happen exactly on tick boundary, so xtime needs to be corrected to not break the continuity of the observed time (i.e. time with old mult/xtime is equal to the time with new mult/xtime at the time of the update), that's the start of the blue line in the drawing. To make it appear as if the frequency was adjusted on tick boundary (green line), the xtime correction is added to ntp_error and the time is slowly corrected in the following updates (red line). I think that ntp_error correction is not necessary and we can stick to the blue line. The frequency of the clock was changed at the current time instead of the start of the current tick. So one part of my concern is I'm not as comfortable bending over backwards all the time to avoid ntp_error. Sure, if we can avoid it in the common case, that makes things better, but having occasional bursts for good reason (like the underflow issue) I think is important so we can ensure that the logic to converge does actually function with larger errors. It will converge but very slowly, I'm not sure if that's acceptable even if it happens rarely. If it was able to correct larger errors quickly, we would have the other problems we were trying to fix back again. If we constrain all the cases where error can be added, I worry it will make the correction logic fragile in the end. But I'm not totally settled on this. The correction logic is simple and robust, but slow. It can be used only if all these sources of the error are removed. Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. Yes, this would probably need to be fixed too. If converting the remaining vsyscalls is difficult, maybe a short-term fix could be adding a new register for the rounding error and use it to select which direction should be xtime rounded instead of always rounding up? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 3:14 AM, Miroslav Lichvar wrote: > On Fri, May 16, 2014 at 05:56:41PM -0700, John Stultz wrote: >> This version of the patch set corrects a few issues Miroslav pointed >> out, as well as adapts his approach almost completely for the last >> patch. This pulls the results in to be very close to his original >> patch. > > Ok, so it seems to be almost identical to my patch now. The only two > differences seem to be the removal of the ntp_error correction to > change the effective clock frequency at the current time instead of > aligning it to the start of the tick and the flag to handle the xtime > underflow. > > Can we get them in too? So as for the ntp_error correction when we adjust the multiplier, I believe that is still needed, since when we adjust the freq we move the xtime_nsec value, which means we need to adjust the error from that xtime_nsec value to the ideal value at that point. But if you can provide a better rational as to why it can be safely dropped, I'd be happy to listen. As for the xtime underflow handling, I'm still hesitant here since its just a bit ugly. :) So I'm trying to think of a cleaner way. > I think both are necessary to avoid having large steps in ntp_error > which take long time to correct. You can see this in the nohz off > freq100 result from the simulator, for example. So one part of my concern is I'm not as comfortable bending over backwards all the time to avoid ntp_error. Sure, if we can avoid it in the common case, that makes things better, but having occasional bursts for good reason (like the underflow issue) I think is important so we can ensure that the logic to converge does actually function with larger errors. If we constrain all the cases where error can be added, I worry it will make the correction logic fragile in the end. But I'm not totally settled on this. Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. >> I'm not 100% sure we need the last patch in this series, as >> it has additional computational cost and testing on real >> hardware has shown NOHZ=y performance matching NOHZ=n with a >> earlier version of just the first patch: >> https://lkml.org/lkml/2014/1/13/501 >> (though admittedly, the patch has changed since Richard's testing, >> so the results are a bit stale). > > If I could get a sufficiently low update rate on real HW or a more > accurate reference clock, I think I would be able to show you a > difference. But even if we can't at this time, the removal of the > complex feedback loop, which as you saw is difficult to keep working > well in all cases (and we have tested only a very small number of > them), is probably worth the extra computational cost. I do like the straightforward aspect of calculating the freq adjustment, but I still am worried about particularly 32bit systems where the extra 64bit divide and multiplies may add quite a bit of overhead in the irq path (even if it is only done ~once a second). But I'm going to try to get some measurement numbers to validate that. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 16, 2014 at 05:56:41PM -0700, John Stultz wrote: > This version of the patch set corrects a few issues Miroslav pointed > out, as well as adapts his approach almost completely for the last > patch. This pulls the results in to be very close to his original > patch. Ok, so it seems to be almost identical to my patch now. The only two differences seem to be the removal of the ntp_error correction to change the effective clock frequency at the current time instead of aligning it to the start of the tick and the flag to handle the xtime underflow. Can we get them in too? I think both are necessary to avoid having large steps in ntp_error which take long time to correct. You can see this in the nohz off freq100 result from the simulator, for example. > I'm not 100% sure we need the last patch in this series, as > it has additional computational cost and testing on real > hardware has shown NOHZ=y performance matching NOHZ=n with a > earlier version of just the first patch: > https://lkml.org/lkml/2014/1/13/501 > (though admittedly, the patch has changed since Richard's testing, > so the results are a bit stale). If I could get a sufficiently low update rate on real HW or a more accurate reference clock, I think I would be able to show you a difference. But even if we can't at this time, the removal of the complex feedback loop, which as you saw is difficult to keep working well in all cases (and we have tested only a very small number of them), is probably worth the extra computational cost. Thanks, > Below are some of the simulator results comparing this patchset > to vanilla and Miroslav's original patch. > Miroslav's original patch: > -- > > $ ./test1.sh > freq10 freq100 dev max > nohz on 0.00601 0.00028 74.0279.4 > nohz off 0.05867 0.00204 0.2 0.6 > This patchset: > -- > > $ ./test1.sh > freq10 freq100 dev max > nohz on 0.00582 0.00033 74.1279.9 > nohz off 0.06275 0.06440 0.4 1.4 -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Fri, May 16, 2014 at 05:56:41PM -0700, John Stultz wrote: This version of the patch set corrects a few issues Miroslav pointed out, as well as adapts his approach almost completely for the last patch. This pulls the results in to be very close to his original patch. Ok, so it seems to be almost identical to my patch now. The only two differences seem to be the removal of the ntp_error correction to change the effective clock frequency at the current time instead of aligning it to the start of the tick and the flag to handle the xtime underflow. Can we get them in too? I think both are necessary to avoid having large steps in ntp_error which take long time to correct. You can see this in the nohz off freq100 result from the simulator, for example. I'm not 100% sure we need the last patch in this series, as it has additional computational cost and testing on real hardware has shown NOHZ=y performance matching NOHZ=n with a earlier version of just the first patch: https://lkml.org/lkml/2014/1/13/501 (though admittedly, the patch has changed since Richard's testing, so the results are a bit stale). If I could get a sufficiently low update rate on real HW or a more accurate reference clock, I think I would be able to show you a difference. But even if we can't at this time, the removal of the complex feedback loop, which as you saw is difficult to keep working well in all cases (and we have tested only a very small number of them), is probably worth the extra computational cost. Thanks, Below are some of the simulator results comparing this patchset to vanilla and Miroslav's original patch. Miroslav's original patch: -- $ ./test1.sh freq10 freq100 dev max nohz on 0.00601 0.00028 74.0279.4 nohz off 0.05867 0.00204 0.2 0.6 This patchset: -- $ ./test1.sh freq10 freq100 dev max nohz on 0.00582 0.00033 74.1279.9 nohz off 0.06275 0.06440 0.4 1.4 -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering (v2)
On Mon, May 19, 2014 at 3:14 AM, Miroslav Lichvar mlich...@redhat.com wrote: On Fri, May 16, 2014 at 05:56:41PM -0700, John Stultz wrote: This version of the patch set corrects a few issues Miroslav pointed out, as well as adapts his approach almost completely for the last patch. This pulls the results in to be very close to his original patch. Ok, so it seems to be almost identical to my patch now. The only two differences seem to be the removal of the ntp_error correction to change the effective clock frequency at the current time instead of aligning it to the start of the tick and the flag to handle the xtime underflow. Can we get them in too? So as for the ntp_error correction when we adjust the multiplier, I believe that is still needed, since when we adjust the freq we move the xtime_nsec value, which means we need to adjust the error from that xtime_nsec value to the ideal value at that point. But if you can provide a better rational as to why it can be safely dropped, I'd be happy to listen. As for the xtime underflow handling, I'm still hesitant here since its just a bit ugly. :) So I'm trying to think of a cleaner way. I think both are necessary to avoid having large steps in ntp_error which take long time to correct. You can see this in the nohz off freq100 result from the simulator, for example. So one part of my concern is I'm not as comfortable bending over backwards all the time to avoid ntp_error. Sure, if we can avoid it in the common case, that makes things better, but having occasional bursts for good reason (like the underflow issue) I think is important so we can ensure that the logic to converge does actually function with larger errors. If we constrain all the cases where error can be added, I worry it will make the correction logic fragile in the end. But I'm not totally settled on this. Another area we have to be careful with is there are still architectures (powerpc and ia64) which haven't switched from the old vsyscall rounding logic (CONFIG_GENERIC_TIME_VSYSCALL_OLD). In these cases we add up to 1ns of error each tick/update as we round up the sub-nanoseconds to ensure we don't see inconsistencies. If the adjustment logic can't handle this, I don't want to regress those arches. I'm not 100% sure we need the last patch in this series, as it has additional computational cost and testing on real hardware has shown NOHZ=y performance matching NOHZ=n with a earlier version of just the first patch: https://lkml.org/lkml/2014/1/13/501 (though admittedly, the patch has changed since Richard's testing, so the results are a bit stale). If I could get a sufficiently low update rate on real HW or a more accurate reference clock, I think I would be able to show you a difference. But even if we can't at this time, the removal of the complex feedback loop, which as you saw is difficult to keep working well in all cases (and we have tested only a very small number of them), is probably worth the extra computational cost. I do like the straightforward aspect of calculating the freq adjustment, but I still am worried about particularly 32bit systems where the extra 64bit divide and multiplies may add quite a bit of overhead in the irq path (even if it is only done ~once a second). But I'm going to try to get some measurement numbers to validate that. thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: > On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: >> Continuing the sporadic work on improving the timekeeping >> frequency steering logic when NOHZ is enabled, I've made a number >> of changes to my re-implementation of Miroslav's patch (most >> recently posted here: https://lkml.org/lkml/2014/2/12/401 ), >> and I'm getting much closer results in the simulator. > Thanks, in my initial testing it seems to be working well. The results > from the simulator are much better than with the previous patch. > >> Compared with Miroslav's patch, this avoids doing any extra >> divisions, and instead approximates the correction >> logarithmically. > Hm, doesn't that basically make the code a software implementation of > division? It seems it needs about 4-8 iterations to get to the final > result. > > I didn't measure it, but I think with this change it now may be close > or possibly even slower than my patch. The extra division and > multiplication in my patch are used only when the tick length changes > (normally once per second), otherwise the update is very cheap. Fair enough. I've moved to your division method for the last patch in my series (instead of looping). I'm still not 100% convinced we need it, since I'm still worried the error seen in the simulator with the default options is somewhat contrived. For example, since it regularly skips 4k ticks, that exaggerates the accumulated error in the approximation case. But we can discuss this further w/ the new patch series. thanks! -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/30/2014 07:01 AM, Miroslav Lichvar wrote: > On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote: >> On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: >>> It seems it still doesn't always switch mult only between the two >>> closest values, which explains the slightly worse dev and max values. >> Huh. I don't think I saw that in my testing. I'll look into it again. > I can see it with tk_test -o 10, for instance. It's switching > between 8389446, 8389447 and 8389448. Ok, I think I sorted this part out. Thanks for the heads up here! > >> I suspect the extra error comes from the occasional underflow handling >> (which you avoid w/ the second_overflow_skip stuff which would help but >> feels a little clunky to me - but I'm still thinking it over). > It seems to be something else as I can see it even when I remove > "advance_ticks(3, 4, 1);" from tk_test.c so clock updates are aligned > exactly with ticks and no underflow can happen (i.e. offset in > timekeeping_apply_adjustment() is zero). > > I agree the skip_second_overflow flag in my patch is ugly, but it's > necesssary as the code would otherwise take too long to correct the > underflowed part in ntp error. > > Anyway, I did more testing and I think I found a more serious problem. > It seems the loop doesn't handle well tick lengths which happen to be > close to the middle between multipliers. For example: > > $ ./tk_test -n 1 -o 100077 > samples: 1-1 reg: 1-1 slope: 1.00 dev: 1241.7 max: 3532.3 freq: > 100.07717 > > When I add the following line to the kernel code to see the value of > mult and ntp_error after clock update: > > +++ b/kernel/time/timekeeping.c > @@ -1386,6 +1386,7 @@ void update_wall_time(void) > /* correct the clock when NTP error is too big */ > timekeeping_adjust(tk, offset); > > + printk("%d %lld\n", tk->mult, tk->ntp_error >> (tk->ntp_error_shift + > tk->shift)); > > I get this: > > 8389447 -101 > 8389449 6 > 8389447 -321 > 8389448 -198 > 8389447 -249 > ... > 8389447 -6344 > 8389448 -6158 > 8389447 -6223 > 8389448 -6211 > 8389447 -6265 > 8389448 -6029 > > It looks like the correction is not able to handle the random > cumulation of differences in the lengths between odd and even update > intervals. The overall frequency is accurate, but ntp error is in > microseconds here. Yea, in the freqadjust logic, we chose to do nothing if it was inbetween 0 to (interval/2). The problem being that interval/2 is too small target, and if we get too close a single unit adjustment may bounce us on either side of that range. Changed the comparision to being the 0-interval (inclusive) which should assure the approximation will land in that range. New patchset to follow shortly! thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/30/2014 07:01 AM, Miroslav Lichvar wrote: On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote: On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: It seems it still doesn't always switch mult only between the two closest values, which explains the slightly worse dev and max values. Huh. I don't think I saw that in my testing. I'll look into it again. I can see it with tk_test -o 10, for instance. It's switching between 8389446, 8389447 and 8389448. Ok, I think I sorted this part out. Thanks for the heads up here! I suspect the extra error comes from the occasional underflow handling (which you avoid w/ the second_overflow_skip stuff which would help but feels a little clunky to me - but I'm still thinking it over). It seems to be something else as I can see it even when I remove advance_ticks(3, 4, 1); from tk_test.c so clock updates are aligned exactly with ticks and no underflow can happen (i.e. offset in timekeeping_apply_adjustment() is zero). I agree the skip_second_overflow flag in my patch is ugly, but it's necesssary as the code would otherwise take too long to correct the underflowed part in ntp error. Anyway, I did more testing and I think I found a more serious problem. It seems the loop doesn't handle well tick lengths which happen to be close to the middle between multipliers. For example: $ ./tk_test -n 1 -o 100077 samples: 1-1 reg: 1-1 slope: 1.00 dev: 1241.7 max: 3532.3 freq: 100.07717 When I add the following line to the kernel code to see the value of mult and ntp_error after clock update: +++ b/kernel/time/timekeeping.c @@ -1386,6 +1386,7 @@ void update_wall_time(void) /* correct the clock when NTP error is too big */ timekeeping_adjust(tk, offset); + printk(%d %lld\n, tk-mult, tk-ntp_error (tk-ntp_error_shift + tk-shift)); I get this: 8389447 -101 8389449 6 8389447 -321 8389448 -198 8389447 -249 ... 8389447 -6344 8389448 -6158 8389447 -6223 8389448 -6211 8389447 -6265 8389448 -6029 It looks like the correction is not able to handle the random cumulation of differences in the lengths between odd and even update intervals. The overall frequency is accurate, but ntp error is in microseconds here. Yea, in the freqadjust logic, we chose to do nothing if it was inbetween 0 to (interval/2). The problem being that interval/2 is too small target, and if we get too close a single unit adjustment may bounce us on either side of that range. Changed the comparision to being the 0-interval (inclusive) which should assure the approximation will land in that range. New patchset to follow shortly! thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: Continuing the sporadic work on improving the timekeeping frequency steering logic when NOHZ is enabled, I've made a number of changes to my re-implementation of Miroslav's patch (most recently posted here: https://lkml.org/lkml/2014/2/12/401 ), and I'm getting much closer results in the simulator. Thanks, in my initial testing it seems to be working well. The results from the simulator are much better than with the previous patch. Compared with Miroslav's patch, this avoids doing any extra divisions, and instead approximates the correction logarithmically. Hm, doesn't that basically make the code a software implementation of division? It seems it needs about 4-8 iterations to get to the final result. I didn't measure it, but I think with this change it now may be close or possibly even slower than my patch. The extra division and multiplication in my patch are used only when the tick length changes (normally once per second), otherwise the update is very cheap. Fair enough. I've moved to your division method for the last patch in my series (instead of looping). I'm still not 100% convinced we need it, since I'm still worried the error seen in the simulator with the default options is somewhat contrived. For example, since it regularly skips 4k ticks, that exaggerates the accumulated error in the approximation case. But we can discuss this further w/ the new patch series. thanks! -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote: > On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: > > It seems it still doesn't always switch mult only between the two > > closest values, which explains the slightly worse dev and max values. > Huh. I don't think I saw that in my testing. I'll look into it again. I can see it with tk_test -o 10, for instance. It's switching between 8389446, 8389447 and 8389448. > I suspect the extra error comes from the occasional underflow handling > (which you avoid w/ the second_overflow_skip stuff which would help but > feels a little clunky to me - but I'm still thinking it over). It seems to be something else as I can see it even when I remove "advance_ticks(3, 4, 1);" from tk_test.c so clock updates are aligned exactly with ticks and no underflow can happen (i.e. offset in timekeeping_apply_adjustment() is zero). I agree the skip_second_overflow flag in my patch is ugly, but it's necesssary as the code would otherwise take too long to correct the underflowed part in ntp error. Anyway, I did more testing and I think I found a more serious problem. It seems the loop doesn't handle well tick lengths which happen to be close to the middle between multipliers. For example: $ ./tk_test -n 1 -o 100077 samples: 1-1 reg: 1-1 slope: 1.00 dev: 1241.7 max: 3532.3 freq: 100.07717 When I add the following line to the kernel code to see the value of mult and ntp_error after clock update: +++ b/kernel/time/timekeeping.c @@ -1386,6 +1386,7 @@ void update_wall_time(void) /* correct the clock when NTP error is too big */ timekeeping_adjust(tk, offset); + printk("%d %lld\n", tk->mult, tk->ntp_error >> (tk->ntp_error_shift + tk->shift)); I get this: 8389447 -101 8389449 6 8389447 -321 8389448 -198 8389447 -249 ... 8389447 -6344 8389448 -6158 8389447 -6223 8389448 -6211 8389447 -6265 8389448 -6029 It looks like the correction is not able to handle the random cumulation of differences in the lengths between odd and even update intervals. The overall frequency is accurate, but ntp error is in microseconds here. Can you please look into this? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote: On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: It seems it still doesn't always switch mult only between the two closest values, which explains the slightly worse dev and max values. Huh. I don't think I saw that in my testing. I'll look into it again. I can see it with tk_test -o 10, for instance. It's switching between 8389446, 8389447 and 8389448. I suspect the extra error comes from the occasional underflow handling (which you avoid w/ the second_overflow_skip stuff which would help but feels a little clunky to me - but I'm still thinking it over). It seems to be something else as I can see it even when I remove advance_ticks(3, 4, 1); from tk_test.c so clock updates are aligned exactly with ticks and no underflow can happen (i.e. offset in timekeeping_apply_adjustment() is zero). I agree the skip_second_overflow flag in my patch is ugly, but it's necesssary as the code would otherwise take too long to correct the underflowed part in ntp error. Anyway, I did more testing and I think I found a more serious problem. It seems the loop doesn't handle well tick lengths which happen to be close to the middle between multipliers. For example: $ ./tk_test -n 1 -o 100077 samples: 1-1 reg: 1-1 slope: 1.00 dev: 1241.7 max: 3532.3 freq: 100.07717 When I add the following line to the kernel code to see the value of mult and ntp_error after clock update: +++ b/kernel/time/timekeeping.c @@ -1386,6 +1386,7 @@ void update_wall_time(void) /* correct the clock when NTP error is too big */ timekeeping_adjust(tk, offset); + printk(%d %lld\n, tk-mult, tk-ntp_error (tk-ntp_error_shift + tk-shift)); I get this: 8389447 -101 8389449 6 8389447 -321 8389448 -198 8389447 -249 ... 8389447 -6344 8389448 -6158 8389447 -6223 8389448 -6211 8389447 -6265 8389448 -6029 It looks like the correction is not able to handle the random cumulation of differences in the lengths between odd and even update intervals. The overall frequency is accurate, but ntp error is in microseconds here. Can you please look into this? Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/24/2014 07:04 PM, John Stultz wrote: > Continuing the sporadic work on improving the timekeeping > frequency steering logic when NOHZ is enabled, I've made a number > of changes to my re-implementation of Miroslav's patch (most > recently posted here: https://lkml.org/lkml/2014/2/12/401 ), > and I'm getting much closer results in the simulator. > > Compared with Miroslav's patch, this avoids doing any extra > divisions, and instead approximates the correction > logarithmically. In addition to trying to lower overhead, this > allows the other accounting bits to be managed in much the > same way we have been doing for awhile. > > This patch set contains one portion from Miroslav's patch > (ntp_tick caching) split out into a separate logical patch. > > Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc, > you'll need to revert b399fe355b30d01 to get the simulator building) > comparing this patchset to Miroslav's patch to show we're getting in > the right order of magnitude. > > Tested-by: Prarit Bhargava P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/24/2014 07:04 PM, John Stultz wrote: Continuing the sporadic work on improving the timekeeping frequency steering logic when NOHZ is enabled, I've made a number of changes to my re-implementation of Miroslav's patch (most recently posted here: https://lkml.org/lkml/2014/2/12/401 ), and I'm getting much closer results in the simulator. Compared with Miroslav's patch, this avoids doing any extra divisions, and instead approximates the correction logarithmically. In addition to trying to lower overhead, this allows the other accounting bits to be managed in much the same way we have been doing for awhile. This patch set contains one portion from Miroslav's patch (ntp_tick caching) split out into a separate logical patch. Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc, you'll need to revert b399fe355b30d01 to get the simulator building) comparing this patchset to Miroslav's patch to show we're getting in the right order of magnitude. Tested-by: Prarit Bhargava pra...@redhat.com P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: > On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: >> Continuing the sporadic work on improving the timekeeping >> frequency steering logic when NOHZ is enabled, I've made a number >> of changes to my re-implementation of Miroslav's patch (most >> recently posted here: https://lkml.org/lkml/2014/2/12/401 ), >> and I'm getting much closer results in the simulator. > Thanks, in my initial testing it seems to be working well. The results > from the simulator are much better than with the previous patch. > >> Compared with Miroslav's patch, this avoids doing any extra >> divisions, and instead approximates the correction >> logarithmically. > Hm, doesn't that basically make the code a software implementation of > division? It seems it needs about 4-8 iterations to get to the final > result. > > I didn't measure it, but I think with this change it now may be close > or possibly even slower than my patch. The extra division and > multiplication in my patch are used only when the tick length changes > (normally once per second), otherwise the update is very cheap. This point is quite true, though I'm really not convinced the last patch in the series is realistically needed. It does help the code get close to your implementation in comparisons w/ the simulator but I haven't had the chance to sort out what the real world impact is likely to be. That said, you're points about just setting the freq rather then trying to converge towards it to eliminate the second layer of correction oscillation is starting to resonate more with me. Thus I may very well switch to the division, but if so I also want to get the manipulations to the supporting accounting variables similarly documented as the scaled manipulation currently provides. > >> Miroslav's patch: >> - >> $ ./test1.sh >> freq10 freq100 dev max >> nohz on 0.00601 0.00028 74.0279.4 >> nohz off 0.05867 0.00204 0.2 0.6 >> This patchset: >> -- >> $ ./test1.sh >> freq10 freq100 dev max >> nohz on 0.00748 0.00076 110.8 476.4 >> nohz off 0.07173 0.03590 0.6 2.1 > This looks pretty good to me. It's interesting that the performance > with nohz off got worse, but when I modify the test to use more points > I can see it does converge to the correct frequency and probably it's > not a big problem. > > It seems it still doesn't always switch mult only between the two > closest values, which explains the slightly worse dev and max values. Huh. I don't think I saw that in my testing. I'll look into it again. I suspect the extra error comes from the occasional underflow handling (which you avoid w/ the second_overflow_skip stuff which would help but feels a little clunky to me - but I'm still thinking it over). thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/24/2014 07:04 PM, John Stultz wrote: > Continuing the sporadic work on improving the timekeeping > frequency steering logic when NOHZ is enabled, I've made a number > of changes to my re-implementation of Miroslav's patch (most > recently posted here: https://lkml.org/lkml/2014/2/12/401 ), > and I'm getting much closer results in the simulator. > > Compared with Miroslav's patch, this avoids doing any extra > divisions, and instead approximates the correction > logarithmically. In addition to trying to lower overhead, this > allows the other accounting bits to be managed in much the > same way we have been doing for awhile. > > This patch set contains one portion from Miroslav's patch > (ntp_tick caching) split out into a separate logical patch. > > Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc, > you'll need to revert b399fe355b30d01 to get the simulator building) > comparing this patchset to Miroslav's patch to show we're getting in > the right order of magnitude. > John, I'm going to put this into a weekend long test right away (along with another unrelated cpupower patch). I'll ping back Monday to let you know how it went. P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: > Continuing the sporadic work on improving the timekeeping > frequency steering logic when NOHZ is enabled, I've made a number > of changes to my re-implementation of Miroslav's patch (most > recently posted here: https://lkml.org/lkml/2014/2/12/401 ), > and I'm getting much closer results in the simulator. Thanks, in my initial testing it seems to be working well. The results from the simulator are much better than with the previous patch. > Compared with Miroslav's patch, this avoids doing any extra > divisions, and instead approximates the correction > logarithmically. Hm, doesn't that basically make the code a software implementation of division? It seems it needs about 4-8 iterations to get to the final result. I didn't measure it, but I think with this change it now may be close or possibly even slower than my patch. The extra division and multiplication in my patch are used only when the tick length changes (normally once per second), otherwise the update is very cheap. > Miroslav's patch: > - > $ ./test1.sh > freq10 freq100 dev max > nohz on 0.00601 0.00028 74.0279.4 > nohz off 0.05867 0.00204 0.2 0.6 > This patchset: > -- > $ ./test1.sh > freq10 freq100 dev max > nohz on 0.00748 0.00076 110.8 476.4 > nohz off 0.07173 0.03590 0.6 2.1 This looks pretty good to me. It's interesting that the performance with nohz off got worse, but when I modify the test to use more points I can see it does converge to the correct frequency and probably it's not a big problem. It seems it still doesn't always switch mult only between the two closest values, which explains the slightly worse dev and max values. Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: Continuing the sporadic work on improving the timekeeping frequency steering logic when NOHZ is enabled, I've made a number of changes to my re-implementation of Miroslav's patch (most recently posted here: https://lkml.org/lkml/2014/2/12/401 ), and I'm getting much closer results in the simulator. Thanks, in my initial testing it seems to be working well. The results from the simulator are much better than with the previous patch. Compared with Miroslav's patch, this avoids doing any extra divisions, and instead approximates the correction logarithmically. Hm, doesn't that basically make the code a software implementation of division? It seems it needs about 4-8 iterations to get to the final result. I didn't measure it, but I think with this change it now may be close or possibly even slower than my patch. The extra division and multiplication in my patch are used only when the tick length changes (normally once per second), otherwise the update is very cheap. Miroslav's patch: - $ ./test1.sh freq10 freq100 dev max nohz on 0.00601 0.00028 74.0279.4 nohz off 0.05867 0.00204 0.2 0.6 This patchset: -- $ ./test1.sh freq10 freq100 dev max nohz on 0.00748 0.00076 110.8 476.4 nohz off 0.07173 0.03590 0.6 2.1 This looks pretty good to me. It's interesting that the performance with nohz off got worse, but when I modify the test to use more points I can see it does converge to the correct frequency and probably it's not a big problem. It seems it still doesn't always switch mult only between the two closest values, which explains the slightly worse dev and max values. Thanks, -- Miroslav Lichvar -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/24/2014 07:04 PM, John Stultz wrote: Continuing the sporadic work on improving the timekeeping frequency steering logic when NOHZ is enabled, I've made a number of changes to my re-implementation of Miroslav's patch (most recently posted here: https://lkml.org/lkml/2014/2/12/401 ), and I'm getting much closer results in the simulator. Compared with Miroslav's patch, this avoids doing any extra divisions, and instead approximates the correction logarithmically. In addition to trying to lower overhead, this allows the other accounting bits to be managed in much the same way we have been doing for awhile. This patch set contains one portion from Miroslav's patch (ntp_tick caching) split out into a separate logical patch. Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc, you'll need to revert b399fe355b30d01 to get the simulator building) comparing this patchset to Miroslav's patch to show we're getting in the right order of magnitude. John, I'm going to put this into a weekend long test right away (along with another unrelated cpupower patch). I'll ping back Monday to let you know how it went. P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
On 04/25/2014 07:04 AM, Miroslav Lichvar wrote: On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote: Continuing the sporadic work on improving the timekeeping frequency steering logic when NOHZ is enabled, I've made a number of changes to my re-implementation of Miroslav's patch (most recently posted here: https://lkml.org/lkml/2014/2/12/401 ), and I'm getting much closer results in the simulator. Thanks, in my initial testing it seems to be working well. The results from the simulator are much better than with the previous patch. Compared with Miroslav's patch, this avoids doing any extra divisions, and instead approximates the correction logarithmically. Hm, doesn't that basically make the code a software implementation of division? It seems it needs about 4-8 iterations to get to the final result. I didn't measure it, but I think with this change it now may be close or possibly even slower than my patch. The extra division and multiplication in my patch are used only when the tick length changes (normally once per second), otherwise the update is very cheap. This point is quite true, though I'm really not convinced the last patch in the series is realistically needed. It does help the code get close to your implementation in comparisons w/ the simulator but I haven't had the chance to sort out what the real world impact is likely to be. That said, you're points about just setting the freq rather then trying to converge towards it to eliminate the second layer of correction oscillation is starting to resonate more with me. Thus I may very well switch to the division, but if so I also want to get the manipulations to the supporting accounting variables similarly documented as the scaled manipulation currently provides. Miroslav's patch: - $ ./test1.sh freq10 freq100 dev max nohz on 0.00601 0.00028 74.0279.4 nohz off 0.05867 0.00204 0.2 0.6 This patchset: -- $ ./test1.sh freq10 freq100 dev max nohz on 0.00748 0.00076 110.8 476.4 nohz off 0.07173 0.03590 0.6 2.1 This looks pretty good to me. It's interesting that the performance with nohz off got worse, but when I modify the test to use more points I can see it does converge to the correct frequency and probably it's not a big problem. It seems it still doesn't always switch mult only between the two closest values, which explains the slightly worse dev and max values. Huh. I don't think I saw that in my testing. I'll look into it again. I suspect the extra error comes from the occasional underflow handling (which you avoid w/ the second_overflow_skip stuff which would help but feels a little clunky to me - but I'm still thinking it over). thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/