Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
I successfully tested these four patches on a HiKey (ARMv8) board using a 4.9 kernel. I cherry picked various commits from upstream that touched kernel/time/timekeeping.c to ensure that John's patches apply cleanly. The inconsistency-check from kselftest that previously failed because of CLOCK_MONOTONIC_RAW jumping backwards by 1 ns is now passing. Tested-by: Daniel MentzOn Tue, May 30, 2017 at 5:10 PM, Daniel Mentz wrote: > I successfully tested these four patches on a HiKey (ARMv8) board using a > 4.9 kernel. I cherry picked various commits from upstream that touched > kernel/time/timekeeping.c to ensure that John's patches apply cleanly. The > inconsistency-check from kselftest that previously failed because of > CLOCK_MONOTONIC_RAW jumping backwards by 1 ns is now passing. > > Tested-by: Daniel Mentz > > On Fri, May 26, 2017 at 8:33 PM, John Stultz wrote: >> >> As part of the Linaro Linux Kernel Functional Test (LKFT) >> effort, test failures from kselftest/timer's >> inconsistency-check were reported connected to >> CLOCK_MONOTONIC_RAW, on the HiKey platform. >> >> Digging in I found that an old issue with how sub-ns accounting >> is handled with the RAW time which was fixed long ago with the >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was >> present. >> >> Additionally, running further tests, I uncovered an issue with >> how the clocksource read function is handled when clocksources >> are changed, which can cause crashes. >> >> Both of these issues have not been uncovered in x86 based >> testing due to x86 not using vDSO to accelerate >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer >> clocksource being fast to access but incrementing slowly enough >> to get multiple reads using the same counter value (which helps >> uncover time handing issues), along with the fact that none of >> the x86 clocksources making use of the clocksource argument >> passed to the read function. >> >> This patchset addresses these two issues. >> >> Thanks so much to help from Will Deacon in getting the needed >> adjustments to the arm64 vDSO in place. Also thanks to Daniel >> Mentz who also properly diagnosed the MONOTONIC_RAW issue in >> parallel while I was woking on this patchset. >> >> I wanted to send these out for some initial review. I'm >> unfortunately still chasing a third issue (related to >> inconsistencies triggered during extreme freq adjustments) I've >> uncovered on HiKey, which doesn't seem to be related to the >> previous two. >> >> As always feedback would be appreciated! >> >> thanks >> -john >> >> Cc: Thomas Gleixner >> Cc: Ingo Molnar >> Cc: Miroslav Lichvar >> Cc: Richard Cochran >> Cc: Prarit Bhargava >> Cc: Stephen Boyd >> Cc: Daniel Mentz >> >> >> John Stultz (3): >> time: Fix clock->read(clock) race around clocksource changes >> time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting >> time: Clean up CLOCK_MONOTONIC_RAW time handling >> >> Will Deacon (1): >> arm64: vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW >> >> arch/arm64/kernel/vdso.c | 5 +- >> arch/arm64/kernel/vdso/gettimeofday.S | 1 - >> include/linux/timekeeper_internal.h | 8 +-- >> kernel/time/timekeeping.c | 96 >> ++- >> 4 files changed, 66 insertions(+), 44 deletions(-) >> >> -- >> 2.7.4 >> >
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
I successfully tested these four patches on a HiKey (ARMv8) board using a 4.9 kernel. I cherry picked various commits from upstream that touched kernel/time/timekeeping.c to ensure that John's patches apply cleanly. The inconsistency-check from kselftest that previously failed because of CLOCK_MONOTONIC_RAW jumping backwards by 1 ns is now passing. Tested-by: Daniel Mentz On Tue, May 30, 2017 at 5:10 PM, Daniel Mentz wrote: > I successfully tested these four patches on a HiKey (ARMv8) board using a > 4.9 kernel. I cherry picked various commits from upstream that touched > kernel/time/timekeeping.c to ensure that John's patches apply cleanly. The > inconsistency-check from kselftest that previously failed because of > CLOCK_MONOTONIC_RAW jumping backwards by 1 ns is now passing. > > Tested-by: Daniel Mentz > > On Fri, May 26, 2017 at 8:33 PM, John Stultz wrote: >> >> As part of the Linaro Linux Kernel Functional Test (LKFT) >> effort, test failures from kselftest/timer's >> inconsistency-check were reported connected to >> CLOCK_MONOTONIC_RAW, on the HiKey platform. >> >> Digging in I found that an old issue with how sub-ns accounting >> is handled with the RAW time which was fixed long ago with the >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was >> present. >> >> Additionally, running further tests, I uncovered an issue with >> how the clocksource read function is handled when clocksources >> are changed, which can cause crashes. >> >> Both of these issues have not been uncovered in x86 based >> testing due to x86 not using vDSO to accelerate >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer >> clocksource being fast to access but incrementing slowly enough >> to get multiple reads using the same counter value (which helps >> uncover time handing issues), along with the fact that none of >> the x86 clocksources making use of the clocksource argument >> passed to the read function. >> >> This patchset addresses these two issues. >> >> Thanks so much to help from Will Deacon in getting the needed >> adjustments to the arm64 vDSO in place. Also thanks to Daniel >> Mentz who also properly diagnosed the MONOTONIC_RAW issue in >> parallel while I was woking on this patchset. >> >> I wanted to send these out for some initial review. I'm >> unfortunately still chasing a third issue (related to >> inconsistencies triggered during extreme freq adjustments) I've >> uncovered on HiKey, which doesn't seem to be related to the >> previous two. >> >> As always feedback would be appreciated! >> >> thanks >> -john >> >> Cc: Thomas Gleixner >> Cc: Ingo Molnar >> Cc: Miroslav Lichvar >> Cc: Richard Cochran >> Cc: Prarit Bhargava >> Cc: Stephen Boyd >> Cc: Daniel Mentz >> >> >> John Stultz (3): >> time: Fix clock->read(clock) race around clocksource changes >> time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting >> time: Clean up CLOCK_MONOTONIC_RAW time handling >> >> Will Deacon (1): >> arm64: vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW >> >> arch/arm64/kernel/vdso.c | 5 +- >> arch/arm64/kernel/vdso/gettimeofday.S | 1 - >> include/linux/timekeeper_internal.h | 8 +-- >> kernel/time/timekeeping.c | 96 >> ++- >> 4 files changed, 66 insertions(+), 44 deletions(-) >> >> -- >> 2.7.4 >> >
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
* John Stultzwrote: > On Sat, May 27, 2017 at 12:38 AM, Ingo Molnar wrote: > > > > * John Stultz wrote: > > > >> As part of the Linaro Linux Kernel Functional Test (LKFT) > >> effort, test failures from kselftest/timer's > >> inconsistency-check were reported connected to > >> CLOCK_MONOTONIC_RAW, on the HiKey platform. > >> > >> Digging in I found that an old issue with how sub-ns accounting > >> is handled with the RAW time which was fixed long ago with the > >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was > >> present. > >> > >> Additionally, running further tests, I uncovered an issue with > >> how the clocksource read function is handled when clocksources > >> are changed, which can cause crashes. > >> > >> Both of these issues have not been uncovered in x86 based > >> testing due to x86 not using vDSO to accelerate > >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer > >> clocksource being fast to access but incrementing slowly enough > >> to get multiple reads using the same counter value (which helps > >> uncover time handing issues), along with the fact that none of > >> the x86 clocksources making use of the clocksource argument > >> passed to the read function. > >> > >> This patchset addresses these two issues. > > > > AFAICS only the first two patches are fixes, the other two patches are > > cleanups/simplifications that resulted out of the debugging effort, right? > > Actually the first three are fixes (ARM64 still sees discontinuities > until the vDSO is fixed), the last one is a cleanup. Ok, please make it more apparent in the changelog of the third patch what user observable mis-behavior is actually fixed by it: corrupted time, non-monotonic time, or something else? Thanks, Ingo
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
* John Stultz wrote: > On Sat, May 27, 2017 at 12:38 AM, Ingo Molnar wrote: > > > > * John Stultz wrote: > > > >> As part of the Linaro Linux Kernel Functional Test (LKFT) > >> effort, test failures from kselftest/timer's > >> inconsistency-check were reported connected to > >> CLOCK_MONOTONIC_RAW, on the HiKey platform. > >> > >> Digging in I found that an old issue with how sub-ns accounting > >> is handled with the RAW time which was fixed long ago with the > >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was > >> present. > >> > >> Additionally, running further tests, I uncovered an issue with > >> how the clocksource read function is handled when clocksources > >> are changed, which can cause crashes. > >> > >> Both of these issues have not been uncovered in x86 based > >> testing due to x86 not using vDSO to accelerate > >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer > >> clocksource being fast to access but incrementing slowly enough > >> to get multiple reads using the same counter value (which helps > >> uncover time handing issues), along with the fact that none of > >> the x86 clocksources making use of the clocksource argument > >> passed to the read function. > >> > >> This patchset addresses these two issues. > > > > AFAICS only the first two patches are fixes, the other two patches are > > cleanups/simplifications that resulted out of the debugging effort, right? > > Actually the first three are fixes (ARM64 still sees discontinuities > until the vDSO is fixed), the last one is a cleanup. Ok, please make it more apparent in the changelog of the third patch what user observable mis-behavior is actually fixed by it: corrupted time, non-monotonic time, or something else? Thanks, Ingo
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
On Sat, May 27, 2017 at 12:38 AM, Ingo Molnarwrote: > > * John Stultz wrote: > >> As part of the Linaro Linux Kernel Functional Test (LKFT) >> effort, test failures from kselftest/timer's >> inconsistency-check were reported connected to >> CLOCK_MONOTONIC_RAW, on the HiKey platform. >> >> Digging in I found that an old issue with how sub-ns accounting >> is handled with the RAW time which was fixed long ago with the >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was >> present. >> >> Additionally, running further tests, I uncovered an issue with >> how the clocksource read function is handled when clocksources >> are changed, which can cause crashes. >> >> Both of these issues have not been uncovered in x86 based >> testing due to x86 not using vDSO to accelerate >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer >> clocksource being fast to access but incrementing slowly enough >> to get multiple reads using the same counter value (which helps >> uncover time handing issues), along with the fact that none of >> the x86 clocksources making use of the clocksource argument >> passed to the read function. >> >> This patchset addresses these two issues. > > AFAICS only the first two patches are fixes, the other two patches are > cleanups/simplifications that resulted out of the debugging effort, right? Actually the first three are fixes (ARM64 still sees discontinuities until the vDSO is fixed), the last one is a cleanup. thanks -john
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
On Sat, May 27, 2017 at 12:38 AM, Ingo Molnar wrote: > > * John Stultz wrote: > >> As part of the Linaro Linux Kernel Functional Test (LKFT) >> effort, test failures from kselftest/timer's >> inconsistency-check were reported connected to >> CLOCK_MONOTONIC_RAW, on the HiKey platform. >> >> Digging in I found that an old issue with how sub-ns accounting >> is handled with the RAW time which was fixed long ago with the >> CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was >> present. >> >> Additionally, running further tests, I uncovered an issue with >> how the clocksource read function is handled when clocksources >> are changed, which can cause crashes. >> >> Both of these issues have not been uncovered in x86 based >> testing due to x86 not using vDSO to accelerate >> CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer >> clocksource being fast to access but incrementing slowly enough >> to get multiple reads using the same counter value (which helps >> uncover time handing issues), along with the fact that none of >> the x86 clocksources making use of the clocksource argument >> passed to the read function. >> >> This patchset addresses these two issues. > > AFAICS only the first two patches are fixes, the other two patches are > cleanups/simplifications that resulted out of the debugging effort, right? Actually the first three are fixes (ARM64 still sees discontinuities until the vDSO is fixed), the last one is a cleanup. thanks -john
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
* John Stultzwrote: > As part of the Linaro Linux Kernel Functional Test (LKFT) > effort, test failures from kselftest/timer's > inconsistency-check were reported connected to > CLOCK_MONOTONIC_RAW, on the HiKey platform. > > Digging in I found that an old issue with how sub-ns accounting > is handled with the RAW time which was fixed long ago with the > CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was > present. > > Additionally, running further tests, I uncovered an issue with > how the clocksource read function is handled when clocksources > are changed, which can cause crashes. > > Both of these issues have not been uncovered in x86 based > testing due to x86 not using vDSO to accelerate > CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer > clocksource being fast to access but incrementing slowly enough > to get multiple reads using the same counter value (which helps > uncover time handing issues), along with the fact that none of > the x86 clocksources making use of the clocksource argument > passed to the read function. > > This patchset addresses these two issues. AFAICS only the first two patches are fixes, the other two patches are cleanups/simplifications that resulted out of the debugging effort, right? Thanks, Ingo
Re: [RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
* John Stultz wrote: > As part of the Linaro Linux Kernel Functional Test (LKFT) > effort, test failures from kselftest/timer's > inconsistency-check were reported connected to > CLOCK_MONOTONIC_RAW, on the HiKey platform. > > Digging in I found that an old issue with how sub-ns accounting > is handled with the RAW time which was fixed long ago with the > CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was > present. > > Additionally, running further tests, I uncovered an issue with > how the clocksource read function is handled when clocksources > are changed, which can cause crashes. > > Both of these issues have not been uncovered in x86 based > testing due to x86 not using vDSO to accelerate > CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer > clocksource being fast to access but incrementing slowly enough > to get multiple reads using the same counter value (which helps > uncover time handing issues), along with the fact that none of > the x86 clocksources making use of the clocksource argument > passed to the read function. > > This patchset addresses these two issues. AFAICS only the first two patches are fixes, the other two patches are cleanups/simplifications that resulted out of the debugging effort, right? Thanks, Ingo
[RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
As part of the Linaro Linux Kernel Functional Test (LKFT) effort, test failures from kselftest/timer's inconsistency-check were reported connected to CLOCK_MONOTONIC_RAW, on the HiKey platform. Digging in I found that an old issue with how sub-ns accounting is handled with the RAW time which was fixed long ago with the CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was present. Additionally, running further tests, I uncovered an issue with how the clocksource read function is handled when clocksources are changed, which can cause crashes. Both of these issues have not been uncovered in x86 based testing due to x86 not using vDSO to accelerate CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer clocksource being fast to access but incrementing slowly enough to get multiple reads using the same counter value (which helps uncover time handing issues), along with the fact that none of the x86 clocksources making use of the clocksource argument passed to the read function. This patchset addresses these two issues. Thanks so much to help from Will Deacon in getting the needed adjustments to the arm64 vDSO in place. Also thanks to Daniel Mentz who also properly diagnosed the MONOTONIC_RAW issue in parallel while I was woking on this patchset. I wanted to send these out for some initial review. I'm unfortunately still chasing a third issue (related to inconsistencies triggered during extreme freq adjustments) I've uncovered on HiKey, which doesn't seem to be related to the previous two. As always feedback would be appreciated! thanks -john Cc: Thomas GleixnerCc: Ingo Molnar Cc: Miroslav Lichvar Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Daniel Mentz John Stultz (3): time: Fix clock->read(clock) race around clocksource changes time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Clean up CLOCK_MONOTONIC_RAW time handling Will Deacon (1): arm64: vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW arch/arm64/kernel/vdso.c | 5 +- arch/arm64/kernel/vdso/gettimeofday.S | 1 - include/linux/timekeeper_internal.h | 8 +-- kernel/time/timekeeping.c | 96 ++- 4 files changed, 66 insertions(+), 44 deletions(-) -- 2.7.4
[RFC][PATCH 0/4] Fixes for two recently found timekeeping bugs
As part of the Linaro Linux Kernel Functional Test (LKFT) effort, test failures from kselftest/timer's inconsistency-check were reported connected to CLOCK_MONOTONIC_RAW, on the HiKey platform. Digging in I found that an old issue with how sub-ns accounting is handled with the RAW time which was fixed long ago with the CLOCK_MONOTONIC/REALTIME ids, but missed with RAW time, was present. Additionally, running further tests, I uncovered an issue with how the clocksource read function is handled when clocksources are changed, which can cause crashes. Both of these issues have not been uncovered in x86 based testing due to x86 not using vDSO to accelerate CLOCK_MONOTONIC_RAW, combined with the HiKey's arch_timer clocksource being fast to access but incrementing slowly enough to get multiple reads using the same counter value (which helps uncover time handing issues), along with the fact that none of the x86 clocksources making use of the clocksource argument passed to the read function. This patchset addresses these two issues. Thanks so much to help from Will Deacon in getting the needed adjustments to the arm64 vDSO in place. Also thanks to Daniel Mentz who also properly diagnosed the MONOTONIC_RAW issue in parallel while I was woking on this patchset. I wanted to send these out for some initial review. I'm unfortunately still chasing a third issue (related to inconsistencies triggered during extreme freq adjustments) I've uncovered on HiKey, which doesn't seem to be related to the previous two. As always feedback would be appreciated! thanks -john Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Miroslav Lichvar Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Daniel Mentz John Stultz (3): time: Fix clock->read(clock) race around clocksource changes time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Clean up CLOCK_MONOTONIC_RAW time handling Will Deacon (1): arm64: vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW arch/arm64/kernel/vdso.c | 5 +- arch/arm64/kernel/vdso/gettimeofday.S | 1 - include/linux/timekeeper_internal.h | 8 +-- kernel/time/timekeeping.c | 96 ++- 4 files changed, 66 insertions(+), 44 deletions(-) -- 2.7.4