Greetings,

I just hunted down a time-related bug which caused the Linux internal
xtime to drift away from the precise hardware clock provided by the
TOD clock found in the s390 architecture.

After a long search I came across this lovely piece of code in
kernel/time/timekeeping.c:

#ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
static inline void old_vsyscall_fixup(struct timekeeper *tk)
{
        s64 remainder;

        /*
        * Store only full nanoseconds into xtime_nsec after rounding
        * it up and add the remainder to the error difference.
        * XXX - This is necessary to avoid small 1ns inconsistnecies caused
        * by truncating the remainder in vsyscalls. However, it causes
        * additional work to be done in timekeeping_adjust(). Once
        * the vsyscall implementations are converted to use xtime_nsec
        * (shifted nanoseconds), and CONFIG_GENERIC_TIME_VSYSCALL_OLD
        * users are removed, this can be killed.
        */
        remainder = tk->xtime_nsec & ((1ULL << tk->shift) - 1);
        tk->xtime_nsec -= remainder;
        tk->xtime_nsec += 1ULL << tk->shift;
        tk->ntp_error += remainder << tk->ntp_error_shift;

}
#else
#define old_vsyscall_fixup(tk)
#endif

The highly precise result of our TOD clock source ends up in
tk->xtime_sec / tk->xtime_nsec, where old_vsyscall_fixup just rounds
it up to the next nanosecond (booo). To add insult to injury, an
incorrect delta gets added to ntp_error: xtime has been moved forward by
((1ULL << tk->shift) - (tk->xtime_nsec & ((1ULL << tk->shift) - 1)))
and not set back by (tk->xtime_nsec & ((1ULL << tk->shift) - 1)).
The two differ by exactly (1ULL << tk->shift), so xtime is too fast
by one nanosecond per tick. To verify that this is indeed the problem
I removed the line that adds the nanosecond to xtime_nsec, and voila,
the clocks are in sync.
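
To make the arithmetic concrete, here is a small user-space sketch of
the same fixup. It is not kernel code: struct tk_model and the
function names are made up for illustration, and only the four
statements inside model_fixup() mirror the quoted function.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few struct timekeeper fields used here. */
struct tk_model {
        uint64_t xtime_nsec;      /* nanoseconds, left-shifted by shift */
        int64_t  ntp_error;       /* error, scaled by ntp_error_shift */
        uint32_t shift;
        uint32_t ntp_error_shift;
};

/* Same arithmetic as the old_vsyscall_fixup() quoted above. */
static void model_fixup(struct tk_model *tk)
{
        int64_t remainder;

        remainder = (int64_t)(tk->xtime_nsec & ((1ULL << tk->shift) - 1));
        tk->xtime_nsec -= (uint64_t)remainder;
        tk->xtime_nsec += 1ULL << tk->shift;
        tk->ntp_error += remainder << tk->ntp_error_shift;
}

/* Run the fixup and return how far xtime_nsec moved (shifted ns). */
static int64_t model_fixup_delta(struct tk_model *tk)
{
        uint64_t before = tk->xtime_nsec;

        model_fixup(tk);
        return (int64_t)(tk->xtime_nsec - before);
}
```

With shift = 8, ntp_error_shift = 0 and xtime_nsec = (5 << 8) + 3,
the remainder is 3: xtime_nsec moves forward by 256 - 3 = 253 while
ntp_error is credited with 3, as if xtime had been set back. The two
numbers add up to 256, i.e. exactly one nanosecond in shifted units,
which is the per-tick discrepancy described above.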

A possible patch to fix this would be:

--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1347,6 +1347,7 @@ static inline void old_vsyscall_fixup(struct timekeeper *tk)
        tk->xtime_nsec -= remainder;
        tk->xtime_nsec += 1ULL << tk->shift;
        tk->ntp_error += remainder << tk->ntp_error_shift;
+       tk->ntp_error -= (1ULL << tk->shift) << tk->ntp_error_shift;
 
 }
 #else

But that has the downside that it creates a negative ntp_error that
can only be corrected with an adjustment of tk->mult, which takes a
long time.
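
The effect of the extra line can be checked with a user-space sketch
of the patched arithmetic. struct tk_sketch and patched_fixup() are
hypothetical names used only for this illustration; the casts are
there to keep the signed arithmetic explicit outside the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct timekeeper fields involved. */
struct tk_sketch {
        uint64_t xtime_nsec;      /* nanoseconds, left-shifted by shift */
        int64_t  ntp_error;       /* error, scaled by ntp_error_shift */
        uint32_t shift;
        uint32_t ntp_error_shift;
};

/*
 * Fixup with the additional ntp_error subtraction from the patch.
 * Returns how far xtime_nsec moved forward (shifted ns).
 */
static int64_t patched_fixup(struct tk_sketch *tk)
{
        uint64_t before = tk->xtime_nsec;
        int64_t one_ns = (int64_t)(1ULL << tk->shift);
        int64_t remainder;

        remainder = (int64_t)(tk->xtime_nsec & ((1ULL << tk->shift) - 1));
        tk->xtime_nsec -= (uint64_t)remainder;
        tk->xtime_nsec += (uint64_t)one_ns;
        tk->ntp_error += remainder << tk->ntp_error_shift;
        tk->ntp_error -= one_ns << tk->ntp_error_shift;

        return (int64_t)(tk->xtime_nsec - before);
}
```

Starting from ntp_error = 0, the result is ntp_error equal to minus
the forward movement of xtime (scaled by ntp_error_shift), so the
accounting is consistent but ntp_error goes negative on every tick
with a non-zero rounding step.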

The fix I am going to use is to convert s390 to GENERIC_TIME_VSYSCALL;
you might want to think about doing that for powerpc and ia64 as well.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
