Joe Perches <j...@perches.com> writes:

> On Fri, 2015-09-04 at 18:00 -0700, John Stultz wrote:
>> On Fri, Sep 4, 2015 at 5:57 PM, John Stultz <john.stu...@linaro.org> wrote:
>> > On Thu, Sep 3, 2015 at 4:26 AM, Miroslav Lichvar <mlich...@redhat.com> wrote:
>> >> On Wed, Sep 02, 2015 at 04:16:00PM -0700, John Stultz wrote:
>> >>> On Tue, Sep 1, 2015 at 6:14 PM, Nuno Gonçalves <nuno...@gmail.com> wrote:
>> >>> > And just installing chrony from the feeds. With any kernel from 3.17
>> >>> > you'll have wrong estimates at chronyc sourcestats.
>> >>>
>> >>> Wrong estimates? Could you be more specific about what the failure
>> >>> you're seeing is here? The
>> >>>
>> >>> I installed the image above, which comes with a 4.1.6 kernel, and
>> >>> chrony seems to have gotten my BBB into ~1ms sync w/ servers over the
>> >>> internet fairly quickly (at least according to chronyc tracking).
>> >>
>> >> To see the bug with chronyd the initial offset shouldn't be very close
>> >> to zero, so it's forced to correct the offset by adjusting the
>> >> frequency in a larger step.
>> >>
>> >> I'm attaching a simple C program that prints the frequency offset
>> >> as measured between the REALTIME and MONOTONIC_RAW clocks when the
>> >> adjtimex tick is set to 9000. It should show values close to -100000
>> >> ppm and I suspect on the BBB it will be much smaller.
>> >
>> > So I spent some time on this late last night and this afternoon.
>> >
>> > It was a little odd because things don't seem totally broken, but
>> > something isn't quite right.
>> >
>> > Digging around it seems the iterative logarithmic approximation done in
>> > timekeeping_freqadjust() wasn't working right. Instead of making
>> > smaller order alternating positive and negative adjustments, it was
>> > doing strange growing adjustments for the same value that weren't large
>> > enough to actually correct things very quickly. This made it much
>> > slower to adapt to specified frequency values.
>> >
>> > The odd bit is, it seems to come down to:
>> >     tick_error = abs(tick_error);
>> >
>> > Haven't chased down why yet, but apparently abs() isn't doing what one
>> > would think when passed a s64 value.
>> 
>> Well.. chasing it down wasn't hard.. from include/linux/kernel.h:
>> /*
>>  * abs() handles unsigned and signed longs, ints, shorts and chars.  For all
>>  * input types abs() returns a signed long.
>>  * abs() should not be used for 64-bit types (s64, u64, long long) - use abs64()
>>  * for those.
>>  */
>> 
>> Ouch.
>
> Here's a little cocci script that finds more of these in:

Thanks.

Maybe we should also:

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 5582410727cb..aa7d69afdcac 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -208,6 +208,7 @@ extern int _cond_resched(void);
  */
 #define abs(x) ({                                              \
                long ret;                                       \
+               BUILD_BUG_ON(sizeof(x) > sizeof(long));         \
                if (sizeof(x) == sizeof(long)) {                \
                        long __x = (x);                         \
                        ret = (__x < 0) ? -__x : __x;           \


so that people won't make the same mistake again.
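
For anyone who wants to try the idea outside the kernel tree, here is a
rough userspace sketch of the same guard.  It is not the kernel macro:
checked_abs() is a made-up name, _Static_assert stands in for
BUILD_BUG_ON(), and the real abs() handles narrower types in a separate
branch that is left out here.  It assumes GCC or Clang, since it relies
on the ({ ... }) statement-expression extension.

```c
/*
 * Userspace sketch only (hypothetical name, not the kernel's abs()).
 * Refuses to compile when the argument is wider than long, instead of
 * silently truncating it.  Needs GCC or Clang for ({ ... }).
 */
#define checked_abs(x) ({                                             \
        _Static_assert(sizeof(x) <= sizeof(long),                     \
                       "checked_abs(): argument wider than long");    \
        long v_ = (x);          /* safe: x fits in a long */          \
        v_ < 0 ? -v_ : v_;                                            \
})
```

On a target where long is 32 bits, checked_abs() of an s64 or long long
argument should then fail to build.  On LP64 targets long long is no
wider than long, so the same call compiles and gives the right answer.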
That check finds bugs in
 drivers/md/raid10.c
 drivers/gpu/drm/radeon/radeon_display.c
 kernel/time/clocksource.c
 kernel/time/timekeeping.c
 fs/ext4/mballoc.c

that your cocci script missed.  All "abs(x - y)".
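
To make the failure mode concrete, here is a small userspace sketch of
what happens to a 64-bit difference when it is squeezed through a 32-bit
temporary, which is roughly what abs() does to an s64 argument on a
32-bit kernel.  The helper names abs_via_int32() and abs_s64() are made
up for illustration, and int32_t stands in for the kernel's 32-bit
temporary.  The narrowing conversion is implementation-defined in C; the
comments assume the usual wrap-around behaviour of GCC and Clang.

```c
#include <stdint.h>

/*
 * Hypothetical helper: mimics taking abs() of a 64-bit value through a
 * 32-bit temporary.  The upper 32 bits are dropped before the sign test.
 */
static int64_t abs_via_int32(int64_t x)
{
        int32_t t = (int32_t)x;         /* truncation happens here */

        /* Widen before negating so INT32_MIN cannot overflow. */
        return t < 0 ? -(int64_t)t : (int64_t)t;
}

/* Hypothetical helper: the 64-bit-safe version, in the spirit of abs64(). */
static int64_t abs_s64(int64_t x)
{
        return x < 0 ? -x : x;
}
```

With x - y = -5000000000, abs_via_int32() returns 705032704 while
abs_s64() returns 5000000000.  Small differences such as -42 come out
the same either way, which is why this kind of bug can go unnoticed.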

As sector_t can be either 32-bit or 64-bit, I wonder if abs_sector()
would be a good idea ... probably not.

Thoughts?

NeilBrown
