On 10 January 2018 at 09:15, Crt Mori <[email protected]> wrote:
> On 9 January 2018 at 20:23, Joe Perches <[email protected]> wrote:
>> On Tue, 2018-01-09 at 16:18 +0100, Crt Mori wrote:
>>> There is no option to perform a 64-bit integer sqrt on 32-bit platforms.
>>> Adding a strongly typed int_sqrt64 enables 64-bit calculations to
>>> be performed on 32-bit platforms. Using the same algorithm as int_sqrt()
>>> with strong typing provides enough precision on 32-bit platforms as well,
>>> but it sacrifices some performance.
>> []
>>> diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c
>> []
>>> @@ -36,3 +37,34 @@ unsigned long int_sqrt(unsigned long x)
>>>       return y;
>>>  }
>>>  EXPORT_SYMBOL(int_sqrt);
>>> +
>>> +#if BITS_PER_LONG < 64
>>> +/**
>>> + * int_sqrt64 - strongly typed int_sqrt function when minimum 64 bit input
>>> + * is expected.
>>> + * @x: 64bit integer of which to calculate the sqrt
>>> + */
>>> +u32 int_sqrt64(u64 x)
>>> +{
>>> +     u64 b, m;
>>> +     u32 y = 0;
>>> +
>>> +     if (x <= 1)
>>> +             return x;
>>
>> I think this should instead be:
>>
>>         if (x <= INT_MAX)
>>                 return int_sqrt((int)x);
>>
>> to reduce the loop cost below when the
>> value is small enough.
>>
>
> In the existing int_sqrt it's only 1, and I assume that is more to
> protect against running the loop with 0 or 1. Since there is no
> difference from int_sqrt (except fls64), I assume there is no need to
> call it to avoid the loop?
>

Never mind, I see what you mean (I should have thought longer before
writing). The loop below is costly because 64-bit arithmetic is not
native on 32-bit platforms, so we could just use the 32-bit calculation
when the value is small enough. Will send v13 with a fix for this.
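
To make the idea concrete, here is a minimal userspace sketch of such a fast path. It is not the v13 patch: isqrt32() and isqrt64() are hypothetical names, the threshold UINT32_MAX is an assumption (the suggestion above used INT_MAX), and the starting mask is found with a plain loop instead of fls64() so the example builds outside the kernel.

```c
#include <stdint.h>

/* Hypothetical helper: shift-and-subtract sqrt using 32-bit arithmetic only. */
static uint32_t isqrt32(uint32_t x)
{
	uint32_t b, m, y = 0;

	if (x <= 1)
		return x;

	/* Largest power of four that is <= x. */
	m = UINT32_C(1) << 30;
	while (m > x)
		m >>= 2;

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	return y;
}

/* Hypothetical 64-bit entry point: small inputs never touch 64-bit math. */
static uint32_t isqrt64(uint64_t x)
{
	/* y is 64-bit here because it temporarily holds values as large as m. */
	uint64_t b, m, y = 0;

	if (x <= UINT32_MAX)
		return isqrt32((uint32_t)x);

	/* Largest power of four that is <= x. */
	m = UINT64_C(1) << 62;
	while (m > x)
		m >>= 2;

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	/* floor(sqrt(x)) always fits in 32 bits for a 64-bit x. */
	return (uint32_t)y;
}
```

With this split, only inputs above UINT32_MAX pay for the emulated 64-bit additions, subtractions, and shifts.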

>>> +
>>> +     m = 1ULL << (fls64(x) & ~1ULL);
>>> +     while (m != 0) {
>>> +             b = y + m;
>>> +             y >>= 1;
>>> +
>>> +             if (x >= b) {
>>> +                     x -= b;
>>> +                     y += m;
>>> +             }
>>> +             m >>= 2;
>>> +     }
>>> +
>>> +     return y;
>>> +}
>>> +EXPORT_SYMBOL(int_sqrt64);
>>> +#endif
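
One detail in the hunk above that may be worth double-checking is the starting mask. The kernel's fls64() returns a 1-based bit position (fls64(1) == 1, fls64(1ULL << 63) == 64), while int_sqrt() uses the 0-based __fls(). With the expression as posted, an input with bit 63 set would produce a shift count of 64, which is undefined for a 64-bit type. A minimal userspace sketch of an adjusted mask computation follows; fls64_sketch() and start_mask() are hypothetical names used only for illustration and are not part of the patch.

```c
#include <stdint.h>

/*
 * Portable stand-in for the kernel's fls64(): 1-based index of the
 * highest set bit, or 0 when x == 0.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}

	return n;
}

/* Largest power of four that is <= x; the caller must ensure x != 0. */
static uint64_t start_mask(uint64_t x)
{
	/* Convert to a 0-based bit index first, then round down to even. */
	return UINT64_C(1) << ((fls64_sketch(x) - 1) & ~1U);
}
```

Subtracting one before masking keeps the shift count in the range 0..62, so the mask never exceeds the input and never shifts out of the type.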
