On Fri, Dec 5, 2014 at 1:03 PM, Arnd Bergmann wrote:
> On Friday 05 December 2014 13:00:22 Nicolas Pitre wrote:
>>
>> BTW this is worth applying despite the ongoing discussion with Arnd
>> on a separate optimization.
>
> Agreed
>
>> On Wed, 3 Dec 2014, Nicolas Pitre wrote:
>>
>> > At least on ARM
On Friday 05 December 2014 13:00:22 Nicolas Pitre wrote:
>
> BTW this is worth applying despite the ongoing discussion with Arnd
> on a separate optimization.
Agreed
> On Wed, 3 Dec 2014, Nicolas Pitre wrote:
>
> > At least on ARM, do_div() is optimized to turn constant divisors into
> > an i
BTW this is worth applying despite the ongoing discussion with Arnd
on a separate optimization.
On Wed, 3 Dec 2014, Nicolas Pitre wrote:
> At least on ARM, do_div() is optimized to turn constant divisors into
> an inline multiplication by the reciprocal value at compile time.
> However this op
On Fri, 5 Dec 2014, Arnd Bergmann wrote:
> >
> > That, too, risks overflowing.
> >
> > Let's say x_lo = 0xffffffff and x_hi = 0xffffffff. You get:
> >
> > 0xffffffff * 0x83126e97 -> 0x83126e967ced9169
> > 0xffffffff * 0x8d4fdf3b -> 0x8d4fdf3a72b020c5
> >
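
The overflow described above can be checked in plain userspace C. This is only a sketch: the helper name cross_terms_overflow() and the Y_HI/Y_LO macros are made up for illustration, and the worst-case inputs (both halves 0xffffffff) are an assumption that is consistent with the two products quoted above.

```c
#include <stdint.h>

/* Reciprocal halves as they appear in the products quoted above. */
#define Y_HI 0x83126e97u
#define Y_LO 0x8d4fdf3bu

/*
 * Hypothetical helper: returns 1 if
 *   (u64)x_lo * Y_HI + (u64)x_hi * Y_LO
 * does not fit in 64 bits. Unsigned addition wraps around, so a
 * wrapped sum ends up smaller than either operand.
 */
static int cross_terms_overflow(uint32_t x_lo, uint32_t x_hi)
{
	uint64_t a = (uint64_t)x_lo * Y_HI;
	uint64_t b = (uint64_t)x_hi * Y_LO;

	return a + b < a;
}
```

With both halves at 0xffffffff the two cross terms are each above 2^63, so their sum needs 65 bits and the carry out of bit 63 is lost unless it is tracked separately.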
On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> > res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
>
> That, too, risks overflowing.
>
> Let's say x_lo = 0xffffffff and x_hi = 0xffffffff. You get:
>
> 0xffffffff * 0x83126e97 -> 0x83126e967ced9169
> 0xffffffff *
On Fri, 5 Dec 2014, pang.xun...@zte.com.cn wrote:
> Nicolas,
>
> On Thursday 04 December 2014 15:23:37: Nicolas Pitre wrote:
> > Nicolas Pitre
> >
> > u64 ktime_to_us(ktime_t kt)
> > {
> > 	u64 ns = ktime_to_ns(kt);
> > 	u32 x_lo, x_hi, y_lo, y_hi;
> > 	u64 res, carry;
> >
> > 	x_hi =
On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> On Thursday 04 December 2014 08:46:27 Nicolas Pitre wrote:
> > On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> > Note the above code is for 32-bit architectures that support a 32x32=64
> > bit multiply instruction. And even then, what kills performance is t
On Thursday 04 December 2014 08:46:27 Nicolas Pitre wrote:
> On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> Note the above code is for 32-bit architectures that support a 32x32=64
> bit multiply instruction. And even then, what kills performance is the
> inability to efficiently deal with carry bi
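
For reference, here is one common way to get the high 64 bits of a 64x64 multiplication out of four 32x32=64 products while keeping every carry. It is a generic userspace sketch, not the code under discussion in this thread; the name mul_u64_hi() is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: high 64 bits of the 128-bit product x * y,
 * built from four 32x32=64 partial products.
 */
static uint64_t mul_u64_hi(uint64_t x, uint64_t y)
{
	uint32_t x_lo = (uint32_t)x, x_hi = (uint32_t)(x >> 32);
	uint32_t y_lo = (uint32_t)y, y_hi = (uint32_t)(y >> 32);

	uint64_t lo_lo = (uint64_t)x_lo * y_lo;
	uint64_t lo_hi = (uint64_t)x_lo * y_hi;
	uint64_t hi_lo = (uint64_t)x_hi * y_lo;
	uint64_t hi_hi = (uint64_t)x_hi * y_hi;

	/*
	 * Middle column: three values below 2^32 each, so this sum
	 * cannot wrap. Its upper half is the carry into the high word.
	 */
	uint64_t mid = (lo_lo >> 32) + (uint32_t)lo_hi + (uint32_t)hi_lo;

	return hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32);
}
```

Splitting the cross terms into 32-bit halves before adding avoids the 65-bit intermediate, at the cost of extra shifts and additions, which is the carry handling overhead mentioned above.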
On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> On Thursday 04 December 2014 02:23:37 Nicolas Pitre wrote:
> > On Wed, 3 Dec 2014, Arnd Bergmann wrote:
> >
> > > On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > > > At least on ARM, do_div() is optimized to turn constant divisors into
>
On Thursday 04 December 2014 02:23:37 Nicolas Pitre wrote:
> On Wed, 3 Dec 2014, Arnd Bergmann wrote:
>
> > On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > > At least on ARM, do_div() is optimized to turn constant divisors into
> > > an inline multiplication by the reciprocal value
On Wed, 3 Dec 2014, Arnd Bergmann wrote:
> On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > At least on ARM, do_div() is optimized to turn constant divisors into
> > an inline multiplication by the reciprocal value at compile time.
> > However this optimization is missed entirely w
On Wed, 3 Dec 2014, Robert Jarzmik wrote:
> Nicolas Pitre writes:
>
> > Let ktime_divns() use do_div() inline whenever the divisor is constant
> > and small enough. This will make things like ktime_to_us() and
> > ktime_to_ms() much faster.
>
> Hi Nicolas,
>
> I suppose the "small enough" is
Nicolas Pitre writes:
> Let ktime_divns() use do_div() inline whenever the divisor is constant
> and small enough. This will make things like ktime_to_us() and
> ktime_to_ms() much faster.
Hi Nicolas,
I suppose the "small enough" is linked to the "!(div >> 32)" in your patch. Can
I have the
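
To make the "constant and small enough" condition concrete, here is a userspace sketch of that kind of dispatch. It is not the patch itself: sketch_ktime_divns(), sketch_do_div() and sketch_slow_divns() are hypothetical stand-ins, plain C division replaces the real do_div() and out-of-line code, and __builtin_constant_p() is a GCC/Clang extension.

```c
#include <stdint.h>

/* Hypothetical stand-in for do_div(): 64-bit dividend, 32-bit divisor. */
static uint64_t sketch_do_div(uint64_t n, uint32_t base)
{
	return n / base;
}

/* Hypothetical stand-in for the slow out-of-line 64/64 division. */
static int64_t sketch_slow_divns(int64_t ns, int64_t div)
{
	return ns / div;
}

static inline int64_t sketch_ktime_divns(int64_t ns, int64_t div)
{
	/*
	 * !(div >> 32) holds only when the divisor fits in 32 bits,
	 * which is all a do_div()-style helper accepts. With GCC and
	 * Clang a negative divisor shifts to -1 and takes the slow path.
	 */
	if (__builtin_constant_p(div) && !(div >> 32)) {
		uint64_t mag = ns < 0 ? 0 - (uint64_t)ns : (uint64_t)ns;
		uint64_t q = sketch_do_div(mag, (uint32_t)div);

		return (int64_t)(ns < 0 ? 0 - q : q);
	}
	return sketch_slow_divns(ns, div);
}
```

Both paths return the same quotient; the check only decides whether the compiler gets a chance to see a constant 32-bit divisor.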
On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> At least on ARM, do_div() is optimized to turn constant divisors into
> an inline multiplication by the reciprocal value at compile time.
> However this optimization is missed entirely whenever ktime_divns() is
> used and the slow out-o
At least on ARM, do_div() is optimized to turn constant divisors into
an inline multiplication by the reciprocal value at compile time.
However this optimization is missed entirely whenever ktime_divns() is
used and the slow out-of-line division code is used all the time.
Let ktime_divns() use do
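
As an illustration of multiplying by a reciprocal instead of dividing, here is a userspace sketch for a divisor of 1000. It is not the ARM do_div() implementation. The constant is floor(2^73 / 1000), whose two 32-bit halves match the values quoted elsewhere in this thread; the function name div1000() is hypothetical, and unsigned __int128 is a GCC/Clang extension on 64-bit targets.

```c
#include <stdint.h>

/* floor(2^73 / 1000); note that RECIP_1000 * 1000 == 2^73 - 392. */
#define RECIP_1000 0x83126e978d4fdf3bULL

/*
 * Hypothetical helper: n / 1000 without a division instruction.
 * Because the reciprocal is rounded down, it is added once as a
 * bias, i.e. the quotient is ((n + 1) * RECIP_1000) >> 73. The
 * rounding error of 392 is below 2^9 = 512, which keeps the result
 * exact for every 64-bit n.
 */
static uint64_t div1000(uint64_t n)
{
	unsigned __int128 p = (unsigned __int128)n * RECIP_1000 + RECIP_1000;

	return (uint64_t)(p >> 73);
}
```

On a 32-bit machine the 128-bit product has to be assembled from 32x32=64 multiplications, which is where the carry handling cost comes from.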