On Tue, Jul 29, 2014 at 08:56:41AM +0800, Yuyang Du wrote:
> On Mon, Jul 28, 2014 at 12:48:37PM +0200, Peter Zijlstra wrote:
> > > +static __always_inline u64 decay_load(u64 val, u64 n)
> > > +{
> > > + if (likely(val <= UINT_MAX))
> > > +         val = decay_load32(val, n);
> > > + else {
> > > +         val *= (u32)decay_load32(1 << 15, n);
> > > +         val >>= 15;
> > > + }
> > > +
> > > + return val;
> > > +}
> > 
> > Please just use mul_u64_u32_shr().
> > 
> > /me continues reading the rest of it..
> 
> Good. Since 128bit is considered in mul_u64_u32_shr, load_sum can
> afford more tasks :)

96bit actually. On 64bit platforms it uses a single 64x64->128 mult, but
on 32bit it only uses two 32x32->64 mults, which isn't sufficient for 128
as that would require four.

It also reduces to 1 32x32->64 mult (on 32bit) in case val fits in
32bit.

Therefore it's as efficient as your code, but more accurate because it
doesn't lose bits in the full case (val bigger than 32bit).
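
To make the two-mult point concrete, here is a rough userspace sketch of
what the 32bit fallback does. It is written from memory rather than copied
from the kernel source; the _sketch name, the uint64_t/uint32_t types
standing in for u64/u32, and the assert() on shift are assumptions of this
example only.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the generic (non-128bit) path: computes (a * mul) >> shift. */
static inline uint64_t mul_u64_u32_shr_sketch(uint64_t a, uint32_t mul,
					      unsigned int shift)
{
	uint32_t al = (uint32_t)a;		/* low 32 bits of a */
	uint32_t ah = (uint32_t)(a >> 32);	/* high 32 bits of a */
	uint64_t ret;

	/* Assumption of this sketch: the split only works for shift <= 32. */
	assert(shift <= 32);

	/* First 32x32->64 mult; the only one needed when a fits in 32 bits. */
	ret = ((uint64_t)al * mul) >> shift;

	/* Second 32x32->64 mult, only when a has bits above bit 31. */
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);

	return ret;
}
```

With a = 1ULL << 50, mul = 1 << 15 and shift = 15 this returns 1ULL << 50,
whereas a plain 64bit "val *= mul; val >>= 15" has already wrapped to 0
before the shift.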