I've lost track of what's up and down in this, but now that I look at
this again let me throw in my two observations of stupid gcc behaviour:
For the current code, both Debian's gcc (4.7) and gcc 5.1 partially inline
_find_next_bit, namely the "if (!nbits || start >= nbits)" test. I know it
does it to
On 24.08.2015 01:53, Alexey Klimov wrote:
> Hi Cassidy,
> On Wed, Jul 29, 2015 at 11:40 PM, Cassidy Burden wrote:
> > I changed the test module to now set the entire array to all 0/1s and
> > only flip a few bits. There appears to be a performance benefit, but
> > it's only 2-3% better (if that). If the mai
Hi Cassidy,
On Wed, Jul 29, 2015 at 11:40 PM, Cassidy Burden wrote:
> I changed the test module to now set the entire array to all 0/1s and
> only flip a few bits. There appears to be a performance benefit, but
> it's only 2-3% better (if that). If the main benefit of the original
> patch was to
I changed the test module to now set the entire array to all 0/1s and
only flip a few bits. There appears to be a performance benefit, but
it's only 2-3% better (if that). If the main benefit of the original
patch was to save space then inlining definitely doesn't seem worth the
small gains in rea
On Tue, 2015-07-28 at 14:45 -0700, Andrew Morton wrote:
> On Wed, 29 Jul 2015 00:23:18 +0300 Yury wrote:
>
> > But I think, before/after for x86 is needed as well.
>
> That would be nice.
>
> > And why don't you consider '__always_inline__'? Simple inline is only a
> > hint and guarantees
On Wed, 29 Jul 2015 00:23:18 +0300 Yury wrote:
> But I think, before/after for x86 is needed as well.
That would be nice.
> And why don't you consider '__always_inline__'? Simple inline is only a
> hint and guarantees nothing.
Yup. My x86_64 compiler just ignores the "inline". When I use
On 29.07.2015 00:23, Yury wrote:
> On 28.07.2015 22:09, Cassidy Burden wrote:
> > I've tested Yury Norov's find_bit reimplementation with the test_find_bit
> > module (https://lkml.org/lkml/2015/3/8/141) and measured about 35-40%
> > performance degradation on arm64 3.18 run with fixed CPU frequency.
> > The pe
On 28.07.2015 22:09, Cassidy Burden wrote:
> I've tested Yury Norov's find_bit reimplementation with the test_find_bit
> module (https://lkml.org/lkml/2015/3/8/141) and measured about 35-40%
> performance degradation on arm64 3.18 run with fixed CPU frequency.
> The performance degradation appears to be
I've tested Yury Norov's find_bit reimplementation with the test_find_bit
module (https://lkml.org/lkml/2015/3/8/141) and measured about 35-40%
performance degradation on arm64 3.18 run with fixed CPU frequency.
The performance degradation appears to be caused by the
helper function _find_next_bit
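
To make the inlining question in this thread concrete, here is a minimal
sketch of a shared helper that is forced inline into two thin wrappers. It
is not the kernel code: the names sketch_find_next, sketch_find_next_bit,
sketch_find_next_zero_bit and SKETCH_ALWAYS_INLINE are made up for this
illustration, the loop is a simplified stand-in, and it assumes GCC or Clang
(for the always_inline attribute and __builtin_ctzl). Only the
"if (!nbits || start >= nbits)" test is taken from the discussion above.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical macro for this sketch. Plain "inline" is only a hint;
 * the always_inline attribute asks GCC/Clang to inline regardless. */
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))

/*
 * Shared helper (illustrative only). 'invert' is 0UL to search for set
 * bits and ~0UL to search for clear bits. Returns nbits if nothing is found.
 */
static SKETCH_ALWAYS_INLINE size_t
sketch_find_next(const unsigned long *addr, size_t nbits, size_t start,
                 unsigned long invert)
{
    unsigned long tmp;

    if (!nbits || start >= nbits)
        return nbits;

    tmp = addr[start / BITS_PER_LONG] ^ invert;
    /* Ignore bits below 'start' in the first word. */
    tmp &= ~0UL << (start % BITS_PER_LONG);
    /* Round 'start' down to the beginning of its word. */
    start -= start % BITS_PER_LONG;

    while (!tmp) {
        start += BITS_PER_LONG;
        if (start >= nbits)
            return nbits;
        tmp = addr[start / BITS_PER_LONG] ^ invert;
    }

    /* Index of the lowest set bit in the current word. */
    start += (size_t)__builtin_ctzl(tmp);
    return start < nbits ? start : nbits;
}

/* Thin wrappers: with the helper forced inline, each gets its own
 * specialised copy of the loop, with 'invert' folded to a constant. */
size_t sketch_find_next_bit(const unsigned long *addr, size_t nbits,
                            size_t start)
{
    return sketch_find_next(addr, nbits, start, 0UL);
}

size_t sketch_find_next_zero_bit(const unsigned long *addr, size_t nbits,
                                 size_t start)
{
    return sketch_find_next(addr, nbits, start, ~0UL);
}
```

Comparing the generated assembly with and without the attribute (for
example with "gcc -O2 -S") is one way to see whether the compiler honours
the plain "inline" hint on a given target. Forcing the helper inline
duplicates the loop in each wrapper, which is the code-size cost weighed
against the speed gain in this thread.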