I wrote:
> I'm still interested in the idea of doing a manual unroll instead of
> relying on a compiler-specific feature. However, some quick testing
> didn't find an unrolling that helps much.
Hmm, actually this seems to work ok:
idx++;
size >>= 1;
if (size != 0)
{
idx++;
size >>= 1;
if (size != 0)
{
idx++;
size >>= 1;
if (size != 0)
{
idx++;
size >>= 1;
while (size != 0)
{
idx++;
size >>= 1;
}
}
}
}
(this is with the initial "if (size > (1 << ALLOC_MINBITS))" so that
we know the starting value is nonzero)
This seems to be about a wash or a small gain on x86_64, but on my
PPC Mac laptop it's very nearly identical in speed to the __builtin_clz
code. I also see a speedup on HPPA, for which my gcc is too old to
know about __builtin_clz.
Anyone want to see if they can beat that? Some testing on other
architectures would help too.
regards, tom lane
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers