Greg Stark <gsst...@mit.edu> writes:
> Well it was a bit of a pain but I filled an array with (1/1000 scaled
> down) values and then permuted them. I also went ahead and set the
> low-order bits to random values since the lookup table based algorithm
> might be affected by it.

> The results are a bit disappointing on my machine, only the CLZ and
> lookup table come out significantly ahead:

>                  clz 1.530s
>         lookup table 1.720s
>           float hack 4.424s
>             unrolled 5.280s
>               normal 5.369s

It strikes me that we could assume that the values are < 64K and hence
drop the first case in the lookup table code.  I've added that variant
and get these results on my machines:

x86_64 (Xeon):

                 clz 15.357s
        lookup table 16.582s
  small lookup table 16.705s
          float hack 25.138s
            unrolled 64.630s
              normal 79.025s

PPC:

                 clz 3.842s
        lookup table 7.298s
  small lookup table 8.799s
          float hack 19.418s
            unrolled 7.656s
              normal 8.949s

HPPA:

                 clz (n/a)
        lookup table 11.515s
  small lookup table 10.803s
          float hack 16.502s
            unrolled 17.632s
              normal 19.754s

Not sure why the "small lookup table" variant actually seems slower
than the original on two of these machines (x86_64 and PPC); it can
hardly be slower in reality since it's strictly less code.  Maybe some
weird code-alignment issue?
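
For readers without the test program at hand, here is a minimal sketch of
what the two lookup-table variants might look like. The benchmark source is
not included in this message, so the sketch assumes the usual 256-entry
byte-table approach to floor(log2(x)); the names LogTable256, log2_lookup
and log2_lookup_small are illustrative, not taken from the actual test code.

```c
#include <assert.h>
#include <stdint.h>

/* LogTable256[i] = floor(log2(i)) for i in 1..255; entry 0 is a placeholder. */
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
static const unsigned char LogTable256[256] = {
    0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
};
#undef LT

/* Full variant: handles any nonzero 32-bit value. */
static int
log2_lookup(uint32_t x)
{
    uint32_t t = x >> 16;

    if (t != 0)     /* the "first case": a bit is set in the upper 16 bits */
        return (t >> 8) ? 24 + LogTable256[t >> 8] : 16 + LogTable256[t];
    return (x >> 8) ? 8 + LogTable256[x >> 8] : LogTable256[x];
}

/* "Small" variant: assumes x < 64K, so the upper-half test is dropped. */
static int
log2_lookup_small(uint32_t x)
{
    assert(x < 65536);
    return (x >> 8) ? 8 + LogTable256[x >> 8] : LogTable256[x];
}
```

Under these assumptions the small variant is the full one minus a test and
a branch, which is the sense in which it is strictly less code.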

It also seems weird that the x86_64 is now showing a much bigger gap
between clz and "normal" than before.  I don't see how branch prediction
would do much for the "normal" code.
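
As a point of comparison, the two extremes of the table can be sketched as
follows. This assumes the "normal" variant is a plain shift-and-count loop
and the "clz" variant uses the GCC/Clang builtin __builtin_clz on a 32-bit
unsigned int; neither is quoted from the benchmark, and the function names
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed "normal" form: shift right until zero, counting iterations.
 * The loop trip count depends on the magnitude of the input. */
static int
log2_loop(uint32_t x)
{
    int n = -1;

    while (x != 0)
    {
        n++;
        x >>= 1;
    }
    return n;
}

/* Assumed "clz" form: count leading zeros, no loop.
 * __builtin_clz is undefined for 0, so the caller must pass x != 0. */
static int
log2_clz(uint32_t x)
{
    assert(x != 0);
    return 31 - __builtin_clz(x);
}
```

Both return floor(log2(x)) for nonzero x; they differ only in how much
data-dependent control flow they execute to get there.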

                        regards, tom lane
