On Sat, Feb 12, 2005 at 11:36:27AM -0800, Steven M. Schultz wrote:
> On Sat, 12 Feb 2005, Roine Gustafsson wrote:
> > It's an urban myth that 64bit is faster than 32bit, like people assume
> > a 2GHz computer is twice as fast as a 1GHz computer.
>
> It's also an urban myth that 64bit is slower than 32bit :)
Not automatically, but on MacOS you can easily run into trouble.
Anecdote: I took a benchmark of my own (mostly loops of integer math
and bitwise logical operations) and put it on a G5 XServe. This code
made use of some 64-bit integers. I compiled it with gcc for the
generic 32-bit PowerPC target and it ran quite nicely. As I started
adding flags to enable the use of the G5's 64-bit instructions, it got
slower. The more optimization flags I added, the worse it got.
Here's why; consider this trivial bit of code to add two 64-bit
integers:
    #include <stdint.h>
    uint64_t foo(uint64_t const x,uint64_t const y) { return x + y; }
First a generic 32-bit compilation and the resulting assembly:
    gcc -O3
    _foo:
            addc r4,r4,r6
            adde r3,r3,r5
            blr
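For anyone who doesn't read PowerPC assembly: the pair above is a plain
carry chain. addc adds the low words and records the carry, and adde adds
the high words plus that carry. Here is a rough C model of it, assuming
(as the listing suggests) that each 64-bit argument arrives as a high/low
pair in r3:r4 and r5:r6. The function and parameter names are mine and
purely illustrative; nothing here is gcc output.

```c
#include <stdint.h>

/* Hypothetical model of the addc/adde pair; names are illustrative only. */
uint64_t add64_carry_chain(uint32_t x_hi, uint32_t x_lo,
                           uint32_t y_hi, uint32_t y_lo)
{
    uint32_t lo = x_lo + y_lo;          /* addc r4,r4,r6: low words, sets carry */
    uint32_t carry = (lo < x_lo);       /* unsigned wraparound means a carry out */
    uint32_t hi = x_hi + y_hi + carry;  /* adde r3,r3,r5: high words plus carry */
    return ((uint64_t)hi << 32) | lo;   /* result is returned as r3 (high), r4 (low) */
}
```

Calling add64_carry_chain(0, 0xFFFFFFFF, 0, 1) carries out of the low word
and gives 0x100000000, which is the case the carry bit exists for.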
Now let's turn on all of the options for G5 support and see what
happens:
    gcc -O3 -mcpu=970 -mtune=970 -mpowerpc64 -mpowerpc-gpopt -force_cpusubtype_ALL
    _foo:
            stw r3,-32(r1)
            stw r4,-28(r1)
            stw r5,-24(r1)
            stw r6,-20(r1)
            ld r4,-32(r1)
            ld r3,-24(r1)
            add r2,r4,r3
            mr r4,r2
            srdi r3,r2,32
            blr
GAH! The function's arguments are already in registers, but this code
writes them to RAM and then reads them back before using them. I
don't know enough about the G5's pipelining and cache performance to
say how bad this will be, but it's certainly going to be noticeably
slower than the non-G5 version.
My guess is that since the current MacOS has no 64-bit support in the
ABI, all function arguments get broken into 32-bit values before being
passed. gcc wants to get the values into 64-bit registers so that it
can do a single add instruction, but for whatever reason it believes
the best approach to doing this is to use 32-bit stores and 64-bit
loads.
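To make that guess concrete, here is a rough C model of what the G5
listing appears to be doing, under the same assumption that the arguments
arrive as 32-bit halves. The stw/ld round trip through the stack is
replaced by shifts so the sketch does not depend on host byte order. The
struct and function names are hypothetical and only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of the -mpowerpc64 listing; names are illustrative only. */
typedef struct { uint32_t hi, lo; } reg_pair;

reg_pair add64_via_wide_regs(uint32_t x_hi, uint32_t x_lo,
                             uint32_t y_hi, uint32_t y_lo)
{
    /* Four stw then two ld: on big-endian PowerPC, storing hi then lo at
       adjacent addresses and loading 8 bytes yields (hi << 32) | lo.
       The shifts below stand in for that trip through memory. */
    uint64_t x = ((uint64_t)x_hi << 32) | x_lo;
    uint64_t y = ((uint64_t)y_hi << 32) | y_lo;

    uint64_t sum = x + y;           /* add r2,r4,r3: the single 64-bit add */

    reg_pair r;
    r.lo = (uint32_t)sum;           /* mr r4,r2: a 32-bit caller sees the low word */
    r.hi = (uint32_t)(sum >> 32);   /* srdi r3,r2,32: high word shifted down */
    return r;
}
```

The arithmetic result is the same as the two-instruction carry chain; the
difference is that the halves have to be glued together first and split
apart again afterwards, and gcc does the gluing through memory.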
So the combination of the limits in the Apple ABI and gcc's crazy
implementation leads to the resulting code being much worse when
the 64-bit optimizations are turned on.
This is with Apple's supplied/modified gcc 3.3. I also tried a
vanilla gcc 3.4.1, but it generates even more instructions for the
64-bit case and additionally seemed to have some incompatibilities
with Apple's gcc wrt structure field layout. Perhaps the commercial
compilers will do better.
Hopefully the next MacOS with a 64-bit userland will fix all this.
-Dave Dodge
_______________________________________________
Mjpeg-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mjpeg-users