Richard O'Keefe wrote:
On 29 Oct 2008, at 8:31 am, Andrew Coppin wrote:
Hi guys.
This isn't specifically to do with Haskell, but... does anybody have
any idea roughly how fast various CPU operations are?
For example, is integer arithmetic faster or slower than
floating-point? Is addition faster or slower than multiplication? How
much slower are the trigonometric functions? etc. Does using 8-bit
integers make arithmetic any faster than using wider values?
Does anybody have a good resource for this kind of information?
lmbench3
http://sourceforge.net/projects/lmbench
Building and running it will give you some answers for your
machine.
It's hard to answer your questions as they are posed,
because of the difference between throughput and latency.
For example, you might be able to start a new multiplication
on every cycle on some machine (throughput is 1 multiply
per cycle) but the result of any one of
them might not be available for twelve cycles (latency is
12 cycles per multiply).
Rough guesses:
integer adds, subtracts, and compares are fast,
integer multiplies and divides are much slower,
slow enough that compilers go to some trouble to
do something else when multiplying or dividing
by a constant.
The speed of the trig functions depends on how much hardware
support you have and on whether your library writer favoured
speed or accuracy (especially accuracy over a wide range).
I don't think lmbench measures these, but it wouldn't be hard
to add something suitable.
Using 8-bit integers is unlikely to make your program *directly*
faster unless you are using a compiler which is smart enough to
exploit SIMD instructions without programmer-written hints. The
Intel C compiler _is_ that smart. In my code I have found it to
be extremely good at vectorising trivial loops, with no actual
effect on the run-time of my programs. However, code written
with the intention of exploiting that would be different.
The one thing that lmbench3 will tell you that will drop your
jaw and widen your eyes is the bit at the end where it tells
you about memory speed. Main memory is exceeding slow compared
with cache. This is why I said that switching over to 8-bit
integers might not make your program *directly* faster; if you
have a lot of data which you can pack tightly, so that as 32-bit
words it would not fit into L2 cache but as bytes it does, you
may well get a very useful speedup from that.
Or you may not. There is no substitute for measurement.
OK, well thanks for the info.
I'm not really interested in getting down to instruction-level
scheduling. I just want to know, at a high level, will implementing my
algorithm in integer arithmetic rather than floating-point make a
measurable difference to overall program speed.
Actually, thinking about it, I suppose the killer question is this: Does
changing between different numer representations make any measurable
performance difference at all, or are Haskell programs dominated by
cache misses?
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe