Argh, this is related to the core.cpuid bug.  I really don't feel like kludging a workaround into these benchmarks.  Can you please build for 32-bit instead of 64-bit?  Since everything's in single precision, these benchmarks won't have stack alignment artifacts.
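
For reference, a 32-bit build of the five benchmarks might look roughly like the sketch below. It reuses the -O -inline -release flags mentioned later in this thread and adds dmd's -m32 switch. It assumes std.parallelism is already visible to dmd (import path and library), which is not shown, and RUN_BUILD is a hypothetical variable introduced here so that the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch: print (or, with RUN_BUILD=1, run) a 32-bit build for each benchmark.
# -m32 is dmd's 32-bit target switch; -O -inline -release come from this thread.
# Assumption: std.parallelism is already on dmd's import and link paths.

build_cmd() {
    # $1 is the benchmark name without the .d extension
    printf 'dmd -m32 -O -inline -release %s.d\n' "$1"
}

for f in euclidean matrixInversion millionSqrt parallelSort pipelining; do
    cmd=$(build_cmd "$f")
    if [ "${RUN_BUILD:-0}" = 1 ]; then
        # Build, then run the resulting binary from the current directory.
        $cmd && "./$f"
    else
        # Dry run: show the command that would be executed.
        echo "$cmd"
    fi
done
```

Setting RUN_BUILD=1 in the environment turns the dry run into an actual build and run.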

On 2/26/2011 3:01 PM, Russel Winder wrote:
On Sat, 2011-02-26 at 14:43 -0500, David Simcha wrote:

I feel like the build process for the benchmarks is so trivial that no
formal process is required.  Besides, I know nothing about build
tools because, for anything I write, I tend to keep the build process
so simple and stupid that nothing more than a 2-line script or the
builtin build system for Code::Blocks is required.

OK, I used SCons as it is the right tool for dealing with directories of
independent compilations.  

For std.parallelism, as you appear to have figured out, you need to
put it in your Phobos source folder, edit the makefile and recompile
Phobos.  Each benchmark is completely self-contained.  The way I run
them is to open them in Code::Blocks (or whatever IDE happens to be
available) and hit "build and run".  dmd -O -inline -release
someBenchmark.d && ./someBenchmark would also work fine.

I don't go anywhere near recompiling Phobos, I just use what comes in
the distribution.  I keep extra stuff in ~/lib/D and use the DPATH,
LIBPATH and LIBS capabilities.

        |> scons
        /usr/bin/python /home/Checkouts/Mercurial/SCons/bootstrap/src/script/scons.py
        scons: Reading SConscript files ...
        scons: done reading SConscript files.
        scons: Building targets ...
        dmd -I. -I/home/users/russel/lib/D -m64 -c -ofeuclidean.o euclidean.d
        gcc -o euclidean euclidean.o -L/home/users/russel/lib.Linux.x86_64 -L/home/users/russel/lib.Linux.x86_64/DMD2/lib64 -lparallelism -lphobos2 -lpthread -lm -lrt
        dmd -I. -I/home/users/russel/lib/D -m64 -c -ofmatrixInversion.o matrixInversion.d
        gcc -o matrixInversion matrixInversion.o -L/home/users/russel/lib.Linux.x86_64 -L/home/users/russel/lib.Linux.x86_64/DMD2/lib64 -lparallelism -lphobos2 -lpthread -lm -lrt
        dmd -I. -I/home/users/russel/lib/D -m64 -c -ofmillionSqrt.o millionSqrt.d
        gcc -o millionSqrt millionSqrt.o -L/home/users/russel/lib.Linux.x86_64 -L/home/users/russel/lib.Linux.x86_64/DMD2/lib64 -lparallelism -lphobos2 -lpthread -lm -lrt
        dmd -I. -I/home/users/russel/lib/D -m64 -c -ofparallelSort.o parallelSort.d
        gcc -o parallelSort parallelSort.o -L/home/users/russel/lib.Linux.x86_64 -L/home/users/russel/lib.Linux.x86_64/DMD2/lib64 -lparallelism -lphobos2 -lpthread -lm -lrt
        dmd -I. -I/home/users/russel/lib/D -m64 -c -ofpipelining.o pipelining.d
        gcc -o pipelining pipelining.o -L/home/users/russel/lib.Linux.x86_64 -L/home/users/russel/lib.Linux.x86_64/DMD2/lib64 -lparallelism -lphobos2 -lpthread -lm -lrt
        scons: done building targets.


        |> for f in euclidean matrixInversion millionSqrt parallelSort pipelining ; do echo "==========================  $f  ==============================" && $f ; done
        ==========================  euclidean  ==============================
        Serial reduce:  5182 milliseconds.
        Parallel reduce with 1 cores:  5183 milliseconds.
        ==========================  matrixInversion  ==============================
        Inverted a 256 x 256 matrix serially in 63 milliseconds.
        Inverted a 256 x 256 matrix using 1 cores in 64 milliseconds.
        ==========================  millionSqrt  ==============================
        Parallel benchmarks being done with 1 cores.
        Did serial millionSqrt in 975 milliseconds.
        Did parallel foreach millionSqrt in 1263 milliseconds.
        Did parallel map millionSqrt in 979 milliseconds.
        ==========================  parallelSort  ==============================
        Serial quick sort:  6256 milliseconds.
        Parallel quick sort:  6317 milliseconds.
        ==========================  pipelining  ==============================
        Did serial string -> float, euclid in 2606 milliseconds.
        Did parallel string -> float, euclid with 1 cores in 2576 milliseconds.


So we have the problem that the code is failing to observe that I have 8
cores on my machine :-((
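
A quick way to check whether the "1 cores" figure comes from the library rather than from the OS is to ask the OS directly. The sketch below assumes a Linux or similar POSIX system; online_cpus is a helper name made up for this example. If it prints 8 while the benchmarks report 1, the detection problem is on the library side.

```shell
#!/bin/sh
# Sketch: print the number of online logical CPUs as the OS reports it.
# getconf _NPROCESSORS_ONLN is available on Linux (glibc) and macOS;
# nproc (GNU coreutils) is used as a fallback.

online_cpus() {
    if n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) && [ -n "$n" ]; then
        echo "$n"
    elif command -v nproc >/dev/null 2>&1; then
        nproc
    else
        # Neither tool is available; report that rather than guess.
        echo "unknown"
    fi
}

echo "OS reports $(online_cpus) online logical CPUs"
```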

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos
