Re: D compiler benchmarks

2009-03-09 Thread Robert Clipsham

bearophile wrote:

- Having a C or C++ (or something better, where necessary) baseline reference 
can be very useful for knowing how far the D code is from the fastest non-ASM 
versions.


This seems to be quite a popular request; I'll do this at some point.


- you can improve the graphs on the dbench.octarineparrot.com page so they are 
easier to read.


How would you like them improved? I just copied and pasted some CSS to 
generate them; it can easily be tweaked to be easier to read.



- Tune all your tests so they run for 6-15 seconds or more. If they run for 
less than 3 seconds there's too much measurement noise.


Sounds like a good plan; I'll do that the next time I run them.


- Taking the average of three runs isn't that good, but this is a tricky 
topic... Take the minimum for now.


Someone has already pointed this out, and I plan to do it next time. By 
minimum do you mean the fastest or slowest result?



- With my browser the label binarytrees2 is misplaced.


All the benchmarks on the right are slightly misplaced; I can't figure 
it out. I'll try and tweak it so it fits better once I get your input on 
how to improve the graphs.



- what's the difference between nsievebits2 and nsievebits? And nbody and 
nbody2? Reading the source is fine, but a small note would help too.


As you probably know, the tests are just from the shootout. The number 
is the version number of the test; I picked the tests that performed the 
best when there was more than one.



- For both GDC and LDC it would be useful to note which backends they use (for 
example: ldc version r1050 using LLVM 2.5).


I'll do that with the next update.


- Note that currently LDC goes up only to -O3.


I thought -O4/-O5 introduced linker optimisations? Either way it doesn't 
matter; -O5 will perform all the optimisations available in the 
current version of ldc.


Re: D compiler benchmarks

2009-03-09 Thread Andrei Alexandrescu

Robert Clipsham wrote:
Someone has already pointed this out, and I plan to do it next time. By 
minimum do you mean the fastest or slowest result?


Fastest result.

Andrei


Re: D compiler benchmarks

2009-03-09 Thread bearophile
Robert Clipsham:
 How would you like them improved?

In any way that lets me read them easily and doesn't make Tufte cry.


 By minimum do you mean the fastest or slowest result?

Where do the shorter and longer timings come from? Think a bit about that.
(The answer is the minimum, but you have to know why.)

Bye,
bearophile


Re: D compiler benchmarks

2009-03-08 Thread Robert Clipsham

Georg Wrede wrote:

Robert Clipsham wrote:

Hi all,

I have set up some benchmarks for dmd, ldc and gdc at 
http://dbench.octarineparrot.com/.


There are currently only 6 tests, all from 
http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is 
not great enough to port the others to tango (I've chosen tango as ldc 
does not currently support phobos, so it makes sense to choose tango as 
all compilers support it). If you would like to contribute new tests 
or improve on the current ones, let me know and I'll include them next 
time I run them.


All source code can be found at 
http://hg.octarineparrot.com/dbench/file/tip.


Let me know if you have any ideas for how I can improve the 
benchmarks, I currently plan to add compile times, size of the final 
executable and memory usage (if anyone knows an easy way to get the 
memory usage of a process in D, let me know :D).


The first run should not be included in the average.



Could you explain your reasoning for this? Personally, I can't see why it 
shouldn't be included.


Re: D compiler benchmarks

2009-03-08 Thread Jason House
Robert Clipsham Wrote:

 Hi all,
 
 I have set up some benchmarks for dmd, ldc and gdc at 
 http://dbench.octarineparrot.com/.
 
 There are currently only 6 tests, all from 
 http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is 
 not great enough to port the others to tango (I've chosen tango as ldc 
 does not currently support phobos, so it makes sense to choose tango as 
 all compilers support it). If you would like to contribute new tests or 
 improve on the current ones, let me know and I'll include them next time 
 I run them.
 
 All source code can be found at 
 http://hg.octarineparrot.com/dbench/file/tip.
 
 Let me know if you have any ideas for how I can improve the benchmarks, 
 I currently plan to add compile times, size of the final executable and 
 memory usage (if anyone knows an easy way to get the memory usage of a 
 process in D, let me know :D).

I don't think it's proper to limit solutions to either Phobos or Tango, or 
either D1 or D2. Why not include all mixes of standard libraries, compilers, 
and major D versions?

I've always heard Tango is faster... Let's see proof!
Similarly, D2 aims to do multithreading better. I'd love to see performance and 
code differences between D1 and D2.


Re: D compiler benchmarks

2009-03-08 Thread Robert Clipsham

Jason House wrote:

I don't think it's proper to limit solutions to either Phobos or Tango, or 
either D1 or D2. Why not include all mixes of standard libraries, compilers, 
and major D versions?

I've always heard Tango is faster... Let's see proof!
Similarly, D2 aims to do multithreading better. I'd love to see performance and 
code differences between D1 and D2.


These benchmarks are designed purely to test the compilers, not the 
libraries. I agree that it might be interesting to see benchmarks 
between tango and phobos, I might set some up at some point. I know 
there are already some benchmarks up for XML performance of 
tango/phobos/other xml libraries at http://dotnot.org/, as well as some 
tests showing performance of the GC at 
http://www.dsource.org/projects/tango/wiki/GCBenchmark. Neither of these 
is up to date or tests the full extent of the libraries, but they do show 
some difference in performance. As I stated in my post, I chose tango 
purely because ldc does not currently support phobos. The choice of 
library should not affect performance as all benchmarks use stdc for any 
external functions.


I will not be setting up benchmarks for D2 yet, as there is currently 
only one D2 compiler and it is in alpha. When there are multiple D2 
compilers, I will set up some more benchmarks for them. Similarly when 
D2 moves out of alpha I will happily put it against D1 if there is demand.




Re: D compiler benchmarks

2009-03-08 Thread Frank Benoit
Robert Clipsham schrieb:
 Georg Wrede wrote:
 Robert Clipsham wrote:
 Hi all,

 I have set up some benchmarks for dmd, ldc and gdc at
 http://dbench.octarineparrot.com/.

 There are currently only 6 tests, all from
 http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos
 is not great enough to port the others to tango (I've chosen tango as
 ldc does not currently support phobos, so it makes sense to choose
 tango as all compilers support it). If you would like to contribute
 new tests or improve on the current ones, let me know and I'll include
 them next time I run them.

 All source code can be found at
 http://hg.octarineparrot.com/dbench/file/tip.

 Let me know if you have any ideas for how I can improve the
 benchmarks, I currently plan to add compile times, size of the final
 executable and memory usage (if anyone knows an easy way to get the
 memory usage of a process in D, let me know :D).

 The first run should not be included in the average.

 
 Could you explain your reasoning for this? I can't see why it shouldn't
 be included personally.

Filling of the disk and memory caches. That is why the first run has a
different timing from the other runs.



Re: D compiler benchmarks

2009-03-08 Thread Georg Wrede

Robert Clipsham wrote:

Georg Wrede wrote:

Robert Clipsham wrote:

Hi all,

I have set up some benchmarks for dmd, ldc and gdc at 
http://dbench.octarineparrot.com/.


There are currently only 6 tests, all from 
http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos 
is not great enough to port the others to tango (I've chosen tango as 
ldc does not currently support phobos, so it makes sense to choose 
tango as all compilers support it). If you would like to contribute 
new tests or improve on the current ones, let me know and I'll include 
them next time I run them.


All source code can be found at 
http://hg.octarineparrot.com/dbench/file/tip.


Let me know if you have any ideas for how I can improve the 
benchmarks, I currently plan to add compile times, size of the final 
executable and memory usage (if anyone knows an easy way to get the 
memory usage of a process in D, let me know :D).


The first run should not be included in the average.


Could you explain your reasoning for this? I can't see why it shouldn't 
be included personally.


Suppose you have run the same program very recently before the test. 
Then the executable will be in memory already, any other files it may 
want to access are in memory too.


This makes execution much faster than if it were the first time ever 
this program is run.


If things were deterministic, then you wouldn't run several times and 
average the results, right?


Re: D compiler benchmarks

2009-03-08 Thread Bill Baxter
On Mon, Mar 9, 2009 at 3:15 AM, Georg Wrede georg.wr...@iki.fi wrote:
 Robert Clipsham wrote:

 Georg Wrede wrote:

 Robert Clipsham wrote:

 Hi all,

 I have set up some benchmarks for dmd, ldc and gdc at
 http://dbench.octarineparrot.com/.

 There are currently only 6 tests, all from
 http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is not
 great enough to port the others to tango (I've chosen tango as ldc does not
 currently support phobos, so it makes sense to choose tango as all compilers
 support it). If you would like to contribute new tests or improve on the
 current ones, let me know and I'll include them next time I run them.

 All source code can be found at
 http://hg.octarineparrot.com/dbench/file/tip.

 Let me know if you have any ideas for how I can improve the benchmarks,
 I currently plan to add compile times, size of the final executable and
 memory usage (if anyone knows an easy way to get the memory usage of a
 process in D, let me know :D).

 The first run should not be included in the average.

 Could you explain your reasoning for this? I can't see why it shouldn't be
 included personally.

 Suppose you have run the same program very recently before the test. Then
 the executable will be in memory already, any other files it may want to
 access are in memory too.

 This makes execution much faster than if it were the first time ever this
 program is run.

 If things were deterministic, then you wouldn't run several times and
 average the results, right?

Also, I think standard practice for benchmarks is not to average but to
take the minimum time.
To the extent that things are not deterministic, it is generally
because of factors outside your program's control -- a virtual memory
page fault kicking in, some other process stealing cycles, etc. Put
another way, there is no way for the measured run time of your program
to come out artificially too low, but there are lots of ways it could
come out too high. The reason you average measurements in other
scenarios is an expectation that the measurements form a normal
distribution around the true value. That is not the case for
measurements of computer program running times: measurements will
basically always be higher than the true intrinsic run time of your
program.

--bb
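Bill's argument can be sketched concretely. Below is a minimal Python illustration of the measurement policy (the benchmarks themselves are D, so this is purely a sketch, with a hypothetical `workload` callable standing in for a compiled benchmark):

```python
import time

def best_time(workload, runs=5):
    """Run `workload` several times and keep the minimum wall time.

    Noise (page faults, other processes stealing cycles) can only
    add time, never subtract it, so the smallest observation is the
    one least contaminated by the environment.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Hypothetical workload standing in for a benchmark binary:
fastest = best_time(lambda: sum(range(100_000)))
```

With an averaging policy one outlier (say, a page fault on run 3) pulls the result up; the minimum simply ignores it.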


Re: D compiler benchmarks

2009-03-08 Thread Robert Clipsham

Bill Baxter wrote:

On Mon, Mar 9, 2009 at 3:15 AM, Georg Wrede georg.wr...@iki.fi wrote:

Robert Clipsham wrote:

Georg Wrede wrote:

Robert Clipsham wrote:

Hi all,

I have set up some benchmarks for dmd, ldc and gdc at
http://dbench.octarineparrot.com/.

There are currently only 6 tests, all from
http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is not
great enough to port the others to tango (I've chosen tango as ldc does not
currently support phobos, so it makes sense to choose tango as all compilers
support it). If you would like to contribute new tests or improve on the
current ones, let me know and I'll include them next time I run them.

All source code can be found at
http://hg.octarineparrot.com/dbench/file/tip.

Let me know if you have any ideas for how I can improve the benchmarks,
I currently plan to add compile times, size of the final executable and
memory usage (if anyone knows an easy way to get the memory usage of a
process in D, let me know :D).

The first run should not be included in the average.

Could you explain your reasoning for this? I can't see why it shouldn't be
included personally.

Suppose you have run the same program very recently before the test. Then
the executable will be in memory already, any other files it may want to
access are in memory too.

This makes execution much faster than if it were the first time ever this
program is run.

If things were deterministic, then you wouldn't run several times and
average the results, right?


Also, I think standard practice for benchmarks is not to average but to
take the minimum time.
To the extent that things are not deterministic, it is generally
because of factors outside your program's control -- a virtual memory
page fault kicking in, some other process stealing cycles, etc. Put
another way, there is no way for the measured run time of your program
to come out artificially too low, but there are lots of ways it could
come out too high. The reason you average measurements in other
scenarios is an expectation that the measurements form a normal
distribution around the true value. That is not the case for
measurements of computer program running times: measurements will
basically always be higher than the true intrinsic run time of your
program.

--bb


By minimum time, do you mean the fastest time or the slowest time?


Re: D compiler benchmarks

2009-03-08 Thread Robert Clipsham

Georg Wrede wrote:
Suppose you have run the same program very recently before the test. 
Then the executable will be in memory already, any other files it may 
want to access are in memory too.


This makes execution much faster than if it were the first time ever 
this program is run.


If things were deterministic, then you wouldn't run several times and 
average the results, right?


Ok, I will rerun the tests later today and disregard the first test. I 
may also take the minimum value rather than taking an average (thanks to 
Bill Baxter for this idea).


Re: D compiler benchmarks

2009-03-08 Thread Isaac Gouy
Robert Clipsham Wrote:

 Georg Wrede wrote:
  Suppose you have run the same program very recently before the test. 
  Then the executable will be in memory already, any other files it may 
  want to access are in memory too.
  
  This makes execution much faster than if it were the first time ever 
  this program is run.
  
  If things were deterministic, then you wouldn't run several times and 
  average the results, right?
 
 Ok, I will rerun the tests later today and disregard the first test. I 
 may also take the minimum value rather than taking an average (thanks to 
 Bill Baxter for this idea).


As you're re-inventing functionality that's in the benchmarks game measurement 
scripts, let me suggest that there are 2 phases involved:

1) record measurements
2) analyze measurements

As long as you keep the measurements in the order they were made and keep the 
measurements for each different configuration in their own file, you can 
decide to make different selections from those measurements at a later date.

You can throw away the first measurement or not, you can take the fastest or 
the median, you can ... without doing new measurements.

As you are only trying to measure a couple of language implementations, measure 
them across a dozen different input values rather than one or two - leaving the 
computer churning overnight will help keep your home warm :-)
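The record/analyze split Isaac describes might look like this (a hypothetical file layout sketched in Python, not the actual benchmarks-game scripts): phase 1 appends raw timings in order, one file per configuration; phase 2 re-reads them and applies whatever selection policy you like, without re-measuring.

```python
import os
import statistics
import tempfile

def record(path, timing):
    # Phase 1: append each raw measurement, preserving run order.
    with open(path, "a") as f:
        f.write(f"{timing}\n")

def analyze(path, drop_first=False, stat="min"):
    # Phase 2: selection policy applied after the fact.
    with open(path) as f:
        timings = [float(line) for line in f]
    if drop_first:          # e.g. discard the cold-cache first run
        timings = timings[1:]
    return min(timings) if stat == "min" else statistics.median(timings)

# One file per configuration, e.g. "dmd-nbody.times" (hypothetical name):
path = os.path.join(tempfile.mkdtemp(), "dmd-nbody.times")
for t in (2.1, 1.4, 1.5):
    record(path, t)
print(analyze(path, drop_first=True))                 # 1.4
print(analyze(path, drop_first=True, stat="median"))  # 1.45
```

The point is that "discard the first run" and "take the minimum" become analysis-time decisions, so changing your mind costs nothing.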


D compiler benchmarks

2009-03-07 Thread Robert Clipsham

Hi all,

I have set up some benchmarks for dmd, ldc and gdc at 
http://dbench.octarineparrot.com/.


There are currently only 6 tests, all from 
http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is 
not great enough to port the others to tango (I've chosen tango as ldc 
does not currently support phobos, so it makes sense to choose tango as 
all compilers support it). If you would like to contribute new tests or 
improve on the current ones, let me know and I'll include them next time 
I run them.


All source code can be found at 
http://hg.octarineparrot.com/dbench/file/tip.


Let me know if you have any ideas for how I can improve the benchmarks, 
I currently plan to add compile times, size of the final executable and 
memory usage (if anyone knows an easy way to get the memory usage of a 
process in D, let me know :D).


Re: D compiler benchmarks

2009-03-07 Thread The Anh Tran

1. Could you add some of the benchmarks that Alioth currently runs on the Q6600 Ubuntu machine?
2. A GNU C++ reference would be great.
I'm very eager to port C++ entries to D :)

Robert Clipsham wrote:

Hi all,

I have set up some benchmarks for dmd, ldc and gdc at 
http://dbench.octarineparrot.com/.


There are currently only 6 tests, all from 
http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is 
not great enough to port the others to tango (I've chosen tango as ldc 
does not currently support phobos, so it makes sense to choose tango as 
all compilers support it). If you would like to contribute new tests or 
improve on the current ones, let me know and I'll include them next time 
I run them.


All source code can be found at 
http://hg.octarineparrot.com/dbench/file/tip.


Let me know if you have any ideas for how I can improve the benchmarks, 
I currently plan to add compile times, size of the final executable and 
memory usage (if anyone knows an easy way to get the memory usage of a 
process in D, let me know :D).


Re: D compiler benchmarks

2009-03-07 Thread The Anh Tran

3. Do you use multithreading or a single thread?


Re: D compiler benchmarks

2009-03-07 Thread Daniel Keep


Robert Clipsham wrote:
 ...
 
 (if anyone knows an easy way to get the memory usage of a
 process in D, let me know :D).

There's a way to do it in Phobos:

 import gc = std.gc;
 import gcstats;

 void main()
 {
     GCStats gcst;
     gc.getStats(gcst); // fills gcst with the GC's heap statistics
 }


I went hunting through Tango, and it looks like it has the same method;
it just isn't exposed.  Probably because of this comment:

// NOTE: This routine is experimental.  The stats or function name may
//   change before it is made officially available.

Nonetheless, if you want it now, you could try adding this to your
code somewhere:

 struct GCStats
 {
 size_t poolsize;// total size of pool
 size_t usedsize;// bytes allocated
 size_t freeblocks;  // number of blocks marked FREE
 size_t freelistsize;// total of memory on free lists
 size_t pageblocks;  // number of blocks marked PAGE
 }

 extern(C) GCStats gc_stats();

You should probably whack this in a module so you can replace it easily
if and when it changes.


  -- Daniel


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

The Anh Tran wrote:

1. Could you add some of the benchmarks that Alioth currently runs on the Q6600 Ubuntu machine?
2. A GNU C++ reference would be great.
I'm very eager to port C++ entries to D :)


1. All the benchmarks currently up are just tango ports of tests from 
alioth, if that's what you mean?
2. I wasn't planning on adding C/C++ etc. benchmarks, as it would then 
just become a clone of the shootout. I don't mind adding a C/C++ 
reference for each test if there is enough demand, but I would rather 
avoid it and keep it purely to D benchmarks if possible.


Please feel free to port benchmarks to D/tango, I'll be more than happy 
to incorporate them into the suite (which is currently fairly minimal).


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

The Anh Tran wrote:

3. Do you use multithreading or a single thread?


I'm not sure what you mean here. All the current benchmarks are single 
threaded, as the multi-threaded ones use std.thread and my knowledge of 
phobos is not good enough to port them. If you mean the machine itself, 
it does support multithreading, so tests could benefit from that.


Re: D compiler benchmarks

2009-03-07 Thread Daniel Keep


Robert Clipsham wrote:
 I was thinking more of a way of getting the memory usage from run.d (the
 app I'm using for benchmarking; it's in the repository if you're
 interested). It's rather difficult to get the memory usage into run.d,
 and therefore straight into the stats page, if it's gathered from within
 the benchmark itself; doing so would also probably affect the benchmark
 by some amount, which I would ideally like to avoid.

Ah.

From the alioth FAQ:

 How did you measure memory use?

 By sampling GTop proc_mem for the program and its child processes
 every 0.2 seconds. Obviously those measurements are unlikely to be
 reliable for programs that run for less than 0.2 seconds.

Probably best to ensure this sampling thread is running on a different
hardware thread to the tested program...
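A rough Python sketch of that sampling approach, assuming Linux and reading VmRSS from /proc instead of GTop (and, unlike the alioth scripts, not following child processes):

```python
import os
import time

def peak_rss_kb(pid, interval=0.2, duration=2.0):
    """Poll /proc/<pid>/status every `interval` seconds and return the
    peak resident set size in kB. As with the alioth approach, this is
    unreliable for processes that live for less than one interval."""
    peak = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            with open(f"/proc/{pid}/status") as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        peak = max(peak, int(line.split()[1]))
                        break
        except FileNotFoundError:
            break  # the benchmarked process has exited
        time.sleep(interval)
    return peak

# Sample our own process for a short while:
print(peak_rss_kb(os.getpid(), interval=0.05, duration=0.2))
```

In a real harness this loop would run in a separate process (or pinned to a different core) so the sampling overhead doesn't perturb the benchmark being measured.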

  -- Daniel


Re: D compiler benchmarks

2009-03-07 Thread The Anh Tran

Could you provide the D compiler versions that you're using?
Downloadable compiler packages if you don't mind :D. I'm spoiled by EasyD.


Re: D compiler benchmarks

2009-03-07 Thread Daniel Keep

Incidentally, this might be of assistance:

http://shootout.alioth.debian.org/u32q/faq.php#measurementscripts

  -- Daniel


Re: D compiler benchmarks

2009-03-07 Thread The Anh Tran

Robert Clipsham wrote:

The Anh Tran wrote:

3. Do you use multithreading or a single thread?


I'm not sure what you mean here. All the current benchmarks are single 
threaded, as the multi-threaded ones use std.thread and my knowledge of 
phobos is not good enough to port them. If you mean the machine itself, 
it does support multithreading, so tests could benefit from that.


Sorry, my English is bad.
Is your benchmark split into single-threaded/multi-threaded categories like Alioth's:
http://shootout.alioth.debian.org/u64q/
http://shootout.alioth.debian.org/u64/

They use CPU affinity to emulate a single-core benchmark, but I think we can 
add a thread count to the command line instead, i.e.:
fankuch 10 4 // run fankuch bench with 4 threads, array size is 10
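That command-line convention could be parsed like this (a Python sketch of the hypothetical `fankuch N threads` interface suggested above; the actual benchmarks are D):

```python
def parse_bench_args(argv):
    """Parse `prog N [threads]`: N is the problem size; the thread
    count defaults to 1, i.e. the single-threaded category."""
    n = int(argv[1])
    threads = int(argv[2]) if len(argv) > 2 else 1
    return n, threads

# "fankuch 10 4" -> problem size 10, 4 worker threads
print(parse_bench_args(["fankuch", "10", "4"]))  # (10, 4)
print(parse_bench_args(["fankuch", "10"]))       # (10, 1)
```

Defaulting to one thread keeps existing single-threaded runs unchanged while letting the same binary populate a multi-threaded category.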


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

The Anh Tran wrote:

Could you provide the D compiler versions that you're using?
Downloadable compiler packages if you don't mind :D. I'm spoiled by EasyD.

All compiler versions are given on the page: gdc is a tango package, dmd 
is dmd 1.041 plus a tango package, and ldc is from hg with tango from svn.


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

The Anh Tran wrote:

Robert Clipsham wrote:

The Anh Tran wrote:

3. Do you use multithreading or a single thread?


I'm not sure what you mean here. All the current benchmarks are single 
threaded, as the multi-threaded ones use std.thread and my knowledge of 
phobos is not good enough to port them. If you mean the machine itself, 
it does support multithreading, so tests could benefit from that.


Sorry, my English is bad.
Is your benchmark split into single-threaded/multi-threaded categories like Alioth's:
http://shootout.alioth.debian.org/u64q/
http://shootout.alioth.debian.org/u64/

They use CPU affinity to emulate a single-core benchmark, but I think we can 
add a thread count to the command line instead, i.e.:

fankuch 10 4 // run fankuch bench with 4 threads, array size is 10


No, I don't do that; tests just run as they come.


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

Daniel Keep wrote:


From the alioth FAQ:


How did you measure memory use?

By sampling GTop proc_mem for the program and its child processes
every 0.2 seconds. Obviously those measurements are unlikely to be
reliable for programs that run for less than 0.2 seconds.


I had read this, but that's as far as I got with it!



Probably best to ensure this sampling thread is running on a different
hardware thread to the tested program...

  -- Daniel


It should be running in an entirely different process, but that depends 
on how tango.sys.Process deals with processes.


Re: D compiler benchmarks

2009-03-07 Thread Robert Clipsham

Daniel Keep wrote:

Incidentally, this might be of assistance:

http://shootout.alioth.debian.org/u32q/faq.php#measurementscripts

  -- Daniel


Thanks! I've actually already downloaded these but, being me, completely 
overlooked them. If I remember correctly they were Python scripts, and 
my current testing app is in D: 
http://hg.octarineparrot.com/dbench/file/tip/run.d