Some comments on what I plan to do after my two-week hibernation.

On Sun, 2011-13-02 at 20:27 +0300, Richard Hainsworth wrote:
> see http://shootout.alioth.debian.org/ for more info on the
> algorithms.
> 
> There are many factors that can be considered when benchmarking, 
> including IO and OS.
> 
> It seemed to me it would be a good idea to fix on some "elegant" form
> of 
> the perl6 implementation. By keeping the program the same, it will be 
> possible to track how developments implementations affect speed/memory
> size.

I am interested first in developing a generic framework around the work
already done for 'the benchmark game' (TBG*).  I will pretend that I am
starting from scratch and define a protocol for adding algorithms and
exchanging information.

I am convinced that everything below has already been done for TBG,
but some of it is obscure: the details are hidden in CVS.

I'd like to set things up so everything is fully automated.  Perl6
developers (and users :-) should be able to just run the benchmarks
in a "reasonable way" (one which halts :-) after installing the
latest rakudo release.

(A) Protocol to specify algorithm.

1. Define an algorithm and provide a reference for it.
2. Define standard inputs and implement algorithm in 2 languages.
3. Generate and verify outputs corresponding to inputs.
4. Make code, input and output available.

Details can be posted on github and descriptions on the perl6 wiki.
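
As a rough sketch of steps 2 and 3 (not a description of how TBG does
it): run two implementations on the same standard input and accept the
output as the reference only if both agree.  The function name and the
command lists are hypothetical, and Python is used purely for
illustration.

```python
import subprocess

def reference_output(cmd_a, cmd_b, input_bytes, timeout=60):
    """Run two implementations on the same input; return the agreed output.

    cmd_a and cmd_b are argument lists for the two implementations
    (hypothetical, e.g. one per language).  Raises ValueError if the
    outputs differ, so a mismatch never becomes a reference output.
    """
    outputs = []
    for cmd in (cmd_a, cmd_b):
        done = subprocess.run(cmd, input=input_bytes,
                              stdout=subprocess.PIPE,
                              timeout=timeout, check=True)
        outputs.append(done.stdout)
    if outputs[0] != outputs[1]:
        raise ValueError("implementations disagree on this input")
    return outputs[0]
```

The returned bytes are what step 4 would publish next to the code and
the input.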

(B) Benchmark protocol per language.

1. Define hardware, operating system, language version.
2. Execute for a reasonable subset of inputs (some may be
too slow or fast to be interesting).
3. Generate standard metrics (see alioth).

Summaries can be posted on the perl6 wiki.
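
A minimal sketch of protocol B, assuming a benchmark is just a command
that reads an input and writes to stdout.  The names below are
hypothetical.  Only wall-clock time is measured here; the other metrics
are left out, and the language version has to be supplied by the caller.

```python
import platform
import subprocess
import time

def describe_environment(language_version):
    """Step B1: record hardware, operating system and language version."""
    return {
        "machine": platform.machine(),
        "system": platform.system(),
        "release": platform.release(),
        "language_version": language_version,  # supplied by the caller
    }

def timed_run(cmd, input_bytes, timeout):
    """Steps B2/B3: run one benchmark, giving up after `timeout` seconds.

    Returns (elapsed_seconds, stdout), or None if the run did not halt
    in time, so inputs that are too slow can simply be skipped.
    """
    start = time.perf_counter()
    try:
        done = subprocess.run(cmd, input=input_bytes,
                              stdout=subprocess.PIPE,
                              timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        return None
    return time.perf_counter() - start, done.stdout
```

The timeout is what makes a run "reasonable" in the sense above: it
always halts.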

It should be possible to extend the standard metrics.  It should also be
possible to filter them in standard ways to make results clearer.

Given the above, I would just define a protocol to exchange results.
One need only specify md5 sums to verify/identify input/output.  Some
of the algorithms in TBG have input and output so large that they are
truncated in the results pages; in such cases publishing checksums of
the results (md5 is sufficient) will be useful.
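
For illustration, a result could be identified by the md5 sums of its
input and output files, along these lines.  The record layout and the
function names are assumptions, not an agreed format.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 of a file, read in chunks so large files are fine."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def result_record(input_path, output_path):
    """Identify one benchmark run by checksums instead of full contents."""
    return {
        "input_md5": md5_of_file(input_path),
        "output_md5": md5_of_file(output_path),
    }
```

Two parties can then compare records without ever exchanging the large
files themselves.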

I'm interested in automating B1 for other purposes.

[*] Personally, I have nothing against 'shootout' but it does no harm to
respect the wishes of the current maintainer of TBG on alioth.

-- 
--gh

