Dear list,

during the past few days I have spent a lot of time trying to figure out how to write Criterion benchmarks so that the results don't get skewed by lazy evaluation. I want to benchmark different versions of an algorithm doing numerical computations on a vector. For that I need to create an input vector containing a few thousand elements. I decided to create random data, but that really doesn't matter - I could just as well have used infinite lists instead of random ones.

My problem is that I am not certain whether I am creating my benchmark correctly. I wrote a function that creates the data like this:

dataBuild :: RandomGen g => g -> ([Double], [Double])
dataBuild gen = (take 6 $ randoms g1, take 2048 $ randoms g2)
  where (g1, g2) = split gen  -- split the generator, otherwise both lists share a prefix

And I create the benchmark like this:

bench "Lists" $ nf L.benchThisFunction (L.dataBuild gen)

The question is how to generate the data so that its evaluation won't be included in the benchmark. I already asked this question on StackOverflow ( http://stackoverflow.com/questions/12896235/how-to-create-data-for-criterion-benchmarks#comment17466915_12896235 ) and got an answer suggesting evaluate + force. After spending a day testing this approach I came to the conclusion that it does not seem to influence the results of the benchmark in any way (I did stuff like unsafePerformIO + threadDelay to check).
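For concreteness, the evaluate + force attempt looked roughly like this (again with newStdGen standing in for however gen is actually created):

import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Criterion.Main
import System.Random (newStdGen)

import qualified L

main :: IO ()
main = do
  gen <- newStdGen
  -- evaluate (force ...) should bring both input lists to normal form
  -- before criterion starts timing
  input <- evaluate (force (L.dataBuild gen))
  defaultMain [ bench "Lists" $ nf L.benchThisFunction input ]

Yet the reported times were essentially the same as without the forcing.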
On the other hand, I looked into the sources of criterion and saw that the benchmark code is run like this: evaluate (rnf (f x)). I am a Haskell newbie and perhaps don't interpret this correctly, but to me it looks as though criterion does not evaluate the possibly unevaluated parameter x before running the benchmark, but instead evaluates only the final result. Can someone explain how exactly this works, and how I should write my benchmarks so that the results are correct?

Janek
