It says Criterium ran 60 samples of 4 calls each (it tries to make
the JVM do garbage collection etc. between samples), i.e. 240 timed
evaluations in total.
The statistics are better looked up on Wikipedia, but the lower quantile here
means the value below which the fastest 2.5% of the samples fall
(0.025 * 60 = 1.5, so roughly the one or two fastest samples).
Thanks a ton for your replies, Andy and Thomas.
I used Criterium and got results like the ones below:
Evaluation count : 240 in 60 samples of 4 calls.
Execution time mean : 265.359848 ms
Execution time std-deviation : 25.544031 ms
Execution time lower quantile : 229.851248 ms ( 2.5%)
I have set up a single-node Hadoop cluster and am running my Cascalog
queries on it.
It works and I get results. Now I am using clojure.core/time to measure
how much time the Cascalog queries take to execute.
The strange thing is: each time I run the Cascalog query, I get a
different elapsed time.
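For illustration, here is a minimal sketch of that kind of measurement,
assuming cascalog.api is on the classpath; the `people` source and the
query itself are made up for the example:

(use 'cascalog.api)

;; Hypothetical in-memory source and query, just to show where
;; clojure.core/time wraps the query execution.
(def people [["alice" 28] ["bob" 33] ["carol" 41]])

(time
 (?- (stdout)
     (<- [?name ?age]
         (people ?name ?age)
         (> ?age 30))))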
I have not used Cascalog, so I do not know how much variation from one run
to the next is completely normal, but there are many factors that can cause
variations in run time between runs in most computations. For example:
+ the state of the L1, L2, etc. caches in the CPU's memory system
+ if files are read, whether their contents are already in the operating
system's file cache
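One common way to reduce the JVM-level effects is to warm the code up
before timing it; a minimal sketch, where `run-query` is a hypothetical
stand-in for whatever expression you are measuring:

;; Run the workload a few times first so JIT compilation and cache
;; effects settle before the measured run.
(defn run-query [] (reduce + (range 1000000)))

(dotimes [_ 20] (run-query))  ; warm-up runs, results discarded
(time (run-query))            ; the run actually measured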
You're right, thanks for this info. If that is the case, how can one run
performance tests? I really have to make some performance comparisons
between single-node and multi-node Hadoop. Are there any other workarounds?
I want results that are at least somewhat close to accurate. Or can you
suggest any other approach?
There are some simple things, like: try to ensure that no one else is using
the systems being measured besides you, and that you yourself are doing
nothing with those systems other than the runs you are trying to measure.
Measure what the load on the machines is before you start your runs.
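For example, a quick way to check the load from a Clojure REPL before a
run (a sketch, assuming a Unix system with `uptime` available):

(require '[clojure.java.shell :as sh])

;; Print the machine's load averages before starting a measurement run
;; (shells out to the standard `uptime` command).
(println (:out (sh/sh "uptime")))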
https://github.com/hugoduncan/criterium does most of what you'd need for
such benchmarks.
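A minimal sketch of using it, with a placeholder workload standing in for
the real query:

(require '[criterium.core :refer [quick-bench bench]])

;; quick-bench gives a fast estimate; bench runs longer, forcing GC
;; between batches, and prints mean, std-deviation and quantiles in the
;; format shown earlier in the thread.
(quick-bench (reduce + (range 1000000)))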
It should be noted that neither Hadoop nor Cascalog was built for jobs
that finish in milliseconds. You are most likely just measuring the
setup/teardown overhead; once you push some real data through the system,
that overhead should become a much smaller share of the total run time.