On 5/31/14, 7:10 AM, Russel Winder via Digitalmars-d wrote:
On Sat, 2014-05-31 at 07:02 -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
On 5/30/14, 11:36 PM, Russel Winder via Digitalmars-d wrote:
As well as the average (mean), you must provide standard deviation and
degrees of freedom so that a proper error analysis and t-tests are
feasible. Or put it another way: even if you quote a mean without knowing
how many in the sample and what the spread is you cannot judge the error
and so cannot make deductions or inferences.

No. Elapsed time in a benchmark does not follow a Student or Gaussian
distribution. Use the mode or (better) the minimum. -- Andrei

We almost certainly need to unpack that more. I agree that behind my
comment was an implicit assumption of a normal distribution of results.
This is an easy assumption to make even if it is wrong. So is it
provably wrong? What is the distribution? If we know that then there is
knowledge of the parameters which then allow for statistical inference
and deduction.

Well, there's quantization noise, which has a uniform distribution. All other sources of noise are additive (no noise can make code run faster). So I speculate that the pdf is a half-Gaussian mixed with a uniform distribution. Taking the mode (which is very close to the minimum in my measurements) would be the most accurate way to go. Taking the average would end up at some weird point on the half-Gaussian slope.
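
To make the speculation concrete, here is a rough simulation sketch in Python, not a measurement. It assumes each timing is a fixed true cost plus a uniform quantization error plus a half-Gaussian additive delay. The names simulate_timings and histogram_mode and all parameter values (true_time, tick, noise_sigma) are made up for illustration.

```python
import random
import statistics
from collections import Counter


def simulate_timings(true_time, n, tick=1.0, noise_sigma=5.0, seed=0):
    """Draw n synthetic timings under the assumed noise model.

    Assumption: timing = true_time + uniform quantization error in [0, tick)
    + a non-negative (half-Gaussian) delay. Noise never speeds the code up.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        delay = abs(rng.gauss(0.0, noise_sigma))  # half-Gaussian, always >= 0
        quantization = rng.uniform(0.0, tick)     # uniform timer error
        samples.append(true_time + delay + quantization)
    return samples


def histogram_mode(samples, bin_width):
    """Estimate the mode as the centre of the most populated histogram bin."""
    bins = Counter(int(s // bin_width) for s in samples)
    fullest_bin, _count = bins.most_common(1)[0]
    return (fullest_bin + 0.5) * bin_width


samples = simulate_timings(true_time=100.0, n=10_000)
summary = {
    "min": min(samples),
    "mode": histogram_mode(samples, bin_width=1.0),
    "mean": statistics.mean(samples),
}
print(summary)
```

Under these assumptions the minimum and the mode stay near the true cost, while the mean is pulled up the slope by the one-sided noise. The exact numbers depend entirely on the assumed parameters.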

Andrei
