On Sunday, 29 May 2016 at 12:22:23 UTC, qznc wrote:
I played around with the benchmark. Some more numbers:

$ make ldc
ldmd2 -O -release -inline -noboundscheck *.d -ofbenchmark.ldc
./benchmark.ldc
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
    std find: 153 ±25    +66 (1934)  -15 (7860)
 manual find: 122 ±28    +80 (1812)  -17 (8134)
   qznc find: 125 ±16    +18 (4644)  -15 (5126)
  Chris find: 148 ±29    +75 (1976)  -18 (7915)
 Andrei find: 114 ±23   +100 (1191)  -13 (8770)
 (avg slowdown vs fastest; absolute deviation)
$ make dmd
dmd -O -release -inline -noboundscheck *.d -ofbenchmark.dmd
./benchmark.dmd
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
    std find: 160 ±27    +44 (3162)  -20 (6709)
 manual find: 148 ±28    +54 (2701)  -19 (7178)
   qznc find: 102 ±3     +27 ( 766)   -1 (9136)
  Chris find: 175 ±30    +55 (2796)  -21 (7106)
 Andrei find: 122 ±22    +46 (2554)  -14 (7351)
 (avg slowdown vs fastest; absolute deviation)

The additional numbers on the right are the ±MAD separated by above or below the mean. For example Andrei find with ldc:

  Andrei find: 114 ±23   +100 (1191)  -13 (8770)

The mean slowdown is 114, which means 14% slower than the fastest one. The mean absolute deviation (MAD) is 23. More precisely, the mean deviation above the mean slowdown of 103 is 100 and -13 below the mean slowdown. 1191 of the 10000 runs were above the mean slowdown and 8770 below. The 39 missing runs are equal to the mean slowdown.

What bothers me is that changing the alphabet changes the numbers so much. Currently, if you restrict the alphabets for haystack and needle, the numbers change. The benchmark already does a random subset on each run, but there is definitely a bias.

You can avoid "E: wrong result with Chris find" by using

outer:
    for (auto i = 0; i < haystack.length-needle.length; i++)
    {
        if (haystack[i] != needle[0])
            continue;
        for (size_t j = i+1, k = 1; k < needle.length; ++j, ++k)
            if (haystack[j] != needle[k])
                continue outer;
        return haystack[i..$];
    }

It's a tad faster.

I'm planning to test on more varied data and see where a bias may occur.

Reply via email to