On Sunday, 29 May 2016 at 12:22:23 UTC, qznc wrote:
I played around with the benchmark. Some more numbers:
$ make ldc
ldmd2 -O -release -inline -noboundscheck *.d -ofbenchmark.ldc
./benchmark.ldc
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
std find: 153 ±25 +66 (1934) -15 (7860)
manual find: 122 ±28 +80 (1812) -17 (8134)
qznc find: 125 ±16 +18 (4644) -15 (5126)
Chris find: 148 ±29 +75 (1976) -18 (7915)
Andrei find: 114 ±23 +100 (1191) -13 (8770)
(avg slowdown vs fastest; absolute deviation)
$ make dmd
dmd -O -release -inline -noboundscheck *.d -ofbenchmark.dmd
./benchmark.dmd
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
std find: 160 ±27 +44 (3162) -20 (6709)
manual find: 148 ±28 +54 (2701) -19 (7178)
qznc find: 102 ±3 +27 ( 766) -1 (9136)
Chris find: 175 ±30 +55 (2796) -21 (7106)
Andrei find: 122 ±22 +46 (2554) -14 (7351)
(avg slowdown vs fastest; absolute deviation)
The additional numbers on the right split the ±MAD into the deviations above and below the mean. For example, Andrei find with ldc:
Andrei find: 114 ±23 +100 (1191) -13 (8770)
The mean slowdown is 114, which means 14% slower than the
fastest one. The mean absolute deviation (MAD) is 23. More
precisely, the runs above the mean slowdown of 114 deviate by
100 on average, and the runs below it by -13. 1191 of the
10000 runs were above the mean slowdown and 8770 below; the
39 missing runs are exactly equal to it.
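For reference, here is a minimal D sketch that recomputes these statistics from raw per-run slowdowns; the function name and the input scaling (e.g. 114 for 14% slower than the fastest) are my assumptions, not the benchmark's actual code:

import std.algorithm : filter, map, sum;
import std.array : array;
import std.stdio : writefln;

void reportStats(const double[] slowdowns)
{
    immutable mean = slowdowns.sum / slowdowns.length;
    // Runs strictly above/below the mean; runs equal to it count for neither.
    auto above = slowdowns.filter!(x => x > mean).array;
    auto below = slowdowns.filter!(x => x < mean).array;
    immutable madAbove = above.length ? above.map!(x => x - mean).sum / above.length : 0.0;
    immutable madBelow = below.length ? below.map!(x => mean - x).sum / below.length : 0.0;
    // The overall MAD is the count-weighted combination of both sides.
    immutable mad = slowdowns.map!(x => x > mean ? x - mean : mean - x).sum
                  / slowdowns.length;
    writefln("%.0f ±%.0f +%.0f (%s) -%.0f (%s)",
             mean, mad, madAbove, above.length, madBelow, below.length);
}

As a sanity check against the example line: (1191*100 + 8770*13)/10000 ≈ 23, which matches the reported ±23.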
What bothers me is how strongly the choice of alphabet changes
the numbers: restricting the alphabets for haystack and needle
shifts the results noticeably. The benchmark already picks a
random subset of the alphabet on each run, but there is
definitely a bias.
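To make concrete what "random subset on each run" means, here is a hypothetical generator sketch in D; the function names and the 26-letter pool are my assumptions, not the benchmark's actual code:

import std.random : Random, uniform, unpredictableSeed;

// Pick k distinct letters as this run's alphabet (partial Fisher-Yates).
// Usage: auto rng = Random(unpredictableSeed);
//        auto alpha = randomAlphabet(rng, 8);
//        auto haystack = randomText(rng, alpha, 1000);
char[] randomAlphabet(ref Random rng, size_t k)
{
    char[] pool = "abcdefghijklmnopqrstuvwxyz".dup;
    foreach (i; 0 .. k)
    {
        immutable j = uniform(i, pool.length, rng);
        immutable tmp = pool[i];
        pool[i] = pool[j];
        pool[j] = tmp;
    }
    return pool[0 .. k];
}

// Draw a random text of the given length over that alphabet.
char[] randomText(ref Random rng, const char[] alphabet, size_t len)
{
    auto text = new char[len];
    foreach (ref c; text)
        c = alphabet[uniform(0, alphabet.length, rng)];
    return text;
}

Comparing runs with, say, randomAlphabet(rng, 4) against the full 26 letters would show how much the alphabet size alone moves the timings.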
You can avoid "E: wrong result with Chris find" by using a labeled outer loop; note that the loop bound must also include the final start position, or a needle at the very end of the haystack is missed:
outer:
for (size_t i = 0; i + needle.length <= haystack.length; i++) // <= includes the last start; i + len avoids size_t underflow
{
    if (haystack[i] != needle[0])
        continue;
    for (size_t j = i + 1, k = 1; k < needle.length; ++j, ++k)
        if (haystack[j] != needle[k])
            continue outer; // partial match failed: resume the outer scan
    return haystack[i .. $];
}
It's a tad faster.
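For testing, the snippet wraps up into a self-contained function like this (the function name and the unittest are mine, not from the benchmark):

// Complete version of the fix above; returns the tail of the haystack
// starting at the match, or an empty slice if the needle does not occur.
string chrisFind(string haystack, string needle)
{
    if (needle.length == 0)
        return haystack; // std.algorithm.find convention: empty needle matches at 0
    outer:
    for (size_t i = 0; i + needle.length <= haystack.length; i++)
    {
        if (haystack[i] != needle[0])
            continue;
        for (size_t j = i + 1, k = 1; k < needle.length; ++j, ++k)
            if (haystack[j] != needle[k])
                continue outer; // partial match failed: resume outer scan
        return haystack[i .. $];
    }
    return haystack[$ .. $]; // not found: empty slice
}

unittest
{
    assert(chrisFind("hello world", "world") == "world"); // needle at the very end
    assert(chrisFind("aab", "ab") == "ab");               // restart after partial match
    assert(chrisFind("abc", "abcd").length == 0);         // needle longer than haystack
}

The first assertion is exactly the case the original i < haystack.length-needle.length bound would have missed.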
I'm planning to test on more varied data and see where a bias may
occur.