Re: faster splitter

Chris via Digitalmars-d Sun, 29 May 2016 05:31:42 -0700

On Sunday, 29 May 2016 at 12:22:23 UTC, qznc wrote:

I played around with the benchmark. Some more numbers:
$ make ldc
ldmd2 -O -release -inline -noboundscheck *.d -ofbenchmark.ldc
./benchmark.ldc
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
    std find: 153 ±25    +66 (1934)  -15 (7860)
 manual find: 122 ±28    +80 (1812)  -17 (8134)
   qznc find: 125 ±16    +18 (4644)  -15 (5126)
  Chris find: 148 ±29    +75 (1976)  -18 (7915)
 Andrei find: 114 ±23   +100 (1191)  -13 (8770)
 (avg slowdown vs fastest; absolute deviation)
$ make dmd
dmd -O -release -inline -noboundscheck *.d -ofbenchmark.dmd
./benchmark.dmd
E: wrong result with Chris find
E: wrong result with Chris find
E: wrong result with Chris find
    std find: 160 ±27    +44 (3162)  -20 (6709)
 manual find: 148 ±28    +54 (2701)  -19 (7178)
   qznc find: 102 ±3     +27 ( 766)   -1 (9136)
  Chris find: 175 ±30    +55 (2796)  -21 (7106)
 Andrei find: 122 ±22    +46 (2554)  -14 (7351)
 (avg slowdown vs fastest; absolute deviation)
The additional numbers on the right are the ±MAD split into the deviation above and below the mean. For example Andrei find with ldc:
  Andrei find: 114 ±23   +100 (1191)  -13 (8770)
The mean slowdown is 114, which means 14% slower than the fastest one. The mean absolute deviation (MAD) is 23. More precisely, the runs above the mean slowdown of 114 deviate from it by +100 on average, and the runs below it by -13. 1191 of the 10000 runs were above the mean slowdown and 8770 below. The 39 missing runs are equal to the mean slowdown.
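To make the columns concrete, here is a minimal sketch of how such a row could be computed from a list of per-run slowdowns. It is not taken from the benchmark source; the names `Spread` and `summarize` are made up for illustration, and it assumes the deviations are plain arithmetic means around the mean slowdown.

```d
/// One row of the table (hypothetical type, not from the benchmark).
struct Spread
{
    double mean = 0;      // mean slowdown (100 = as fast as the fastest)
    double mad = 0;       // mean absolute deviation from the mean
    double aboveDev = 0;  // mean deviation of the runs above the mean
    size_t aboveCnt = 0;  // number of runs above the mean
    double belowDev = 0;  // mean deviation of the runs below the mean (negative)
    size_t belowCnt = 0;  // number of runs below the mean
}

/// Assumed definition of the statistics; the real benchmark may differ.
Spread summarize(const double[] runs)
{
    Spread s;
    if (runs.length == 0)
        return s;

    double total = 0;
    foreach (r; runs)
        total += r;
    s.mean = total / runs.length;

    double absSum = 0, aboveSum = 0, belowSum = 0;
    foreach (r; runs)
    {
        immutable d = r - s.mean;
        absSum += d < 0 ? -d : d;
        if (d > 0) { aboveSum += d; ++s.aboveCnt; }
        else if (d < 0) { belowSum += d; ++s.belowCnt; }
        // d == 0: the run equals the mean and is counted on neither side
    }
    s.mad = absSum / runs.length;
    if (s.aboveCnt) s.aboveDev = aboveSum / s.aboveCnt;
    if (s.belowCnt) s.belowDev = belowSum / s.belowCnt;
    return s;
}

unittest
{
    // Four fast runs and one slow outlier: mean 120, MAD 24.
    auto s = summarize([100.0, 100.0, 110.0, 110.0, 180.0]);
    assert(s.mean == 120 && s.mad == 24);
    assert(s.aboveDev == 60 && s.aboveCnt == 1);
    assert(s.belowDev == -15 && s.belowCnt == 4);
}
```

Under this reading the Andrei row is self-consistent: (1191 * 100 + 8770 * 13) / 10000 is about 23.3, which matches the reported ±23.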
What bothers me is that changing the alphabet changes the numbers so much. Currently, if you restrict the alphabets for haystack and needle, the numbers change. The benchmark already does a random subset on each run, but there is definitely a bias.


You can avoid "E: wrong result with Chris find" by using

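outer:
    // size_t index; this bound also tests the last possible position and
    // cannot underflow when the needle is longer than the haystack
    for (size_t i = 0; i + needle.length <= haystack.length; i++)
    {
        if (haystack[i] != needle[0])
            continue;
        for (size_t j = i+1, k = 1; k < needle.length; ++j, ++k)
            if (haystack[j] != needle[k])
                continue outer;
        return haystack[i..$];
    }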

It's a tad faster.
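For anyone who wants to try the loop in isolation, here is a self-contained sketch that wraps it in a function. The name `simpleFind`, the `char` signature and the handling of the empty needle and the no-match case are my assumptions, not what the benchmark uses. The loop bound is written as `i + needle.length <= haystack.length` so that a match at the very end of the haystack is still tested and a needle longer than the haystack cannot underflow the unsigned subtraction.

```d
/// Hypothetical wrapper around the loop; not the benchmark's actual function.
inout(char)[] simpleFind(inout(char)[] haystack, const(char)[] needle)
{
    // Assumption: an empty needle is treated as matching at the start.
    if (needle.length == 0)
        return haystack;

    outer:
    for (size_t i = 0; i + needle.length <= haystack.length; i++)
    {
        // Cheap first-character test before comparing the rest.
        if (haystack[i] != needle[0])
            continue;
        for (size_t j = i + 1, k = 1; k < needle.length; ++j, ++k)
            if (haystack[j] != needle[k])
                continue outer;
        return haystack[i .. $];
    }
    // Assumption: no match is reported as an empty slice at the end.
    return haystack[$ .. $];
}

unittest
{
    assert(simpleFind("hello world", "world") == "world"); // match at the very end
    assert(simpleFind("hello world", "lo w") == "lo world");
    assert(simpleFind("aab", "ab") == "ab");               // restart after partial match
    assert(simpleFind("abc", "abcd").length == 0);         // needle longer than haystack
}
```

Compile with `dmd -unittest -main` to run the checks.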

I'm planning to test on more varied data and see where a bias may occur.
