On Wednesday, 1 June 2016 at 14:32:38 UTC, Chris wrote:

For fun, I've written a search function that alternates between the beginning and the end of a string. I'm basically using Andrei's search algorithm for this. It is, of course only useful, if `needle` is expected to be either towards the beginning or the end of a string. If `needle` occurs multiple times, it matches where ever it encounters it first.

T[] findUnique(T)(T[] haystack, T[] needle)
{
        static size_t computeSkip(T[] needle)
        {
                size_t result = 1;
while (result < needle.length && needle[$ - 1 - result] != needle[$ - 1])
                {
                        ++result;
                }
                return result;
        }
        
        if (needle.length == 0) return haystack;
        immutable size_t len = haystack.length;
        immutable size_t lastIndex = needle.length - 1;
        auto last = needle[lastIndex];
        auto first = needle[0];
        size_t skip = 0;
for (size_t i = lastIndex, j = len-lastIndex-1; i < len; ++i, --j)
        {
                if (last != haystack[i])
                        goto back;
                for (size_t k = 0; ; ++k)
                {
                        if (k == lastIndex) return haystack[i-lastIndex..$];
                        if (haystack[i - lastIndex + k] != needle[k]) break;
                }
                back:
                if (i > j) break;
                if (first != haystack[j])
                        continue;
                for (size_t k = lastIndex; ; --k)
                {
                        if (k == 0) return haystack[j..$];
                        if (haystack[j + k] != needle[k]) break;
                }
                if (skip == 0) skip = computeSkip(needle);
                i += skip;
    j -= skip-1;
        }
        return haystack[$..$];
}

unittest
{
        string s1 = "Hello abc world!";
        assert (findUnique(s1, "abc") == "abc world!");
        assert (findUnique(s1, "def") == "");
        string s2 = "Ola, mundo!";
        assert (findUnique(s2, "do!") == "do!");
        string s3 = "abc";
        assert (findUnique(s3, "c") == "c");
        assert (findUnique(s3, "z") == "");
        string s4 = "aaabaaa";
        assert (findUnique(s4, "b") == "baaa");
}

I added it to the benchmarking suite (I had to turn off the error warnings, see description above). Naturally, it shines in the first test, as the needle is at the end of the text. The whole thing is to be taken with a grain of salt ;)

Search in Alice in Wonderland
       std: 535 ±8
    manual: 454 ±10
      qznc: 421 ±5
     Chris: 460 ±10
    Andrei: 643 ±19
   Andrei2: 314 ±6
   Andrei3: 352 ±5
    Biloop: 100 ±0
Search in random short strings
       std: 219 ±33
    manual: 194 ±47
      qznc: 123 ±11
     Chris: 219 ±69
    Andrei: 113 ±10
   Andrei2: 104 ±4
   Andrei3: 103 ±3
    Biloop: 195 ±46
Mismatch in random long strings
       std: 193 ±30
    manual: 249 ±103
      qznc: 156 ±30
     Chris: 260 ±108
    Andrei: 238 ±97
   Andrei2: 107 ±11
   Andrei3: 115 ±12
    Biloop: 141 ±44
Search random haystack with random needle
       std: 189 ±25
    manual: 193 ±57
      qznc: 152 ±26
     Chris: 203 ±59
    Andrei: 176 ±35
   Andrei2: 103 ±6
   Andrei3: 110 ±8
    Biloop: 141 ±28
 (avg slowdown vs fastest; absolute deviation)
CPU ID: GenuineIntel Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

Reply via email to