Re: Replacing tango.text.Ascii.isearch

rassoc via Digitalmars-d-learn Thu, 06 Oct 2022 14:39:08 -0700

On 10/5/22 23:50, torhu via Digitalmars-d-learn wrote:

> I did some basic testing, and regex was two orders of magnitude faster. So now
> I know, I guess.


And what kind of testing was that? Mind sharing? Because I did the following real quick
and wasn't able to measure a "two orders of magnitude" difference. Sure, the
regex version came out on top, but both were faster than the Ruby baseline I cooked up.

First, generate a word file with 100k entries of various lengths:

$> dmd -run words.d foobaz 100000
---
import std;

string randomWord(ulong n) {
    static chars = letters.array;
    return generate!(() => chars.choice).take(n).text;
}

void main(string[] args) {
    enforce(args.length == 3, "Usage: dmd -run words.d needle num");

    auto f = File("words.txt", "w");
    foreach (i; 0..args[2].to!ulong) {
        ulong n = uniform(0, 50), m = uniform(0, 50);
        if (i % 2 == 0)
            f.writeln(randomWord(n), args[1], randomWord(m));
        else
            f.writeln(randomWord(n + m));
    }
}
---

And then for the actual measuring:

$> for v in range regex; do dmd -O -version=${v} -of=search-${v} search.d; done
$> for v in range regex; do ldc -O -d-version=${v} -of=search-${v}-ldc search.d; done
$> for b in search-{range,regex}{,-ldc}; do time ./${b} foobaz; done
---
import std;

void main(string[] args) {
    enforce(args.length == 2, "Usage: search 'needle'");

    version (regex)
        auto rx = regex(args[1], "i");
    else version (range)
        auto needle = args[1].asLowerCase.text;
    else
        static assert(0, "use -version={regex,range}");

    ulong matches;
    foreach (line; File("words.txt").byLine) {
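        version (regex) {
            // case-insensitive match via the "i" flag on the compiled regex
            if (line.matchFirst(rx))
                matches++;
        } else version (range) {
            // lazily lowercase each line, then do a plain substring search
            if (line.asLowerCase.canFind(needle))
                matches++;
        }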
    }
    writeln(matches);
}
---
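
The `time` runs above include process startup and reading words.txt, which can blur the difference between the two approaches. A minimal sketch of an in-process comparison, assuming the same words.txt and using `StopWatch` from `std.datetime.stopwatch`; the `"foobaz"` fallback and the variable names are placeholders of mine, not part of the programs above:

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.conv : text;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.regex : matchFirst, regex;
import std.stdio : File, writefln;
import std.uni : asLowerCase;

void main(string[] args) {
    // Hypothetical default needle so the sketch also runs without arguments.
    auto needleArg = args.length > 1 ? args[1] : "foobaz";

    // Read all lines up front so file I/O is not part of either timing.
    auto lines = File("words.txt").byLineCopy.array;

    auto rx = regex(needleArg, "i");
    auto needle = needleArg.asLowerCase.text;

    // Time the regex search over the in-memory lines.
    auto sw = StopWatch(AutoStart.yes);
    ulong viaRegex;
    foreach (line; lines)
        if (line.matchFirst(rx))
            viaRegex++;
    auto regexTime = sw.peek;

    // Reset the (still running) stopwatch and time the range-based search.
    sw.reset();
    ulong viaRange;
    foreach (line; lines)
        if (line.asLowerCase.canFind(needle))
            viaRange++;
    auto rangeTime = sw.peek;

    writefln("regex: %s matches in %s", viaRegex, regexTime);
    writefln("range: %s matches in %s", viaRange, rangeTime);
}
```

Both loops should report the same match count; if they don't, the two searches aren't comparing the same thing. A single pass is still a rough measurement, so treat the numbers as a ballpark only.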
