Reducing the cost of autodecoding

Andrei Alexandrescu via Digitalmars-d Wed, 12 Oct 2016 07:20:30 -0700

So we've had a good run with making popFront smaller. In ASCII microbenchmarks with ldc, the speed is indistinguishable from s = s[1 .. $]. Smaller functions make sure that the impact on instruction cache in larger applications is not high.

Now it's time to look at the end-to-end cost of autodecoding. I wrote this simple microbenchmark:


=====
import std.range;

alias myPopFront = std.range.popFront;
alias myFront = std.range.front;

void main(string[] args) {
    import std.algorithm, std.array, std.stdio;
    char[] line = "0123456789".dup.repeat(50_000_000).join;
    ulong checksum;
    if (args.length == 1)
    {
        while (line.length) {
            version(autodecode)
            {
                checksum += line.myFront;
                line.myPopFront;
            }
            else
            {
                checksum += line[0];
                line = line[1 .. $];
            }
        }
        version(autodecode)
            writeln("autodecode ", checksum);
        else
            writeln("bytes ", checksum);
    }
    else
        writeln("overhead");
}
=====

On my machine, with "ldc2 -release -O3 -enable-inlining" I get something like 0.54s overhead, 0.81s with no autodecoding, and 1.12s with autodecoding.

Your mission, should you choose to accept it, is to define a combination front/popFront that reduces the gap.
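
To make the shape of such a combination concrete, here is a minimal sketch and not a measured solution. The name popFrontDecoded is hypothetical and not part of Phobos. It assumes that fusing front and popFront into one call with an ASCII fast path is a reasonable starting point, and it falls back to std.utf.decode for multi-byte sequences. No timing is claimed for it.

```d
import std.utf : decode;

// Hypothetical helper (not in Phobos): returns the first code point of s
// and advances s past it, so the caller makes one call instead of
// front followed by popFront.
dchar popFrontDecoded(ref char[] s)
{
    assert(s.length, "popFrontDecoded called on an empty array");
    immutable char c = s[0];
    if (c < 0x80)
    {
        // ASCII fast path: a single code unit, nothing to decode.
        s = s[1 .. $];
        return c;
    }
    // Non-ASCII: let std.utf.decode validate and decode the sequence;
    // it advances i past the code units it consumed.
    size_t i = 0;
    immutable dchar result = decode(s, i);
    s = s[i .. $];
    return result;
}

void main()
{
    import std.stdio : writeln;

    // 'a' (97), U+00E9 (233, two UTF-8 code units), 'z' (122)
    char[] line = "a\u00e9z".dup;
    ulong checksum;
    while (line.length)
        checksum += popFrontDecoded(line);
    assert(checksum == 452);
    writeln("combined ", checksum);
}
```

In the benchmark above, the body of the autodecode branch would become a single line, checksum += popFrontDecoded(line), which is the form to time against the other two variants.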



Andrei
