Compile-time optimization

JS Tue, 23 Jul 2013 09:26:40 -0700

There seems to be a lot of room for improvement in how programming languages optimize at compile time; many opportunities are simply not taken into account. With CTFE I think such optimizations are more and more relevant in D.


I'll give a simple example:


Take a standard string joining function:

string join(string[] s)
{
    string ss;
    foreach(sss; s) ss ~= sss;
    return ss;
}

When this function is called at run time with literal arguments, it is needlessly inefficient:


join(["a", "b", "c"]);

because the result can obviously be computed at compile time by the compiler.

Using CTFE can solve this problem, but only when the result is needed as a compile-time constant.

But there should be no reason why join(["a", "b", "c"]) can't be optimized at compile time so that the function call is replaced with "abc".
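As a minimal sketch of the difference described above, the same call can be placed in a context that forces CTFE and in an ordinary run-time context. The names joinedAtCompileTime and joinedAtRunTime are made up for this illustration; whether the run-time call gets folded is up to the optimizer, not the language.

```d
// The join function from above.
string join(string[] s)
{
    string ss;
    foreach (sss; s) ss ~= sss;
    return ss;
}

// An enum initializer has to be a compile-time constant,
// so this call is evaluated through CTFE.
enum joinedAtCompileTime = join(["a", "b", "c"]);
static assert(joinedAtCompileTime == "abc");

// Hypothetical caller: the same call in a run-time context.
// Nothing in the language requires it to be folded to "abc".
string joinedAtRunTime()
{
    return join(["a", "b", "c"]);
}
```

Both forms give the same string; only the first is guaranteed to cost nothing at run time.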


Now, think of join as defined like

string join(T...)(T t);

and we call it like

join("a", s, "b", "c", ss);

where s is a string that is compile-time evaluable and ss is a run-time string array.


The optimal join code would be something like

r = "a"~s~"bc";
foreach(sss; ss) r ~= sss;

(and which would change depending on the argument types and whether they are compile-time evaluable)

To generate such "optimal" code (code that gets the compiler to do as much work as it can at compile time) is somewhat convoluted in D. We can use string mixins to solve the problem, but not easily, because we can't pass a function's variadic run-time arguments on to a template.


e.g.,

string join(T...)(T s)
{
    mixin(tJoinH!(s));
}

fails because s is not known at compile time... even if s is partially known.

It would be nice if we could pass the arguments to a template even if they are not completely known at compile time.


e.g., join("a", s),

tJoinH can receive "a" and s, but s's value, if not known at compile time, is undefined.

Hence, tJoinH can optimize the arguments by attempting to join the compile-time arguments and insert optimal code for the run-time arguments.

That is, if we call a function with an argument list, we should be able to pass that list to a template without any errors and be able to determine the value of each argument... if the argument is undefined, then that becomes its value.
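For comparison, one way to approximate this in current D is to split the argument list by hand: the compile-time pieces go in the template argument list and the run-time pieces in the ordinary one. This is only a sketch; foldCT and joinMixed are hypothetical names, and the caller has to do the split that the proposal wants the compiler to do.

```d
// Hypothetical helper: folds a compile-time list of strings into one constant.
template foldCT(Args...)
{
    static if (Args.length == 0)
        enum string foldCT = "";
    else
        enum string foldCT = Args[0] ~ foldCT!(Args[1 .. $]);
}

// Hypothetical: Prefix holds the compile-time strings,
// tail holds the run-time string array.
string joinMixed(Prefix...)(string[] tail)
{
    enum string head = foldCT!Prefix; // joined once, at compile time
    string r = head;
    foreach (s; tail) r ~= s;         // only the run-time part loops
    return r;
}

static assert(foldCT!("a", "b", "c") == "abc");
```

A call such as joinMixed!("a", "b")(ss) starts from the constant "ab" and only loops over ss, but it cannot interleave compile-time and run-time arguments the way join("a", s, "b", "c", ss) would.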


Another way to see it work is with sum:

sum(1,2,x);

int sum(T...)(T i)
{

    // return 1 + 2 + i[3];       // for the call above; this is optimal and our goal to generate
    // return mixin(tSumH!(i));   // hypothetically generates 1 + 2 + i[3] for the call above
    // return mixin(tSumH2!(i));  // generates i[1] + i[2] + i[3]; the currently possible way

    // naive approach (assumes all arguments are known only at run time):

    int y;
    foreach(ii; i) y += ii;
    return y;
}

I'm able to achieve this partially by passing T, rather than i, to the template, together with the variable name, i. This works but may not produce optimal results (it references the stack instead of the value of a literal... which prevents it from being further optimized).

e.g., from above, instead of return 1 + 2 + i[3]; I would generate return i[1] + i[2] + i[3]; Not optimal, because i[1] and i[2] are treated as run-time entities even though in the above case they are not.

If D had some model for creating functions that can optimize the way they deal with their compile-time arguments and run-time arguments, then it would be pretty easy to create such functions.
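A minimal sketch of that partial approach in current D follows. sumExpr is a hypothetical stand-in for tSumH2: it sees only the number of arguments and the parameter name, never the values, so literals still end up as indexed parameters. Note that D indexes the argument tuple from 0, so the generated expression for three arguments is i[0] + i[1] + i[2].

```d
import std.conv : to;

// Hypothetical stand-in for tSumH2: builds "i[0] + i[1] + ..." from the
// argument count and the parameter name alone.
string sumExpr(size_t n, string name)
{
    if (n == 0) return "0";
    string code;
    foreach (k; 0 .. n)
    {
        if (k) code ~= " + ";
        code ~= name ~ "[" ~ k.to!string ~ "]";
    }
    return code;
}

int sum(T...)(T i)
{
    // mixin needs a compile-time string, so sumExpr runs under CTFE here.
    return mixin(sumExpr(T.length, "i"));
}

static assert(sumExpr(3, "i") == "i[0] + i[1] + i[2]");
```

The generated expression never mentions 1 or 2 directly; any folding of the literals is left to the optimizer after inlining.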


e.g.,

string join(T...)(T t)    // Note join has a variadic parameter
{
    ctfe {

         // called when join is used in ctfe; t can be used to get the values
         // directly if they exist at compile time, else their value is undefined.

         // can call join here with run-time entities

    }

    // runtime version goes here
}

We might have something like this

string join(T...)(T t) // easy notation to constrain T on char, string, string[]

{
    ctfe
    {
        string code = "string s;";
        foreach(k, val; t)
        {
            if (isDefined!val)
                code ~= `s ~= "`~join(val)~`";`; // calls join as a ctfe on val.
            else
            {
                if (type == string[])
                    code ~= `foreach(sss; t[`~k~`]) s ~= sss;`;
                else
                    code ~= `s ~= t[`~k~`];`;
            }
        }
        code ~= "return s;";
        mixin code;
    }

    // concatenates all arguments of t as if they were run-time arguments.

    string s;
    foreach(k, ss; t)
        static if (is(T[k] == string[]))
            foreach(sss; ss)
                s ~= sss;
        else
            s ~= ss;
    return s;
}
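The run-time half of this function can already be written in current D. The sketch below uses the hypothetical name joinAll and an is() type test to pick the string[] case; it contains nothing of the proposed ctfe block.

```d
// Hypothetical name; accepts char, string and string[] arguments in any mix.
string joinAll(T...)(T t)
{
    string s;
    foreach (k, arg; t) // unrolled at compile time, one step per argument
    {
        static if (is(T[k] == string[]))
        {
            foreach (piece; arg) s ~= piece;
        }
        else
        {
            s ~= arg;
        }
    }
    return s;
}

// With only literal arguments the whole call can be forced through CTFE.
static assert(joinAll("a", ["b", "c"], 'd') == "abcd");
```

As soon as one argument is a run-time value, the static assert form is no longer possible and every argument, literal or not, is appended at run time.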

I'm not saying this works, but the idea is that the ctfe block is executed at compile time WHEN there are non-compile-time arguments (ok, maybe ctfe isn't the best keyword). The block then figures out how best to deal with the non-ct arguments.

In the above code, note how join is called in the ctfe block to handle values known at compile time. This is just standard CTFE execution. The real "magic" happens when it is dealing with non-ct arguments, in which case it builds up meta code which is used as a mixin (the special statement at the end of the ctfe block) that is inserted.


So for join("a", s); the code would execute something like

at compile time(in ctfe block)
    code = `string s;`;
    "a" is defined so
    code ~= `s ~= "`~join("a")~`";`;

join("a") is called; since "a" is known, the non-ctfe block is called, so we actually have

    code ~= `s ~= "a";`;
 next for s(which is undefined at compile time)
    code ~= `s ~= t[1];`;
 end
    code ~= "return s;";
so the string mixin code looks like:

    string s;
    s ~= "a";
    s ~= t[1];
    return s;

or simplified,

   return "a"~t[1];

which is optimal join code for the call join("a", s);

(it would make more sense to call a template in the ctfe block so we don't have to form code strings out of code strings)
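The code-building steps of this walkthrough can be imitated in current D if the "known at compile time" information is supplied by hand. In the sketch below, Arg, genJoin and joinWithA are all hypothetical names; the known flags are written out manually because today's D gives the function no way to ask whether an argument is a literal, and the literal values are not escaped.

```d
import std.conv : to;

// Hypothetical description of one argument: is it known, and if so, its value.
struct Arg
{
    bool known;
    string value;
}

// Builds the function body as a string, following the walkthrough's steps.
string genJoin(Arg[] args)
{
    string code = "string s;";
    foreach (k, a; args)
    {
        if (a.known)
            code ~= ` s ~= "` ~ a.value ~ `";`;        // literal baked into the code
        else
            code ~= ` s ~= t[` ~ k.to!string ~ `];`;   // run-time argument by index
    }
    return code ~ " return s;";
}

static assert(genJoin([Arg(true, "a"), Arg(false, null)]) ==
    `string s; s ~= "a"; s ~= t[1]; return s;`);

// Hypothetical use: the first argument is assumed to be the literal "a".
string joinWithA(T...)(T t)
{
    mixin(genJoin([Arg(true, "a"), Arg(false, null)]));
}
```

Here joinWithA("a", s) never reads t[0]; the literal comes from the generated code, which is the behaviour the proposed block would derive automatically.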

Note also that having such a ctfe block does not break existing code structures, and pre-existing functions can easily be upgraded for optimal behavior.

Having the compiler do as much optimization as possible should produce a speedup in many applications.

When a function is called with any known compile-time entities, it is possible for an optimization to take place. I think if D incorporates some mechanism in its language to do so, it will go a long way.

In fact, all this simply demonstrates that ctfe's are not as powerful as they can be. Calling a ctfe'able function with just one run-time variable will prevent it from being ctfe'ed and optimized completely... even though there is still some possibility for optimization.

Hopefully some will read this acutely and attempt to ponder its potential. I expect the immediate naysayers who can't see past their own noses... and those that get hung up on non-essential details (I said that the block doesn't have to be called ctfe... oh hell, you do realize we don't even need to use a block?).
