There seems to be a lot of improvement for programming languages to optimize compile time aspects that are not taken into account. With ctfe I think such optimizations are more and more relevant in D.

I'll give a simple example:

Take a standard string joining function:

string join(string[] s)
{
    string ss;
    foreach(sss; s) ss ~= sss;
    return ss
}

when this string function is called at run-time with literal arguments it is extremely inefficient:

join(["a", "b", "c"]);

because the result can obviously be computed at compile-time by the compiler.

using ctfe can solve this problem but only when used in const results.

But there should be no reason why join(["a", "b", "c"]) can't be optimized at compile time so that the function call is replaced with "abc".

Now, think of join as defined like

string join(T...)(T t);

and we call it like

join("a", s, "b", "c", ss);

where s is string that is compile time evaluable and ss is a run-time string array.

the optimal join code would be something like

r = "a"~s~"bc"
foreach(sss; ss) r ~= sss;

(and which would change depending on the argument types and if they are compile time evaluable)


To generate such "optimal" code(one that gets the compiler to do as much work as it can at compile time) is somewhat convoluted in D. We can use string mixins to solve the problem but not easily because we can't pass variadic arguments to templates.

e.g.,

join(T...)(T s)
{
    mixin(tJoinH!(s))
}

fails because s is not known at compile time... even if s is partially known.

It would be nice if we could pass the arguments to a template even if they are not completely known at compile time.

e.g., join("a", s),

tJoinH can receive "a" and s but s's value, if not compile-time known, is undefined.

Hence, tJoinH can optimize the arguments by attempting to join the compile-time arguments and insert optimal code for the run-time arguments.

That is, if we call a function with an argument list, we should be able to pass that to a template without any errors and be able to determine the value of each argument... if the argument is undefined, then that becomes it's value.

Another way to see it work is with sum:

sum(1,2,x);

int sum(T...)(T i)
{
// return 1 + 2 + i[3]; for the call above, it is optimal and our goal to generate // return mixin(tSumH!(i)); // hypothetically generates 1 + 2 + i[3] for the call above // return mixin(tSumH2!(i)); // generates i[1] + i[2] + i[3]. Current possible way

// naive approach(assumes all arguments are known only at run-time):
    int y;
    foreach(ii; i) y += ii;
    return y;
}


I'm able to achieve this partially by passing T, rather than i, to the template function and the variable name, i. This works but may not produce optimal results(referencing the stack instead of the value of a literal... which prevents it from being further optimized).

e.g., from above, instead of return 1 + 2 + i[3]; I would generate return i[1] + i[2] + i[3]; Not optimal because i[1] and i[2] are treated as run-time entities even though in the above case they are not.


If D had some model to create functions that can optimize the way they deal with their compile-time arguments and run-time arguments then it would be pretty easy to create such functions.

e.g.,

string join(T...)(T t)    // Note join has a variadic parameter
{
    ctfe {
// called when join is used in ctfe, t can be used to get the values directly if they exist at compile time, else their value is undefined.
         // can call join here with run-time entities

    }

    // runtime version goes here
}

we might have something like this

string join(T...)(T t) // easy notation to constrain T on char, string, string[]
{
    ctfe
    {
        string code = "string s;";
        foreach(k, val; t)
        {
            if (isDefined!val)
code ~= `s ~= "`~join(val)~`";`; // calls join as a ctfe on val.
            else
            {
                if (type == string[])
                    code ~= `foreach(ss; t[`~k~`]) s~= sss;`;
                } else code ~= `s ~= t[`~k~`];`;
            }
        }
        code ~= "return s;";
        mixin code;
    }

// concatenates all arguments of t as if they were run-time arguments.
    string s;
    foreach(k, ss; t)
        static if (T[k].type == string[])
            foreach(sss; ss)
                s ~= sss;
        else
            s ~= ss;
    return s;
}


I'm not saying this works but the idea is that the ctfe block is executed at compile time WHEN there are non-compile time arguments(ok, maybe ctfe isn't the best keyword). The block then figures out how to best deal with the non-ct arguments.

In the above code note how join is called in the ctfe block to handle known compile time types. This is just standard ctfe execution. The real "magic" happens when it is dealing with non-ct arguments, in which case it builds up meta code which is used as a mixin(special statement at end of ctfe block) that is inserted.

So for join("a", s); the code would execute something like

at compile time(in ctfe block)
    code = `string s;`;
    "a" is defined so
    code ~= `s ~= "~join("a")~`";`;
join("a") is called, since "a" is known, the non-ctfe block is called, so we actually have
    code ~= `s ~= "a";`;
 next for s(which is undefined at compile time)
    code ~= `s ~= t[1];`;
 end
    code ~= "return s;";
so the string mixin code looks like:

    string s;
    s ~= "a";
    s ~= t[1];
    return s;

or simplified,

   return "a"~t[1];

which is an optimal join code for the join call join("a", s);

(it would make more sense to call a template in the ctfe block so we don't have form code strings of code strings)


Note also that having such a ctfe block does not break existing code structures and pre existing functions can easily be upgraded for optimal behavior.


Having the compiler do as much optimization as possible should produce a significant speedup in most applications.

When a function is called with any known compile time entities, it is possible for an optimization to take place. I think if D incorporates some mechanism in it's language to do so it will go a long.

In fact, all this simply demonstrates that ctfe's are not as powerful as they can be. Calling a ctfe'able function with just one run-time variable will prevent it from being ctfe'ed and optimized completely... even though there is still some possibility for optimization.


Hopefully some will read this acutely and attempt to ponder its potential. I expect the immediate naysayers who can't see past their own noses... and those that get hung up on non-essential details(I said that the block doesn't have to be called ctfe... oh hell, you do realize we don't even need to use a block?).




Reply via email to