On Friday, 4 August 2017 at 16:57:37 UTC, bitwise wrote:
I'm confused about how D's lambda capture actually works, and can't find any clear specification on the issue. I've read the comments on the bug about what's described below, but I'm still confused. The conversation there dropped off in 2016, and the issue hasn't been fixed, despite high bug priority and plenty of votes.

How it works is described in [1] (the GC involvement is also listed in [2]). The key sentences are

Delegates to non-static nested functions contain two pieces of data: the pointer to the stack frame of the lexically enclosing function (called the frame pointer) and the address of the function.

i.e. delegates point to the enclosing function's *stack frame* and access its variables through that single pointer.

and

The stack variables referenced by a nested function are still valid even after the function exits (this is different from D 1.0). This is called a closure.

i.e. when a delegate can outlive the enclosing function's stack frame (for example, because you return it), D creates a closure: the frame that the delegate's frame pointer refers to lives on the GC-managed heap instead of the stack, so the captured variables stay valid.
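
To make the closure case concrete, here is a minimal sketch. The name `makeCounter` is mine, not from the spec. The returned delegate keeps using `count` after `makeCounter` has returned, and each call to `makeCounter` gets its own frame:

```d
import std.stdio;

// Hypothetical helper: returns a delegate that captures the local `count`.
// Because the delegate escapes, `count` must outlive this call.
int delegate() makeCounter()
{
    int count = 0;
    return () { return ++count; };
}

void main()
{
    auto a = makeCounter();
    auto b = makeCounter();

    writeln(a()); // 1
    writeln(a()); // 2, the state survived between calls
    writeln(b()); // 1, a separate call means a separate frame
}
```

According to [2], this kind of closure is one of the operations that may allocate GC memory.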


Consider this code:

import std.stdio;

void foo() {
    void delegate()[] funs;

    foreach(i; 0..5)
        funs ~= (){ writeln(i); };

    foreach(fun; funs)
        fun();
}

void bar() {
    void delegate()[] funs;

    foreach(i; 0..5)
    {
        int j = i;
        funs ~= (){ writeln(j); };
    }
    foreach(fun; funs)
        fun();
}


void delegate() baz() {
    int i = 1234;
    return (){ writeln(i); };
}

void overwrite() {
    int i = 5;
    writeln(i);
}

int main(string[] argv)
{
    foo();
    bar();

    auto fn = baz();
    overwrite();
    fn();

    return 0;
}

First, I run `foo`. The output is "4 4 4 4 4".
So I guess `i` is captured by reference, and the second loop in `foo` works because the stack hasn't unwound, and `i` hasn't been overwritten, and `i` contains the last value that was assigned to it.

`i` is accessed by each of the five delegates through its frame pointer, which (for all of them) points to foo's stack frame, where the value of `i` is 4 after the loop terminates.
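
A small sketch of that sharing, with the hypothetical name `sharedFrameDemo`: two delegates created in the same function call see the same variable, so a write through one is visible through the other.

```d
import std.stdio;

// Hypothetical demo: both delegates reach `x` through the frame of this call.
int sharedFrameDemo()
{
    int x = 1;
    void delegate() inc = () { ++x; };
    int delegate() get = () { return x; };

    // Every delegate exposes its two halves as properties:
    // .funcptr is the code address, .ptr is the context (frame) pointer.
    assert(inc.funcptr !is null && inc.ptr !is null);

    inc();
    inc();
    return get(); // reads the same `x` that `inc` modified
}

void main()
{
    writeln(sharedFrameDemo()); // 3
}
```

Nothing is copied when a delegate is created. Each delegate only stores where the frame is.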


Next I run `bar`. I get the same output of "4 4 4 4 4". While this hack works in C#, I suppose it's reasonable to assume the D compiler would just reuse stack space for `j`, and that the C# compiler has some special logic built in to handle this.

Yes, `j` exists only once in bar's stack frame, so the same thing happens as above: `j`'s value after the loop terminates is also 4.
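
If you want one value per delegate, a common approach is to create each delegate inside a separate function call, since every call has its own frame. A minimal sketch, assuming helper names `capture` and `collect` that I made up for illustration:

```d
import std.stdio;

// Hypothetical helper: `value` is a parameter, so every call has its own
// copy in its own frame, and the returned delegate captures that frame.
int delegate() capture(int value)
{
    return () { return value; };
}

// Builds five delegates and then calls them, like bar() does,
// but collects the results so they can be inspected.
int[] collect()
{
    int delegate()[] funs;
    foreach (i; 0 .. 5)
        funs ~= capture(i); // `i` is passed by value here, not captured

    int[] results;
    foreach (fun; funs)
        results ~= fun();
    return results;
}

void main()
{
    writeln(collect()); // [0, 1, 2, 3, 4]
}
```

The cost is one closure frame per delegate instead of one shared frame.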


Now, I test my conclusions above, and run `baz`, `overwrite` and `fn`. The result? total confusion. The output is "5" then "1234". So if the lambdas are referencing the stack, why wasn't 1234 overwritten?

This works as per spec:
Invoking baz() creates a delegate whose frame pointer refers to baz's frame. Because that delegate is returned, the frame is kept on the GC-managed heap rather than on the stack (the delegate would have an invalid frame pointer otherwise). overwrite is a normal function with its own stack frame, which it uses in its call to writeln. It does not interact with baz, or with the delegate returned by baz, in any way.

[...]

[1] https://dlang.org/spec/function.html#closures
[2] https://dlang.org/spec/garbage.html#op_involving_gc
