On Wed, 21 Jul 2010 03:58:33 +0200, bearophile <bearophileh...@lycos.com> wrote:

Andrei Alexandrescu:

emplace(), defined in std.conv, is relatively new. I haven't yet added
emplace() for class objects, and this is as good an opportunity as any:
http://www.dsource.org/projects/phobos/changeset/1752

Thank you, I have used this, and I have since run a few tests too.

The "scope" for class instantiations can be deprecated once there is an acceptable alternative. You can't deprecate features before you have found a good enough alternative.

---------------------

A first problem is the syntax: to allocate an object on the stack you need something like:

// is testbuf correctly aligned?
ubyte[__traits(classInstanceSize, Test)] testbuf = void;
Test t = emplace!(Test)(cast(void[])testbuf, arg1, arg2);


That is much worse-looking, hairier and more error-prone than:
scope Test t = new Test(arg1, arg2);


I have tried to build a helper to improve the situation, something that looks like:
Test t = StackAlloc!(Test, arg1, arg2);

Failing that, my second try was this, which is not good enough:
mixin(stackAlloc!(Test, Test)("t", "arg1, arg2"));
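
For comparison, later Phobos releases ship std.typecons.scoped, which reads much like the helper wished for above. The sketch below assumes a current compiler and Phobos (nothing here claims it was available when this was written); sumOnStack is a hypothetical function used only to exercise the idiom.

```d
import std.typecons : scoped;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

// Hypothetical helper, present only to demonstrate the idiom.
int sumOnStack(int a, int b) {
    // 'auto' is required: the object lives inside the wrapper struct
    // that scoped returns, so the wrapper itself must stay in scope.
    auto t = scoped!Test(a, b);
    return t.x + t.y; // members are reached through alias this
}   // the wrapper's destructor finalizes the Test instance here

void main() {
    assert(sumOnStack(1, 2) == 3);
}
```

Declaring the variable as Test instead of auto would keep only a class reference and let the wrapper die immediately, so the escape hazard discussed in this thread does not go away entirely.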

---------------------

A second problem is that this program compiles with no errors:

import std.conv: emplace;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

Test foo(int x, int y) {
    ubyte[__traits(classInstanceSize, Test)] testbuf = void;
    Test t = emplace!(Test)(cast(void[])testbuf, x, y);
    return t;
}

void main() {
    foo(1, 2);
}



While the following one gives:
test.d(13): Error: escaping reference to scope local t


import std.conv: emplace;

final class Test {
    int x, y;
    this(int xx, int yy) {
        this.x = xx;
        this.y = yy;
    }
}

Test foo(int x, int y) {
    scope t = new Test(x, y);
    return t;
}

void main() {
    foo(1, 2);
}


So the compiler is aware that the scoped object can't escape, whereas with emplace the code becomes more bug-prone. "scope" can cause other bugs (some time ago I filed a bug report about one such problem), but it prevents the most common one. (I am not sure emplace solves that problem with scope; I think it shares the same problem and adds new ones.)

---------------------

A third problem is that the destructor doesn't get called:


import std.conv: emplace;
import std.c.stdio: puts;

final class Test {
    this() {
    }
    ~this() { puts("killed"); }
}

void main() {
    ubyte[__traits(classInstanceSize, Test)] testbuf = void;
    Test t = emplace!(Test)(cast(void[])testbuf);
}


That prints nothing. Using scope, the destructor gets called (even if it's not present!).
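
One possible workaround, sketched under assumptions: emplace only constructs the object, so the destructor has to be requested by hand, for example with scope(exit). The sketch assumes a current compiler, where the finalizing function is object.destroy (it had a different name in 2010-era druntime). The counter, the useOnStack function and the size_t-based buffer are illustrative choices, not something taken from the test above.

```d
import std.conv : emplace;

int destroyed; // counts destructor runs, for illustration only

final class Test {
    this() {}
    ~this() { ++destroyed; }
}

void useOnStack() {
    // A size_t array rather than ubyte[], so the buffer is at least
    // pointer-aligned (an assumption about what Test requires).
    enum words = (__traits(classInstanceSize, Test) + size_t.sizeof - 1)
                 / size_t.sizeof;
    size_t[words] buf = void;
    Test t = emplace!Test(cast(void[]) buf[]);
    // emplace only constructs; nothing runs ~this() automatically,
    // so ask for it explicitly when the scope ends.
    scope (exit) destroy(t);
}

void main() {
    useOnStack();
    assert(destroyed == 1);
}
```

This restores the destructor call, but it is one more line that is easy to forget, which is the point being made about the syntax.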

---------------------

The next point is not a problem of emplace() itself, but of the dmd optimizer.
I have run a few performance tests too. I have used this basic pseudocode:

while (i < Max)
{
   create testObject(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   testObject.doSomething(i, i, i, i, i, i)
   destroy testObject
   i++
}


Coming from here:
http://www.drdobbs.com/java/184401976
And its old timings:
http://www.ddj.com/java/184401976?pgno=9


The Java version of the code is simple:

final class Obj {
    int i1, i2, i3, i4, i5, i6;

    Obj(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
        this.i1 = ii1;
        this.i2 = ii2;
        this.i3 = ii3;
        this.i4 = ii4;
        this.i5 = ii5;
        this.i6 = ii6;
    }

    void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
    }
}

class Test {
    public static void main(String args[]) {
        final int N = 100_000_000;
        int i = 0;
        while (i < N) {
            Obj testObject = new Obj(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            testObject.doSomething(i, i, i, i, i, i);
            // testObject = null; // makes no difference
            i++;
        }
    }
}



This is a D version that uses emplace() (if you don't use emplace here the performance of the D code is very bad compared to the Java one):

// program #1
import std.conv: emplace;

final class Test { // 32 bytes each instance
    int i1, i2, i3, i4, i5, i6;
    this(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
        this.i1 = ii1;
        this.i2 = ii2;
        this.i3 = ii3;
        this.i4 = ii4;
        this.i5 = ii5;
        this.i6 = ii6;
    }
    void doSomething(int ii1, int ii2, int ii3, int ii4, int ii5, int ii6) {
    }
}

void main() {
    enum int N = 100_000_000;

    int i;
    while (i < N) {
        ubyte[__traits(classInstanceSize, Test)] buf = void;
        Test testObject = emplace!(Test)(cast(void[])buf, i, i, i, i, i, i);
        // Test testObject = new Test(i, i, i, i, i, i);
        // scope Test testObject = new Test(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject.doSomething(i, i, i, i, i, i);
        testObject = null;
        i++;
    }
}


The Java code (server) runs in about 0.25 seconds here.
The D code (which doesn't do heap allocations at all) runs in about 3.60 seconds.

With a few experiments I have seen that emplace() doesn't get inlined, and the cause is that it contains enforce(). enforce() contains a throw, and it seems dmd doesn't inline functions that can throw; you can check this with a little test program like this:


import std.c.stdlib: atoi;
void foo(int b) {
    if (b)
        throw new Throwable(null);
}
void main() {
    int b = atoi("0");
    foo(b);
}


So if you comment out the two enforce() calls inside emplace(), dmd inlines emplace() and the running time becomes about 2.30 seconds, a bit less than ten times slower than Java.

If emplace() doesn't contain calls to enforce() then the loop in main() becomes (dmd 2.047, optimized build):


L1A:            push    dword ptr 02Ch[ESP]
                mov     EDX,_D10test6_good4Test7__ClassZ[0Ch]
                mov     EAX,_D10test6_good4Test7__ClassZ[08h]
                push    EDX
                push    ESI
                call    near ptr _memcpy
                mov     ECX,03Ch[ESP]
                mov     8[ECX],EBX
                mov     0Ch[ECX],EBX
                mov     010h[ECX],EBX
                mov     014h[ECX],EBX
                mov     018h[ECX],EBX
                mov     01Ch[ECX],EBX
                inc     EBX
                add     ESP,0Ch
                cmp     EBX,05F5E100h
                jb      L1A


(The memcpy is done by emplace to initialize the object before calling its ctor. The initialization must be performed because the object needs the pointer to the virtual table and the monitor. The monitor here was null. I think a future LDC2 can optimize away more stuff in that loop, so it's not so bad).
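
To make those two steps concrete, here is a rough sketch of what a class emplace has to do: copy the class's init image into the buffer, then run the constructor. emplaceSketch is a hypothetical re-implementation written for illustration and is not the Phobos source; it assumes a current druntime where TypeInfo exposes initializer, and it skips the alignment checks a real implementation would want.

```d
import core.stdc.string : memcpy;

final class Test {
    int a, b;
    this(int aa, int bb) { a = aa; b = bb; }
}

// Hypothetical re-implementation for illustration; not the Phobos source.
T emplaceSketch(T, Args...)(void[] chunk, Args args) if (is(T == class)) {
    enum size = __traits(classInstanceSize, T);
    assert(chunk.length >= size, "buffer too small");
    // 1. Copy the class's init image: vtable pointer, null monitor,
    //    and the default values of all fields.
    const(void)[] image = typeid(T).initializer;
    memcpy(chunk.ptr, image.ptr, size);
    // 2. Reinterpret the buffer as an object and run the constructor.
    T obj = cast(T) chunk.ptr;
    obj.__ctor(args);
    return obj;
}

void main() {
    enum words = (__traits(classInstanceSize, Test) + size_t.sizeof - 1)
                 / size_t.sizeof;
    size_t[words] buf = void; // pointer-aligned stack storage
    Test t = emplaceSketch!Test(cast(void[]) buf[], 3, 4);
    assert(t.a == 3 && t.b == 4);
}
```

Step 1 corresponds to the memcpy call in the listing above, and step 2 to the run of six stores that follows it once the constructor has been inlined.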


If you use this in program #1:
scope Test testObject = new Test(i, i, i, i, i, i);
It runs in about 6 seconds (partly because the destructor is called even though it's missing).

If in program #1 you use just new, without scope, the runtime is about 27.2 seconds, about 110 times slower than Java.

Bye,
bearophile

Takes 18m27.720s in PHP :)
