Unions destructors and GC precision

bearophile Tue, 14 Aug 2012 12:30:48 -0700

Before C++11 you weren't allowed to write something like:


union U {
    int x;
    std::vector<int> v;
} myu;


because v has a non-trivial destructor.

In C++11 they added "unrestricted unions", available in g++ since version 4.6:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf

D2 doesn't impose that restriction, and when a union goes out of scope it calls the destructors of all its fields:



import std.stdio;

struct Foo1 {
    ~this() { writeln("Foo1.dtor"); }
}
struct Foo2 {
    ~this() { writeln("Foo2.dtor"); }
}
struct Foo3 {
    ~this() { writeln("Foo3.dtor"); }
}

union U {
    Foo1 f1;
    Foo2 f2;
    Foo3 f3;
}

void main() {
    U u;
}


Output:

Foo3.dtor
Foo2.dtor
Foo1.dtor


It looks cute, but I think that's wrong, and it causes problems.

This program crashes, because the destructors of both f and b run and free the same pointer twice. Only u.f.__dtor() should be called, because a union holds just one of its fields at a time:



import std.stdio, core.stdc.stdlib;

struct Foo {
    int* p;

    ~this() {
        writeln("Foo.dtor");
        if (p) free(p);
    }
}

struct Bar {
    int* p;

    ~this() {
        writeln("Bar.dtor");
        if (p) free(p);
    }
}

union U {
    Foo f;
    Bar b;
    ~this() { writeln("U.dtor"); }
}

void main() {
    U u;
    u.f.p = cast(int*)malloc(10 * int.sizeof);
}

(This code can be fixed by adding a "p = null;" after the "if (p)" line in both Foo and Bar, but that is beside the point, because it means fixing the problem at the wrong level. What if I can't modify the source code of Foo and Bar?)

In general the compiler can't know which field's dtor to call. C++11 "solves" this problem by looking at the union: if one of its fields has a destructor, it disables the automatic creation of the constructor, destructor, copy and assignment methods of the union. So you have to write those methods manually.
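
To make the C++11 rule concrete, here is a minimal sketch. The names Tagged, Kind and elementCount are made up for this illustration and are not taken from the N2544 paper. Because std::vector<int> has a non-trivial destructor, the implicit destructor of the anonymous union (and so of the enclosing struct) is deleted, and the struct has to supply its own, using a tag to destroy only the active member:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tagged wrapper around a C++11 unrestricted union.
struct Tagged {
    enum class Kind { Int, Vec };
    Kind kind;

    union {
        int x;
        std::vector<int> v; // non-trivial member: the implicit destructor is deleted
    };

    // The constructors must be written by hand and pick the active member.
    explicit Tagged(int value) : kind(Kind::Int), x(value) {}
    explicit Tagged(std::vector<int> value) : kind(Kind::Vec), v(std::move(value)) {}

    // Copying is left out to keep the sketch short.
    Tagged(const Tagged&) = delete;
    Tagged& operator=(const Tagged&) = delete;

    // Destroy only the member that is currently active.
    ~Tagged() {
        if (kind == Kind::Vec)
            v.~vector();
    }
};

// Helper for the usage example: number of ints held (1 for the plain int case).
std::size_t elementCount(const Tagged& t) {
    return t.kind == Tagged::Kind::Vec ? t.v.size() : 1;
}
```

With this, "Tagged a(42);" and "Tagged b(std::vector<int>{1, 2, 3});" both compile, and only b runs the vector destructor when it goes out of scope. Removing ~Tagged() makes the code fail to compile, which is the point of the C++11 design.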

Why isn't D doing the same? It seems a simple idea. With that idea you are forced to write a destructor and opAssign, but only if one or more fields of the union have a destructor. If all union fields are simple, like an int or a float, then the compiler doesn't ask you to write the union dtor:



import std.stdio, core.stdc.stdlib;

struct Foo {
    int* p;

    ~this() {
        writeln("Foo.dtor");
        if (p) free(p);
    }
}

struct Bar {
    int* p;

    ~this() {
        writeln("Bar.dtor");
        if (p) free(p);
    }
}

struct Spam {
    bool isBar;

    union {
        Foo f;
        Bar b;

        ~this() {
            writeln("U.dtor ", isBar);
            if (isBar)
                b.__dtor();
            else
                f.__dtor();
        }
    }
}

void main() {
    Spam s;
    s.f.p = cast(int*)malloc(10 * int.sizeof);
}

If you don't have an easy-to-reach tag like isBar, then things become less easy. You probably have to call b.__dtor() or f.__dtor() manually:



import std.stdio, core.stdc.stdlib;

struct Foo {
    int* p;

    ~this() {
        writeln("Foo.dtor");
        if (p) free(p);
    }
}

struct Bar {
    int* p;

    ~this() {
        writeln("Bar.dtor");
        if (p) free(p);
    }
}

struct Spam {
    bool isBar;

    union {
        Foo f;
        Bar b;

        ~this() {} // empty
    }
}

void main() {
    Spam s;
    s.f.p = cast(int*)malloc(10 * int.sizeof);
    scope(exit) s.f.__dtor();
}
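
For comparison, this is roughly how the same "empty union destructor, manual cleanup" pattern looks in C++11. It is only a sketch: the dtorCalls counters and the useFoo function are made up here to make the behaviour observable, and new[]/delete[] stand in for malloc/free.

```cpp
#include <new>

// Hypothetical resource types that count how often their destructor runs.
struct Foo {
    static int dtorCalls;
    int* p = nullptr;
    ~Foo() {
        ++dtorCalls;
        delete[] p; // deleting a null pointer is a no-op
    }
};
int Foo::dtorCalls = 0;

struct Bar {
    static int dtorCalls;
    int* p = nullptr;
    ~Bar() {
        ++dtorCalls;
        delete[] p;
    }
};
int Bar::dtorCalls = 0;

union U {
    Foo f;
    Bar b;
    U() {}  // constructs no member
    ~U() {} // intentionally empty: the union can't know which member is live
};

// Activates f, uses it, and destroys it by hand before u goes out of scope.
void useFoo() {
    U u;
    new (&u.f) Foo();    // placement new starts the lifetime of f
    u.f.p = new int[10];
    u.f.~Foo();          // manual call, analogous to s.f.__dtor() in the D code
}
```

After one call to useFoo(), Foo's destructor has run exactly once and Bar's not at all, so the memory is released once. The price is that forgetting the manual call leaks silently.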

------------------------------

A related problem with unions is GC precision. We want a more precise GC, but unions reduce the precision, because the GC can't tell which field is currently stored in them.

To face this problem, some time ago I suggested adding a standard method named onMark() that is called at run time by the GC. It returns the positional number of the currently active union field. This means that during the mark phase the GC calls the union's onMark. In this example the union has just the f and b fields, so onMark has to return just 0 or 1:



class Spam {
    bool isBar;

    union {
        Foo f;
        Bar b;

        ~this() {
            writeln("U.dtor ", isBar);
            if (isBar)
                b.__dtor();
            else
                f.__dtor();
        }

        size_t onMark() {
            return isBar ? 1 : 0;
        }
    }
}

onMark() is required only if the union contains one or more fields that contain pointers.

I don't know if this idea is good enough (where to store the mark bits?).

Again, if a nice isBar tag is not easy to reach, things become more complex.


-------------------------------

Maybe there is a way to merge the two solutions, creating something simpler. In this design, instead of onMark, a method like activeField() is required that tells at run time which field of the union is currently "active". This method is called both by the GC at run time and when the union goes out of scope, to know which field's destructor to call:



struct Spam {
    bool isBar;

    union {
        Foo f;
        Bar b;

        size_t activeField(size_t delegate() callMe=null) {
            return isBar ? 1 : 0;
        }
    }
}

So with activeField there is no need to define the dtor of the union.
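
The idea of one query serving two consumers can be mimicked by hand in C++11. This is just a sketch under my own assumptions: switchToBar and tracedPointer are hypothetical names, and tracedPointer only stands in for what a precise collector would look at. No real GC is involved.

```cpp
#include <cstddef>
#include <new>

// Hypothetical field types; the counters make destructor calls observable.
struct Foo {
    static int dtorCalls;
    int* p = nullptr;
    ~Foo() { ++dtorCalls; }
};
struct Bar {
    static int dtorCalls;
    int* p = nullptr;
    ~Bar() { ++dtorCalls; }
};
int Foo::dtorCalls = 0;
int Bar::dtorCalls = 0;

struct Spam {
    bool isBar = false;
    union {
        Foo f;
        Bar b;
    };

    Spam() : f() {} // f is the initially active field

    Spam(const Spam&) = delete;
    Spam& operator=(const Spam&) = delete;

    // The single source of truth about which field is live.
    std::size_t activeField() const { return isBar ? 1 : 0; }

    // Ends the lifetime of f and starts the lifetime of b.
    void switchToBar() {
        if (isBar) return;
        f.~Foo();
        new (&b) Bar();
        isBar = true;
    }

    // Consumer 1: stand-in for what a precise collector would trace.
    const int* tracedPointer() const {
        return activeField() == 1 ? b.p : f.p;
    }

    // Consumer 2: destruction dispatches on the same query.
    ~Spam() {
        if (activeField() == 1)
            b.~Bar();
        else
            f.~Foo();
    }
};
```

Here the destructor has to be spelled out by the programmer. In the activeField design the compiler would generate that dispatch itself from the method's return value.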

Again there is a problem when a nice tag like isBar (or equivalent information) is not easy to reach.


Bye,
bearophile
