I think ranges are a great thing, having simplicity as one of their advantages. In the beginning they were indeed rather simple:

struct MyRange {
    bool empty();
    ref Widget front();
    void popFront();
}

with the appropriate refinements for bidirectional and random.

Then there was the need to distinguish one-pass, "input" ranges (e.g. files) from many-pass, "forward" ranges. So the "save" function got born for forward ranges and above:

struct MyRange {
    bool empty();
    ref Widget front();
    void popFront();
    MyRange save();
}

Then @property came about which required extra adornments:

struct MyRange {
    @property bool empty();
    @property ref Widget front();
    void popFront();
    @property MyRange save();
}

Then some ranges were unable to return ref, but they could receive assignment. I call these /sealed ranges/ because they "seal" the address of their elements making it inaccessible:

struct MyRange {
    @property bool empty();
    @property Widget front();
    @property void front(Widget);
    void popFront();
    @property MyRange save();
}

This made bidirectional and random-access range interfaces quite big because now we're talking about back() (two versions), opIndex(), opIndexAssign() and opIndexOpAssign().

Until now I think we're solid on design decisions. The discussion starts here.

And then there was this nagging thing which is well-understood in the C++ world. A sealed range returns by value and accepts by value. But in C++, an object's copy constructor is arbitrarily expensive. The library must be designed in accordance to that and carefully navigate around that issue.

For example, swap is a fundamental primitive used in many algorithms. It should swap objects at constant cost. Indeed, for references, swap is easy to implement:

void swap(T)(ref T lhs, ref T rhs)
{
    assert(!pointsTo(lhs, rhs) && !pointsTo(rhs, lhs)
        && !pointsTo(lhs, lhs) && !pointsTo(rhs, rhs));
    T tmp = move(lhs);
    lhs = move(rhs);
    rhs = move(tmp);
}

or similar. However, a sealed range does not offer references, so trying e.g.

swap(r1.front, r2.front);

will not work. This is a problem.

To solve that problem, I introduced moveFront(), moveBack(), and moveAt(size_t), all of which destructively read the front, back, or an indexed element respectively off the range. Then you can swap r1.front with r2.front at constant cost like this:

    T tmp = r1.moveFront();
    r1.front = r2.moveFront();
    r2.front = move(tmp);

All of this works and is rock-solid, but it does load the range interface considerably. To a newcomer coming without the background above, a full-fledged range definition may look quite loaded.

One simplification is to simply decree that Phobos (and D in general) shuns objects with eager copy. Any this(this) could be considered constant cost. That would have two consequences:

1. It would simplify all ranges and many algorithms because there's no more need for moveXxx and swapping can be done the old-fashioned way (with assignments in inline code):

    T tmp = r1.front;
    r1.front = r2.front;
    r2.front = tmp;

In general many things become considerably easier if you can simply assume that copying objects around won't be part of your complexity requirements or the practical costs of your algorithm.

2. It would force certain types (such as BigInt) that allocate resources and have value semantics to resort to reference counting.

3. It would give more legitimacy to sealed objects that return data by value (as opposed to reference). I believe sealed objects are very important for safety.

4. It would be a definite departure from C++, where all value copies are considered of arbitrary cost. This would provide a convenient straw-man for naysayers (e.g. "Hey, D calls the copy constructor even more often than C++! No thanks, I'll stick with C++0x which solves it all with rvalue references").

5. It would make things look and feel familiar to people coming from any other languages than C++, who seldom need to define a costly this(this) at all.

Please discuss.


Andrei

Reply via email to