Re: Short list with things to finish for D2

Andrei Alexandrescu Thu, 19 Nov 2009 08:00:37 -0800

Steven Schveighoffer wrote:

On Wed, 18 Nov 2009 18:14:08 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:
We're entering the finale of D2 and I want to keep a short list of things that must be done and integrated in the release. It is clearly understood by all of us that there are many things that could and probably should be done.
1. Currently Walter and Don are diligently fixing the problems marked on the current manuscript.
2. User-defined operators must be revamped. Fortunately Don already put in an important piece of functionality (opDollar). What we're looking at is a two-pronged attack motivated by Don's proposal:
http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7

The two prongs are:
* Encode operators by compile-time strings. For example, instead of the plethora of opAdd, opMul, ..., we'd have this:
T opBinary(string op)(T rhs) { ... }
The string is "+", "*", etc. We need to design what happens with read-modify-write operators like "+=" (should they be dispatched to a different function? etc.) and also what happens with index-and-modify operators like "[]=", "[]+=" etc. Should we go with proxies? Absorb them in opBinary? Define another dedicated method? etc.
I don't like this. The only useful thing I can see is if you wanted to write less code to do an operation on a wrapper aggregate, such as an array, where you could define all binary operations with a single mixin.
Other than that, it munges together all binary operations into a single function, even though all those operations are different. It:
1) prevents code separation for things that are considered separately

(I'll retort inline for each point.) That's quite exactly the opposite of what my experience with C++ and D operator overloading suggests: most of the time (a) I need to overload operators in large groups, (b) I need to do virtually the same actions for each operator in a group.

Note that with opBinary you have unprecedented flexibility on how you want to group operators. Consider:


struct A {
    A opBinary(string op)(A rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/" ||
            op == "^^")
    {
        ...
    }
    A opBinary(string op)(A rhs) if (op == "~")
    {
        ...
    }
    ...
}

So anyway I contend that your argument is not correct. The "if" clause allows you to separate code for things that are considered separately. So essentially you can do things with one function per operator if you so wanted. Correct?
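A minimal sketch of the grouping described above, assuming the string-based opBinary scheme works as proposed (and as it was later adopted in D2). The type Vec2 and its fields are hypothetical, made up for illustration. One constrained template covers the whole arithmetic group, and the operator string is pasted into the expression with a string mixin:

```d
// Hypothetical value type, for illustration only.
struct Vec2 {
    double x, y;

    // One body serves the whole arithmetic group. The template
    // constraint keeps every other operator (e.g. "~") out.
    Vec2 opBinary(string op)(Vec2 rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        // The operator string is spliced into the expression at compile time.
        return Vec2(mixin("x " ~ op ~ " rhs.x"),
                    mixin("y " ~ op ~ " rhs.y"));
    }
}

void main() {
    auto a = Vec2(1, 2);
    auto b = Vec2(3, 5);
    assert(a + b == Vec2(4, 7));
    assert(b - a == Vec2(2, 3));
    assert(a * b == Vec2(3, 10));
    // No overload accepts "~", so catenation is rejected at compile time.
    static assert(!__traits(compiles, a ~ b));
}
```

A second opBinary overload with a different constraint (say, op == "~") could sit next to this one, which is the "one function per group" separation the reply argues for.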

2) makes operators non-virtual, which can be solved by a thunk, but that seems like a lot of boilerplate code that will just cause bloat

Bloat of source or bloat of binary code? I don't know about the latter, but the former is actually nothing to worry about - it's easier to define an interface or a mixin to convert from the proposed approach to the old approach, than vice versa.

3) If you derive from a class that implements an operator, and you want to make that operator virtual, it will be impossible

It means that the base class didn't mean for the operator to be overridable. If they wanted to make it configurable, they would have forwarded the operator to a virtual function.
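A rough sketch of what "forwarding the operator to a virtual function" could look like, again assuming the proposed opBinary scheme. The classes Number and Saturating and the method name add are hypothetical. The template is only a thin thunk; the ordinary (virtual by default) method is what derived classes override:

```d
// Hypothetical base class that opts in to overridable addition.
class Number {
    int value;
    this(int v) { value = v; }

    // Template methods are never virtual, so this only forwards.
    Number opBinary(string op)(Number rhs) if (op == "+") {
        return add(rhs);
    }

    // The overridable part: class methods are virtual by default in D.
    Number add(Number rhs) { return new Number(value + rhs.value); }
}

// Hypothetical derived class that clamps sums at 100.
class Saturating : Number {
    this(int v) { super(v); }

    override Number add(Number rhs) {
        int s = value + rhs.value;
        return new Saturating(s > 100 ? 100 : s);
    }
}

void main() {
    Number a = new Saturating(80);
    Number b = new Number(50);
    assert((a + b).value == 100); // dispatches to Saturating.add
    assert((b + a).value == 130); // dispatches to Number.add
}
```

If the base class omits the virtual method and implements the operator directly in the template, derived classes have no hook to override, which is the situation point 3 describes.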

4) auto-generated documentation is going to *really* suck


Agreed.

5) you can't define operators on interfaces, or if you do, it looks ridiculous (a thunk function that dispatches to the virtual methods).


interface Ridiculous {
    // Final functions in interfaces are allowed per TDPL
    Ridiculous opBinary(string op)(Ridiculous rhs) if (op == "+") {
        return opAdd(rhs);
    }
    // Implement this
    Ridiculous opAdd(Ridiculous);
}

You can group things as you wish and combine virtual calls with string comparisons if that helps:


interface Ridiculous {
    // Final functions in interfaces are allowed per TDPL
    Ridiculous opBinary(string op)(Ridiculous rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/" ||
            op == "^^")
    {
        return opArith(op, rhs);
    }
    // Implement this
    Ridiculous opArith(string, Ridiculous);
}

6) implementing a new operator in a derived class is virtually impossible (no pun intended).


class Base {
    Base opBinary(string op)(Base rhs) if (op == "+") {
        ...
    }
}

class Derived : Base {
    Derived opBinary(string op)(Derived rhs) if (op == "-") {
        ...
    }
}

When you do so, you retain the advantage of grouping operators together (I think it's most likely that Base defines operators of one kind e.g. arithmetic and Derived defines operators of a different kind e.g. logic or catenation). Add thunking as you need and you're good to go.

I imagine that dcollections for example will be *very* hard to write with this change.


I hope my arguments above convinced you to the contrary.

Seems like you are trying to solve a very focused problem without looking at the new problems your solution will cause outside that domain.

You are correct in that I'm trying to smooth things primarily for structs. But I'll say that the templated approach is no slouch and can accommodate classes with virtual functions very capably, even though it is a bit more work than before.

One question is whether it's more common to overload operators for structs vs. classes. I imagine dcollections defines catenation and slicing, but not the bulk of operators. But the vast majority of operator overloading applications are with value types as far as I can tell.

Can we do something like how opApply/ranges resolves? I.e. the compiler tries doing opAdd or opMul or whatever, and if that doesn't exist, tries opBinary("+").


I wouldn't want to have too many layers that do essentially the same thing.

3. It was mentioned in this group that if getopt() does not work in SafeD, then SafeD may as well pack and go home. I agree. We need to make it work. Three ideas discussed with Walter:
* Allow taking addresses of locals, but in that case switch allocation from stack to heap, just like with delegates. If we only do that in SafeD, behavior will be different than with regular D. In any case, it's an inefficient proposition, particularly for getopt() which actually does not need to escape the addresses - just fills them up.
Perhaps, but getopt is probably not the poster child for optimizing performance -- you most likely call it once, so changing that single application to use heap data isn't going to make a difference.


I agree. My fear is that getopt is only an example of a class of functions.

* Allow @trusted (and maybe even @safe) functions to receive addresses of locals. Statically check that they never escape an address of a parameter. I think this is very interesting because it enlarges the common ground of D and SafeD.
I think allowing calling @trusted or @safe functions with addresses to locals is no good for @safe functions (i.e. a @safe function calls a @trusted function with an address to a local without heap-allocating). Remember the "returning a parameter array" problem...

I've been thinking more of examples where you pass a pointer to a @trusted or @safe function and that function escapes the pointer. I couldn't find an example. So maybe allowing that is a good solution.


How would returning a parameter array break things?

* Figure out a way to reconcile "ref" with variadics. This is the actual reason why getopt chose to traffic in addresses, and fixing it is the logical choice and my personal favorite.
This sounds like the best choice.

Well it's not that simple. As I explained in a different post, getopt takes (string, pointer, string, pointer, string, pointer, ...). Now we need to make it take references instead of pointers, but the strings should stay values. We can't express a checkered constraint like that.
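For reference, a small sketch of the alternating (string, pointer) call shape described above, using std.getopt. The option names and the argument array are made up for illustration. The call takes the addresses of locals, which is exactly what SafeD would reject:

```d
import std.getopt;

void main() {
    // Hypothetical command line; element 0 is the program name.
    string[] args = ["prog", "--count=3", "--name=d2"];

    int count;
    string name;

    // Alternating pairs: option string by value, target by pointer.
    getopt(args, "count", &count, "name", &name);

    assert(count == 3);
    assert(name == "d2");
}
```

getopt only writes through the pointers and does not keep them, which is why heap-allocating the locals would be wasted work here.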

Incidentally there's a theory for allowing that; it's called "regular types", inspired by regular grammars. With a regular type you can define the getopt signature as one or more pairs of string and ref. (Unfortunately C++ defined regular types differently, which makes things difficult to search.) Anyhow, I don't think such an approach would help D - it's too complicated.

6. There must be many things I forgot to mention, or that cause grief to many of us. Please add to/comment on this list.
I know it's not part of the spec, but I'm not sure if you mention the array "data stomping" problem in the book. If not, the MRU cache needs to be implemented.

Yes, it will be because the book has a few failing unittests. In fact, I was hoping I could talk you or David into doing it :o).



Andrei
