A while ago Kenji posted this excellent DIP (http://wiki.dlang.org/DIP32) that aimed to improve tuple syntax and described several cases in which tuples could be destructured. You can see his original thread here: http://forum.dlang.org/thread/mailman.372.1364547485.4724.digitalmar...@puremagic.com, and further discussion in this thread: http://forum.dlang.org/thread/dofwinzpbcdwkvhzc...@forum.dlang.org.

It seemed that there was a lot of interest in having syntax somewhat like what is described in Kenji's DIP, but it didn't really go anywhere. There is this pull request on GitHub (https://github.com/D-Programming-Language/dmd/pull/341), but it uses the (a, b) syntax, which has too much overlap with other language constructs. Andrei/Walter didn't want to merge that pull request without a full consideration of the different design issues involved, which in retrospect was a good decision.

That said, I'd like to open the discussion on tuple syntax yet again. Tuples are currently sorely underused in D, due in large part to being difficult to understand and awkward to use. One large barrier to entry is the fact that D has not 1, not 2, but 3 different types of tuples (depending on how you look at it), which are difficult to keep straight.

There is std.typecons.Tuple, which is fundamentally different from std.typetuple.TypeTuple in that it's implemented as a struct, while TypeTuple is just a template wrapped around the compiler tuple type. ExpressionTuples are really just TypeTuples that contain only values, and aren't mentioned anywhere except in this article: http://dlang.org/tuple.html, which frankly creates more confusion than clarity.
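To make the distinction concrete, here is a minimal sketch of the three kinds side by side. It assumes a Phobos recent enough to have std.meta.AliasSeq, which is the name TypeTuple was later given; the names Types, values and vars are just mine for illustration.

```d
import std.meta : AliasSeq;          // AliasSeq is the later Phobos name for TypeTuple
import std.typecons : Tuple, tuple;

// A "type tuple": a compile-time sequence of types.
alias Types = AliasSeq!(int, string);

// An "expression tuple": the same compiler construct, holding only values.
alias values = AliasSeq!(1, "a");

void main()
{
    // 1. std.typecons.Tuple is an ordinary struct value that exists at run time.
    Tuple!(int, string) t = tuple(1, "a");
    static assert(is(typeof(t) == struct));
    assert(t[0] == 1 && t[1] == "a");

    // 2. and 3. The compiler sequences are inspected at compile time.
    static assert(Types.length == 2 && is(Types[1] == string));
    static assert(values[0] == 1 && values[1] == "a");

    // A sequence of types can declare a group of variables in one go.
    Types vars;
    vars[0] = 1;
    vars[1] = "a";
    assert(vars[0] == t[0] && vars[1] == t[1]);
}
```

Note that only the first of these is a value you can return from a function or store in an array, which is a big part of why the three are so easy to confuse.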

A good, comprehensive design has the potential to make tuples easy to use and understand, and hopefully clear up the unpleasant situation we have currently. A summary of what has been discussed so far:

- (a, b) is the prettiest syntax, and it is also completely infeasible

- {a, b} is not as pretty, but it's not that bad of an alternative (though it may still have issues as well)

- #(a, b) is unambiguous and would probably be the easiest option. I don't think it looks too bad, but some people might find it ugly and noisy

- How should tuples be expanded? There is the precedent of the expand member of std.typecons.Tuple, but Kenji liked tup[] (slicing syntax). So with a tuple tup = #(1, "a", 0.0), tup[0..2] would be an expanded tuple containing 1 and "a". On the other hand, Bearophile and Timon Gehr preferred that slicing a tuple create another "closed" tuple, with expand reserved for expansion. So tup[] would create a copy of the tuple, and tup[0..2] would create a closed tuple equivalent to #(1, "a"). I don't have any particular preference in that regard.

- Timon Gehr wanted the ability to swap tuple values, so #(x, y) = #(y, x) would be allowed. Kenji was against it, saying that it would introduce too many complications.

- There was no consensus on the pattern matching syntax for unpacking. For example, #(a, _) = #(1, 2) only introduces one binding, "a", into the surrounding scope. The question is, what character should go in the place of "_" to signify that a value should not be bound? Some suggestions were #(a, $), #(a, @), #(a, ?). I personally think #(a, ?) or #(a, *) would be best, but all that's really necessary is a symbol that cannot also be an identifier.
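For the expansion question above, here is a rough sketch of what the two behaviors look like with today's std.typecons.Tuple, without any new syntax. The describe function is a hypothetical helper of mine; expand and slice are existing members of Tuple.

```d
import std.conv : to;
import std.typecons : tuple;

// Hypothetical helper that takes two separate arguments, not a tuple.
string describe(int n, string s)
{
    return n.to!string ~ s;
}

void main()
{
    auto tup = tuple(1, "a", 0.0);

    // "Expanded" result: the first two fields become two separate arguments.
    assert(describe(tup.expand[0 .. 2]) == "1a");

    // "Closed" result: slice!(0, 2) yields another Tuple holding 1 and "a".
    auto closed = tup.slice!(0, 2);
    assert(closed == tuple(1, "a"));

    // A closed tuple has to be expanded again before it can be passed along.
    assert(describe(closed.expand) == "1a");
}
```

So the disagreement is really about which of these two results the short tup[0..2] spelling should produce.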

Also up for debate were nested patterns, e.g., #(1, 2, #(3, 4, #(5, 6))). I don't think there was a consensus on unpacking and pattern matching for this situation. One idea I saw that looked good:

* Use "..." to pattern match on the tail of an expression, so take the above tuple. The pattern #(1, ?, ...) would match the two nested sub-tuples. Or, say, #(1, 2, 3) could be matched by #(1, 2, 3), #(1, ?, 3), #(1, ...), etc. You obviously can't refer to "..." as a variable, so it also becomes a useful way of saying "don't care" for multiple items, e.g., #(a, ...) -> only bind the first item in the tuple. We can play around with this to get a few other useful constructs, such as #(a, ..., b) -> match first and last, #(..., b) -> match last, etc.

Assuming the "..." syntax for unpacking, it would be useful to name the captured tail. For example, you could unpack #(1, 3, #(4, 6)) into #(a, b, x...), where a = 1, b = 3, x = #(4, 6). Similarly, #(head, rest...) results in head = 1, rest = #(3, #(4, 6)). I think this would be very useful.
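For comparison, this is roughly what the head/rest unpacking takes by hand with std.typecons.Tuple right now. It's only a sketch of the intended result, not of the proposed syntax, and splitHead is a hypothetical helper of mine.

```d
import std.typecons : isTuple, tuple;

// Hypothetical helper: returns tuple(head, rest), where rest is a closed
// tuple of every field after the first.
auto splitHead(T)(T t)
    if (isTuple!T && T.Types.length >= 1)
{
    return tuple(t[0], tuple(t.expand[1 .. T.Types.length]));
}

void main()
{
    auto t = tuple(1, 3, tuple(4, 6));

    // What #(head, rest...) = t would bind, spelled out manually.
    auto parts = splitHead(t);
    auto head = parts[0];
    auto rest = parts[1];

    assert(head == 1);
    assert(rest == tuple(3, tuple(4, 6)));
}
```

Each binding needs its own statement and the indices have to be kept in sync by hand, which is exactly the noise a pattern syntax would remove.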

- Concatenating tuples with ~. This is nice to have, but not particularly important.
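As a point of reference for the concatenation item, the same effect can already be had by expanding both tuples into a new one. This is a minimal sketch; concat is a hypothetical helper name, not something in Phobos.

```d
import std.typecons : isTuple, tuple;

// Hypothetical helper: builds a new tuple from the fields of a followed by
// the fields of b, which is what a ~ b would presumably mean.
auto concat(A, B)(A a, B b)
    if (isTuple!A && isTuple!B)
{
    return tuple(a.expand, b.expand);
}

void main()
{
    auto joined = concat(tuple(1, "a"), tuple(0.5, true));
    assert(joined == tuple(1, "a", 0.5, true));
}
```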

One thing that I think was overlooked, but would be pretty cool, is that a tuple unpacking/pattern matching syntax would allow us to unpack/pattern match just about anything that you can make a tuple of in D. Combine this with the .tupleof property, and things get interesting... Maybe. There is one possible problem: .tupleof returns a TypeTuple, and it's not at all clear to me how, if at all, TypeTuple would work with the proposed syntax. Is #(int, string, bool) a valid tuple instantiation? This is something that needs to be worked out.
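To show what I mean with .tupleof, here is a small sketch of how a struct's fields can be turned into a std.typecons.Tuple today. The Point struct is made up for illustration; with an unpacking syntax, the last step could bind the fields to names directly.

```d
import std.typecons : tuple;

// Made-up struct, purely for illustration.
struct Point
{
    int x;
    string label;
    bool visible;
}

void main()
{
    auto p = Point(3, "origin", true);

    // .tupleof gives the fields as a compiler sequence; tuple() closes it
    // into a struct-based Tuple that can be passed around.
    auto t = tuple(p.tupleof);
    assert(t == tuple(3, "origin", true));

    // The sequence elements keep their field types and are assignable.
    static assert(is(typeof(p.tupleof[1]) == string));
    p.tupleof[0] = 7;
    assert(p.x == 7);
}
```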

This is the third or fourth time that I know of that tuple syntax has come up, and as of yet, nothing has been done about it. I'd really like to get the ball rolling on this, as I think a good syntax for these tuple operations would do D a world of good. I'm not a compiler hacker, unfortunately, so I can't implement it myself as proof of concept... However, I hope that discussing it and working out all the kinks will help pave the way for an actual implementation.
