groupBy predicates: unary, binary, or both?

Andrei Alexandrescu Sun, 19 Jan 2014 14:27:41 -0800

I'm working on https://github.com/D-Programming-Language/phobos/pull/1186, which is somewhat important; "group by" is a powerful operation. Along the way, I stumbled upon an interesting issue on which I wanted to consult the community.

Usually groupBy is used to find runs of equivalent elements in a range. For example:


[1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 6].groupBy()

yields the range [[1, 1, 1], [2, 2], [3, 3, 3], [4], [5, 5], [6]].

As is usual with Phobos algorithms, groupBy accepts a predicate. The default (as illustrated above) is "a == b", i.e. all elements in a group are equal to one another.

Equality is transitive and commutative. But there are useful cases in which predicates are not commutative. Suppose we want to find strictly monotonic subranges in a range. We'd write:


auto r = [1, 3, 2, 4, 5, 1].groupBy!"a < b";

That should produce [[1, 3], [2, 4, 5], [1]]. For non-strict monotonic runs, the predicate would be "a <= b" etc. All that is pretty awesome.

However, that makes life a bit tougher for the algorithm: it must compare adjacent elements only. In the case of "a == b", it suffices to save the first element in a group and compare it against every other element in the group.
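To make the difference concrete, here is a minimal eager sketch of the two strategies. The names groupAdjacent and groupByFirst are hypothetical and are not taken from either pull request; a real Phobos implementation would be a lazy range rather than a function returning an array of slices.

```d
import std.functional : binaryFun;

// Hypothetical helper: a new group starts whenever pred(previous, current) fails.
T[][] groupAdjacent(alias pred = "a == b", T)(T[] input)
{
    T[][] groups;
    size_t start = 0;
    foreach (i; 1 .. input.length)
    {
        if (!binaryFun!pred(input[i - 1], input[i]))
        {
            groups ~= input[start .. i];
            start = i;
        }
    }
    if (input.length)
        groups ~= input[start .. $];
    return groups;
}

// Hypothetical helper: every element is compared against the first element of its group.
T[][] groupByFirst(alias pred = "a == b", T)(T[] input)
{
    T[][] groups;
    size_t start = 0;
    foreach (i; 1 .. input.length)
    {
        if (!binaryFun!pred(input[start], input[i]))
        {
            groups ~= input[start .. i];
            start = i;
        }
    }
    if (input.length)
        groups ~= input[start .. $];
    return groups;
}

unittest
{
    auto r = [1, 3, 2, 4, 5, 1];
    // Adjacent comparison finds the strictly increasing runs.
    assert(groupAdjacent!"a < b"(r) == [[1, 3], [2, 4, 5], [1]]);
    // Comparing against the first element does not: 1 < 2, 1 < 4 and 1 < 5 all hold.
    assert(groupByFirst!"a < b"(r) == [[1, 3, 2, 4, 5], [1]]);
    // With the default "a == b" both strategies agree.
    assert(groupAdjacent([1, 1, 2]) == groupByFirst([1, 1, 2]));
}
```

The two helpers differ only in which element sits on the left of the predicate, which is exactly what separates the equality case from the general binary case.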

Meanwhile, a very similar pull request (https://github.com/D-Programming-Language/phobos/pull/1453) uses unary predicates, i.e. an optional transformation function that is then used in conjunction with "==" to decide which elements belong in the same group.

Unary predicates make life simpler for the algorithm (save the transform of the first element, then compare it against the transform of the next etc.) and are often easier to write by the end user, too (e.g. just write "a.length" instead of "a.length == b.length" to group by length).
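A rough sketch of the unary approach, again eager and for illustration only. The name groupByKey is hypothetical and does not reflect the code in pull request 1453.

```d
import std.functional : unaryFun;

// Hypothetical helper: consecutive elements whose keys compare equal with == share a group.
T[][] groupByKey(alias key, T)(T[] input)
{
    T[][] groups;
    if (input.length == 0)
        return groups;

    size_t start = 0;
    auto firstKey = unaryFun!key(input[0]); // saved transform of the group's first element
    foreach (i; 1 .. input.length)
    {
        auto k = unaryFun!key(input[i]);
        if (k != firstKey)
        {
            groups ~= input[start .. i];
            start = i;
            firstKey = k; // the next group starts here, so its key is saved
        }
    }
    groups ~= input[start .. $];
    return groups;
}

unittest
{
    auto words = ["a", "b", "cc", "dd", "e"];
    // String lambda and function lambda give the same grouping by length.
    assert(groupByKey!"a.length"(words) == [["a", "b"], ["cc", "dd"], ["e"]]);
    assert(groupByKey!(s => s.length)(words) == [["a", "b"], ["cc", "dd"], ["e"]]);
}
```

Only one key per group is kept, and the transform is applied once per element.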

So I was thinking to allow both cases, with the understanding that grouping by unary predicates uses "==" for comparison whereas grouping by binary predicates looks at adjacent elements to figure out group membership. That approach would, however, preclude the use of string lambdas (because deducing arity for string lambdas is possible, but quite unwieldy).
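For function lambdas, one possible way to pick the behavior by arity is a compile-time check of which call compiles. This is only a sketch under that assumption: the name groupRuns is hypothetical, and string lambdas are deliberately left out, as discussed above.

```d
// Hypothetical helper: binary predicates compare adjacent elements,
// unary ones are treated as keys compared with ==.
T[][] groupRuns(alias pred, T)(T[] input)
{
    // Decide at compile time which call form pred supports.
    static if (is(typeof(pred(T.init, T.init)) : bool))
        enum isBinary = true;
    else static if (is(typeof(pred(T.init))))
        enum isBinary = false;
    else
        static assert(0, "pred must accept one or two elements");

    T[][] groups;
    size_t start = 0;
    foreach (i; 1 .. input.length)
    {
        static if (isBinary)
            bool same = pred(input[i - 1], input[i]);
        else
            bool same = pred(input[start]) == pred(input[i]); // key recomputed for brevity
        if (!same)
        {
            groups ~= input[start .. i];
            start = i;
        }
    }
    if (input.length)
        groups ~= input[start .. $];
    return groups;
}

unittest
{
    // Binary predicate: adjacent comparison.
    assert(groupRuns!((a, b) => a < b)([1, 3, 2, 4, 5, 1]) == [[1, 3], [2, 4, 5], [1]]);
    // Unary predicate: keys compared with ==.
    assert(groupRuns!(s => s.length)(["a", "b", "cc", "dd", "e"])
        == [["a", "b"], ["cc", "dd"], ["e"]]);
}
```

A string such as "a < b" cannot be called directly, so neither check succeeds for it. It would first have to go through unaryFun or binaryFun, which is where the arity question comes back.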


What do you think?


Andrei
