Andrei Alexandrescu wrote:
This is somewhat OT but I think it's an interesting problem. Consider the following data:

double[][] a =
[
    [ 1, 4, 7, 8 ],
    [ 1, 7 ],
    [ 1, 7, 8 ],
    [ 4 ],
    [ 7 ],
];

We want to compute an n-way union, i.e., efficiently span all elements in all arrays in a, in sorted order. You can assume that each individual array in a is sorted. The output of n-way union should be:

auto witness = [
    1, 1, 1, 4, 4, 7, 7, 7, 7, 8, 8
];
assert(equal(nWayUnion(a), witness[]));

The STL and std.algorithm have set_union, which does this for two sets: for example, set_union(a[0], a[1]) outputs [ 1, 1, 4, 7, 7, 8 ]. But an n-way union poses additional challenges. What would be a fast algorithm? (Imagine a could be a very large range of ranges).
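One common approach (my sketch, not something proposed in the post) is to keep a min-heap over the current front of each non-empty array: each step pops the smallest head and pushes that array's next element, giving O(total log n) comparisons for n arrays. In C++, which the post's STL reference suggests, that might look like:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// n-way union via a min-heap of (front value, source-array index) pairs.
// Assumes each input vector is individually sorted. Each of the `total`
// elements is pushed and popped once: O(total * log n) comparisons.
std::vector<double> nWayUnion(const std::vector<std::vector<double>>& a) {
    using Head = std::pair<double, std::size_t>;
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<Head, std::vector<Head>, std::greater<Head>> heap;
    std::vector<std::size_t> pos(a.size(), 0);  // cursor into each array
    for (std::size_t i = 0; i < a.size(); ++i)
        if (!a[i].empty()) heap.push({a[i][0], i});

    std::vector<double> out;
    while (!heap.empty()) {
        auto [v, i] = heap.top();  // smallest remaining front
        heap.pop();
        out.push_back(v);
        if (++pos[i] < a[i].size())      // advance the originating array
            heap.push({a[i][pos[i]], i});  // and re-offer its new front
    }
    return out;
}
```

This is an eager version that materializes the whole result; the lazy range discussed below would instead keep the heap as its state and do one pop/push per popFront().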

It seems like there are two basic options: either merge the arrays (as
per merge sort) and deal with multiple passes across the same data
elements, or insert the elements from each array into a single
destination array and deal with a bunch of memmove operations.

The merge option is kind of interesting because it could benefit from a parallel range of sorts. Each front() op could return a range containing the front element of each non-empty range. Pass this to a min() op that accepts a range and drop the min element into the destination. The tricky part would be arranging things so that the originating range can have popFront() called once the insertion has occurred.
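The scheme above (look at every front, take the min, pop it from the range it came from) can be sketched eagerly by tracking which array holds the minimum; the name mergeByMinFront is illustrative, not from any library. Compared to a heap it costs O(total * n) comparisons, since every output element rescans all n fronts:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear-scan variant of the merge: each step inspects the front of
// every non-empty range, copies the smallest front into the destination,
// and pops it from the originating range. O(total * n) comparisons.
std::vector<double> mergeByMinFront(std::vector<std::vector<double>> a) {
    std::vector<std::size_t> pos(a.size(), 0);  // front index per range
    std::vector<double> out;
    for (;;) {
        std::size_t best = a.size();  // index of the range with the min front
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (pos[i] == a[i].size()) continue;  // range exhausted
            if (best == a.size() || a[i][pos[i]] < a[best][pos[best]])
                best = i;
        }
        if (best == a.size()) break;  // every range is empty: done
        out.push_back(a[best][pos[best]]);
        ++pos[best];  // the popFront() on the originating range
    }
    return out;
}
```

Remembering the index of the minimum, rather than just its value, is what resolves the tricky part: the caller knows exactly which range to pop after the insertion.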

Needless to say, nWayUnion is a range :o).

Finally, why would anyone care for something like this?

Other than mergeSort? I'd think that being able to perform union of N sets in unison would be a nice way to eliminate arbitrary restrictions.
