Georg Wrede wrote:
Andrei Alexandrescu wrote:
This is somewhat OT but I think it's an interesting problem. Consider the following data:

double[][] a =
[
    [ 1, 4, 7, 8 ],
    [ 1, 7 ],
    [ 1, 7, 8],
    [ 4 ],
    [ 7 ],
];

We want to compute an n-way union, i.e., efficiently span all elements in all arrays in a, in sorted order. You can assume that each individual array in a is sorted. The output of n-way union should be:

auto witness = [
    1, 1, 1, 4, 4, 7, 7, 7, 7, 8, 8
];
assert(equal(nWayUnion(a), witness[]));

The STL and std.algorithm have set_union that does that for two sets: for example, set_union(a[0], a[1]) outputs [ 1, 1, 4, 7, 7, 8 ]. But n-way unions poses additional challenges. What would be a fast algorithm? (Imagine a could be a very large range of ranges).

Needless to say, nWayUnion is a range :o).

If we'd know anything about the data, such as, the max value is always smaller than the total number of elements in the subarrays, then we'd probably more easily invent a decent algorithm.

But the totally general algorithm has to be more inefficient. And constructing (not worst-case, but) tough-case data is trivial. For example, take a thousand subarrays, each a thousand elements long, containing random uints from the inclusive range 0..uint.max.

You can assume that each array is sorted.

Andrei

Reply via email to