On Monday, 1 May 2017 at 04:15:35 UTC, H. S. Teoh wrote:
Given a set A of n elements (let's say it's a random-access range of size n, where n is relatively large), and a predicate P(x) that specifies some subset of A of elements that we're interested in, what's the best algorithm (in terms of big-O time complexity) for selecting a random element x satisfying P(x), such that elements that satisfy P(x) have equal probability of being chosen? (Elements that do not satisfy P(x) are disregarded.)
I'd like to note here that, if you make use of the same P(x) many
times (instead of different predicates on each call), it makes
sense to spend O(n) time and memory filtering by that predicate
and storing the result, and then answer each query in O(1).
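For example, a minimal sketch of that caching approach could look like this (the array and the predicate below are just placeholders):

    import std.algorithm : filter;
    import std.array : array;
    import std.random : uniform;
    import std.stdio : writeln;

    void main() {
        auto a = [3, 7, 12, 25, 40, 41];   // placeholder for the set A
        alias P = (int x) => x % 2 == 1;   // placeholder for the predicate P(x)

        auto good = a.filter!P.array;      // O(n) preprocessing, done once

        if (good.length > 0)               // each subsequent query is O(1)
            writeln(good[uniform(0, good.length)]);
    }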
3) Permutation walk:

    auto r = ... /* elements of A */
    foreach (i; iota(0 .. r.length).randomPermutation) {
        if (P(r[i])) return r[i];
    }
    /* no elements satisfy P(x) */

Advantages: if an element that satisfies P(x) is found early, the loop will terminate before n iterations. This seems like the best of both worlds of (1) and (2), except:

Disadvantages: AFAIK, generating a random permutation of indices from 0 .. n requires at least O(n) time, so any advantage we may have had seems to be negated.
Is there an algorithm for *incrementally* generating a random
permutation of indices? If there is, we could use that in (3)
and thus achieve the best of both worlds between early
termination if an element satisfying P(x) is found, and
guaranteeing termination after n iterations if no element
satisfying P(x) exists.
Yes, there is.
There are actually two variations of the Fisher-Yates shuffle
(https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle):
1.

    import std.algorithm, std.array, std.random, std.range;

    auto p = n.iota.array;
    foreach (pos; 0 .. n) {
        auto otherPos = uniform(0, pos + 1);  // a random position among the first pos + 1
        swap(p[pos], p[otherPos]);
    }
When we look at this after the k-th iteration, the first k elements
are randomly and uniformly permuted, and the remaining (n-k) are left
untouched.
2.

    auto p = n.iota.array;  // same imports as in variation 1
    foreach (pos; 0 .. n) {
        auto otherPos = uniform(pos, n);  // a random position in the untouched tail pos .. n
        swap(p[pos], p[otherPos]);
    }
When we look at this after the k-th iteration, the first k elements
are a random combination of all n elements, and this combination
is randomly and uniformly permuted. So, the second variation is
what we need: each new element is selected uniformly at random
from all the elements left. Once we get the first element
satisfying the predicate, we can just terminate the loop. If
there are m out of n elements satisfying the predicate, the
average number of steps is about n/m.
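To make that concrete, here is a rough sketch of variation 2 with early termination; the function name pickRandomSatisfying is made up, and P is passed as a template alias:

    import std.algorithm : swap;
    import std.array : array;
    import std.random : uniform;
    import std.range : iota;

    // Returns the index of a uniformly random element of r satisfying P,
    // or -1 if no element satisfies P.
    ptrdiff_t pickRandomSatisfying(alias P, R)(R r) {
        immutable n = r.length;
        auto p = iota(n).array;                // index permutation, O(n) to set up
        foreach (pos; 0 .. n) {
            immutable otherPos = uniform(pos, n);
            swap(p[pos], p[otherPos]);         // variation 2: take from the untouched tail
            if (P(r[p[pos]]))
                return cast(ptrdiff_t) p[pos]; // early exit, about n/m steps on average
        }
        return -1;
    }

Usage would be something like pickRandomSatisfying!(x => x > 0)(data).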
Now, the problem is that both of these allocate n size_t's of
memory to start with. And your problem does not allow shuffling
the elements of the original array in place, so we do need an
external permutation for these algorithms. However, there are at
least two ways to mitigate that:
(I)
We can allocate the permutation once, using O(n) time and memory, and
then, on every call, just reuse it in its current state in O(n/m)
average time. It does not matter that the permutation is no longer the
identity permutation: by a symmetry argument, any starting permutation
will do just fine.
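A rough sketch of mitigation (I), with the permutation kept alive between calls in a small struct (the names LazyPicker and pick are made up):

    import std.algorithm : swap;
    import std.array : array;
    import std.random : uniform;
    import std.range : iota;

    struct LazyPicker {
        size_t[] p;  // allocated once, reused by every call

        this(size_t n) { p = iota(n).array; }  // O(n) setup

        // Same loop as before, but it starts from whatever permutation the
        // previous call left behind; by the symmetry argument, that is fine.
        ptrdiff_t pick(alias P, R)(R r) {
            foreach (pos; 0 .. p.length) {
                immutable otherPos = uniform(pos, p.length);
                swap(p[pos], p[otherPos]);
                if (P(r[p[pos]]))
                    return cast(ptrdiff_t) p[pos];
            }
            return -1;
        }
    }

For example: auto picker = LazyPicker(data.length); auto i = picker.pick!(x => x > 0)(data);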
(II)
We can store the permutation p in an associative array instead of
a regular array, actually storing only the elements accessed at
least once and assuming that all other elements satisfy the identity
p[x] = x. So, if we finish in n/m steps on average, the time and
extra memory used will be O(n/m) too. I can put together an
example implementation if this best satisfies your requirements.
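As a rough sketch of what that could look like (the name pickRandomSatisfyingSparse is made up):

    import std.random : uniform;

    // Missing keys of the associative array are treated as p[x] = x, so only
    // positions actually touched by the loop consume memory.
    ptrdiff_t pickRandomSatisfyingSparse(alias P, R)(R r) {
        size_t[size_t] p;
        immutable n = r.length;
        foreach (pos; 0 .. n) {
            immutable otherPos = uniform(pos, n);
            immutable pAtPos = p.get(pos, pos);             // p[pos], defaulting to pos
            immutable pAtOther = p.get(otherPos, otherPos); // p[otherPos], defaulting to otherPos
            p[pos] = pAtOther;                              // perform the swap sparsely
            p[otherPos] = pAtPos;
            if (P(r[pAtOther]))
                return cast(ptrdiff_t) pAtOther;            // O(steps) time and extra memory so far
        }
        return -1;  // no element satisfies P
    }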
Ivan Kazmenko.