Samuel Lijin <sxli...@gmail.com> writes:

>>         /* And our single-liner comments look like this */
>>
>>> +             for (i = 0; i < dir->nr; i++) {
>>> +                     if (!dir->entries[i])
>>> +                             continue;
>>> +
>>> +                     for (j = i + 1; j < dir->nr; j++) {
>>> +                             if (!dir->entries[j])
>>> +                                     continue;
>>> +                             if (check_contains(dir->entries[i], 
>>> dir->entries[j])) {
>>> +                                     nr_removed++;
>>> +                                     free(dir->entries[j]);
>>> +                                     dir->entries[j] = NULL;
>>> +                             }
>>> +                             else {
>>> +                                     break;
>>> +                             }
>>> +                     }
>>> +             }
>>
>> This loop is O(n^2).  I wonder if we can do better, especially we
>> know dir->entries[] is sorted already.
>
> Now that I think about it, dropping an `i = j - 1` into the inner loop
> right before the break should work:

Actually, I think you should be able to do this in a single loop and
write in such a way that it is clearly O(N), perhaps like:

        for (src = dst = 0; src < nr; src++) {
                if (!dst ||
                    !is_a_directory(array[dst - 1]) ||
                    !contains(array[dst - 1], array[src])) {
                        array[dst++] = array[src];
                } else {
                        /* 
                         * Otherwise, src is inside (dst-1), so 
                         * we just can discard it.
                         */
                        free(array[src]);
                        array[src] = NULL; /* not needed */
                }
        }
        nr = dst;

That is, dst points at the next place to save the final elements of
the array, and dst-1 is the last one that is known to be the final
set.  src points at the element we are inspecting.

If dst-1 does not exist (i.e. we are looking at the first element),
if dst-1 is not a directory, or if dst-1 does not contain the src,
then we need to keep the src in the final set. Otherwise we discard.

Reply via email to