> The first step includes using a depth-first-search (DFS) from each from
> commit, sorted by ascending generation number. We do not walk beyond the
> minimum generation number or the minimum commit date. This DFS is likely
> to be faster than the existing reachable() method because we expect
> previous ref values to be along the first-parent history.
> 
> If we find a target commit, then we mark everything in the DFS stack as
> a RESULT. This expands the set of targets for the other from commits. We
> also mark the visited commits using 'assign_flag' to prevent re-walking
> the same code.

Thanks for this - it was very helpful in understanding the code.

The function itself uses a DFS stack that contains only the trail
leading up to the currently processed node, and not the one that I'm
more familiar with, which also contains the siblings of processed nodes.
I'll annotate the function with my thought process in the hope that it
will aid future reviewers. (The diff as seen in the e-mail is confusing
so I'm reproducing the function itself, not any + or -.)

> int can_all_from_reach_with_flag(struct object_array *from,
>                                int with_flag, int assign_flag,
>                                time_t min_commit_date,
>                                uint32_t min_generation)
> {
>       struct commit **list = NULL;
>       int i;
>       int result = 1;
> 
>       ALLOC_ARRAY(list, from->nr);
>       for (i = 0; i < from->nr; i++) {
>               list[i] = (struct commit *)from->objects[i].item;
> 
>               parse_commit(list[i]);
> 
>               if (list[i]->generation < min_generation)
>                       return 0;
>       }
> 
>       QSORT(list, from->nr, compare_commits_by_gen);
> 
>       for (i = 0; i < from->nr; i++) {
>               /* DFS from list[i] */
>               struct commit_list *stack = NULL;
> 
>               list[i]->object.flags |= assign_flag;
>               commit_list_insert(list[i], &stack);
> 
>               while (stack) {
>                       struct commit_list *parent;
> 
>                       if (stack->item->object.flags & with_flag) {
>                               pop_commit(&stack);
>                               continue;
>                       }

I wish that the code would refrain from pushing such an object instead
of popping it at the first opportunity, but I guess that doing so would
require the equivalent of a labeled break/continue. I have no qualms
with using "goto" in this case, but I know that some people don't like
it :-P

>                       for (parent = stack->item->parents; parent; parent = 
> parent->next) {
>                               if (parent->item->object.flags & (with_flag | 
> RESULT))
>                                       stack->item->object.flags |= RESULT;

Straightforward, and also produces the bubbling up that we want. An
object is never popped unless it has the "with_flag" flag (see above) or
all its parents have been processed. The object can encounter the "if"
statement multiple times; the last one is when all its parents have been
processed (and thus have the RESULT flag set if necessary).

>                               if (!(parent->item->object.flags & 
> assign_flag)) {
>                                       parent->item->object.flags |= 
> assign_flag;
> 
>                                       parse_commit(parent->item);
> 
>                                       if (parent->item->date < 
> min_commit_date ||
>                                           parent->item->generation < 
> min_generation)
>                                               continue;
> 
>                                       commit_list_insert(parent->item, 
> &stack);
>                                       break;
>                               }

If not yet processed, push it onto the stack and break. The child commit
is still left on the stack. The next time the child commit is processed
(in an iteration of the "while" loop), the "for" loop will iterate until
the next unprocessed parent.

In the DFS that I'm used to, all parents would be pushed here, but
perhaps the fact that the iteration is postorder confuses things.
Anyway, if someone comes up with a better algorithm, replacing it
shouldn't be too difficult - the algorithm is contained within this
function, and there are tests to check the correctness of the algorithm
update.

>                       }
> 
>                       if (!parent)
>                               pop_commit(&stack);

Only when we have no parents left are we completely done with the
current object.

>               }
> 
>               if (!(list[i]->object.flags & (with_flag | RESULT))) {
>                       result = 0;
>                       goto cleanup;
>               }

And after the DFS, if the original object did not have an appropriate
flag set, we do not bother with the other "want" objects.

>       }
> 
> cleanup:
>       for (i = 0; i < from->nr; i++) {
>               clear_commit_marks(list[i], RESULT);
>               clear_commit_marks(list[i], assign_flag);
>       }
>       return result;
> }

Reply via email to