Derrick Stolee <[email protected]> writes:
> On 4/18/2018 5:02 PM, Jakub Narebski wrote:
>> Derrick Stolee <[email protected]> writes:
>>
>>> A commit A can reach a commit B only if the generation number of A
>>> is larger than the generation number of B. This condition allows
>>> significantly short-circuiting commit-graph walks.
>>>
>>> Use generation number for 'git tag --contains' queries.
>>>
>>> On a copy of the Linux repository where HEAD is containd in v4.13
>>> but no earlier tag, the command 'git tag --contains HEAD' had the
>>> following peformance improvement:
>>>
>>> Before: 0.81s
>>> After: 0.04s
>>> Rel %: -95%
>>
>> A question: what is the performance after if the "commit-graph" feature
>> is disabled, or there is no commit-graph file? Is there performance
>> regression in this case, or is the difference negligible?
>
> Negligible, since we are adding a small number of integer comparisons
> and the main cost is in commit parsing. More on commit parsing in
> response to your comments below.
If it is proven to be always negligible, then its all right. If it is
unlikely to be non-negligible, well, still O.K. But I wonder if maybe
there is some situation where the cost of extra parsing is non-negligble.
[...]
>>> @@ -1618,8 +1623,18 @@ static enum contains_result
>>> contains_tag_algo(struct commit *candidate,
>>> struct contains_cache *cache)
>>> {
>>> struct contains_stack contains_stack = { 0, 0, NULL };
>>> - enum contains_result result = contains_test(candidate, want, cache);
>>> + enum contains_result result;
>>> + uint32_t cutoff = GENERATION_NUMBER_INFINITY;
>>> + const struct commit_list *p;
>>> +
>>> + for (p = want; p; p = p->next) {
>>> + struct commit *c = p->item;
>>> + parse_commit_or_die(c);
>>> + if (c->generation < cutoff)
>>> + cutoff = c->generation;
>>> + }
>> Sholdn't the above be made conditional on the ability to get generation
>> numbers from the commit-graph file (feature is turned on and file
>> exists)? Otherwise here after the change contains_tag_algo() now parses
>> each commit in 'want', which I think was not done previously.
>>
>> With commit-graph file parsing is [probably] cheap. Without it, not
>> necessary.
>>
>> But I might be worrying about nothing.
>
> Not nothing. This parses the "wants" when we previously did not parse
> the wants. Further: this parsing happens before we do the simple check
> of comparing the OID of the candidate against the wants.
>
> The question is: are these parsed commits significant compared to the
> walk that will parse many more commits? It is certainly possible.
>
> One way to fix this is to call 'prepare_commit_graph()' directly and
> then test that 'commit_graph' is non-null before performing any
> parses. I'm not thrilled with how that couples the commit-graph
> implementation to this feature, but that may be necessary to avoid
> regressions in the non-commit-graph case.
Another possible solution (not sure if better or worse) would be to
change the signature of contains_tag_algo() function to take parameter
or flag that would decide whether to parse "wants".
Best,
--
Jakub Narębski