Junio C Hamano <[email protected]> writes:
> Derrick Stolee <[email protected]> writes:
>
>> Define compare_commits_by_gen_then_commit_date(), which uses generation
>> numbers as a primary comparison and commit date to break ties (or as a
>> comparison when both commits do not have computed generation numbers).
>>
>> Since the commit-graph file is closed under reachability, we know that
>> all commits in the file have generation at most GENERATION_NUMBER_MAX
>> which is less than GENERATION_NUMBER_INFINITY.
>
> I suspect that my puzzlement may be coming from my not "getting"
> what you meant by "closed under reachability",
It means that if commit A is in the commit graph, then all of its
ancestors (all commits reachable from A) are also in the commit graph.
> but could you also
> explain how _INF and _ZERO interact with commits with normal
> generation numbers? I've always assumed that genno will be used
> only when comparing two commits with valid genno and otherwise we'd
> fall back to the traditional date based one, but...
>
>> +int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_,
>> void *unused)
>> +{
>> + const struct commit *a = a_, *b = b_;
>> +
>> + /* newer commits first */
>> + if (a->generation < b->generation)
>> + return 1;
>> + else if (a->generation > b->generation)
>> + return -1;
>
> ... this does not check if a->generation is _ZERO or _INF.
>
> Both being _MAX is OK (the control will fall through and use the
> dates below). One being _MAX and the other being a normal value is
> also OK (the above comparisons will declare the commit with _MAX is
> farther than less-than-max one from a root).
>
> Or is the assumption that if one has _ZERO, that must have come from
> an ancient commit-graph file and none of the commits have anything
> but _ZERO?
There is stronger and weaker version of the negative-cut criteria based
on generation numbers.
The strong criteria:
if A != B and gen(A) <= gen(B), then A cannot reach B
The weaker criteria:
if gen(A) < gen(B), then A cannot reach B
Because commit-graph is closed under reachability, this means that
if A is in commit graph, and B is outside of it, then A cannot reach B
If A is in commit graph, then either _MAX >= gen(A) >= 1,
or gen(A) == _ZERO. Because _INFINITY > _MAX > _ZERO, then we have
if _MAX >= gen(A) >= 1 || gen(A) == 0, and gen(B) == _INFINITY
then A cannot reach B
which also fullfils the weaker criteria
if gen(A) < gen(B), then A cannot reach B
If both A and B are outside commit-graph, i.e. gen(A) = gen(B) = _INFINITY,
or if both A and B have gen(A) = gen(B) = _MAX,
or if both A and B come from old commit graph with gen(A) = gen(B) =_ZERO,
then we cannot say anything about reachability... and weak criteria
also does not say anything about reachability.
Maybe the following ASCII table would make it clear.
| gen(B)
| ................................ :::::::
gen(A) | _INFINITY | _MAX | larger | smaller | _ZERO
-------------+-----------+----------+----------+----------+--------
_INFINITY | = | > | > | > | >
_MAX | < Nn | = | > | > | >
larger | < Nn | < Nn | = n | > | >
smaller | < Nn | < Nn | < Nn | = n | >
_ZERO | < Nn | < Nn | < Nn | < Nn | =
Here "n" denotes stronger condition, and "N" denotes weaker condition.
We have _INFINITY > _MAX > larger > smaller > _ZERO.
NOTE however that it is a *tradeoff*. Using weaker criteria, with
strict inequality, means that we don't need to handle _INFINITY, _MAX
and _ZERO corner-cases in a special way; but it also means that we would
walk slightly more commits than if we used stronger criteria, with less
or equals.
For Linux kernel public repository commit graph[1] we have maximum of 512
commits sharing the same level, 5.43 sharing the same commit on average,
and 50% of time only 2 commits sharing the same level (median, or 2nd
quartile, or 50% percentile). This is roughly the amount of commits we
walk more with weaker cut-off condition.
[1]: with 750k commits, but which is not largest commit graph any more :-0
>> + /* use date as a heuristic when generations are equal */
>> + if (a->date < b->date)
>> + return 1;
>> + else if (a->date > b->date)
>> + return -1;
>> + return 0;
>> +}
HTH
--
Jakub Narębski