On 4/10/2018 10:31 PM, Junio C Hamano wrote:
Derrick Stolee <dsto...@microsoft.com> writes:

The generation number of a commit is defined recursively as follows:

* If a commit A has no parents, then the generation number of A is one.
* If a commit A has parents, then the generation number of A is one
   more than the maximum generation number among the parents of A.

Add a uint32_t generation field to struct commit so we can pass this
information to revision walks. We use two special values to signal
the generation number is invalid:

GENERATION_NUMBER_ININITY 0xFFFFFFFF
GENERATION_NUMBER_ZERO 0

The first (_INFINITY) means the generation number has not been loaded or
computed. The second (_ZERO) means the generation number was loaded
from a commit graph file that was stored before generation numbers
were computed.
Should it also be possible for a caller to tell if a given commit
has too deep a history, i.e. we do not know its generation number
exactly, but we know it is larger than 1<<30?

It seems that we only have a 30-bit field in the file, so wouldn't
we need a special value defined in (e.g. "0") so that we can tell
that the commit has such a large generation number?  E.g.

+       item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
        if (!item->generation)
                item->generation = GENERATION_NUMBER_OVERFLOW;

when we read it from the file?

We obviously need to do something similar when assigning a
generation number to a child commit, perhaps like

        #define GENERATION_NUMBER_OVERFLOW (GENERATION_NUMBER_MAX + 1)

        commit->generation = 1; /* assume no parent */
        for (p = commit->parents; p; p++) {
                uint32_t gen = p->item->generation + 1;

                if (gen >= GENERATION_NUMBER_OVERFLOW) {
                        commit->generation = GENERATION_NUMBER_OVERFLOW;
                        break;
                } else if (commit->generation < gen)
                        commit->generation = gen;
        }
or something? And then on the writing side you'd encode too large a
generation as '0'.

You raise a very good point. How about we do a slightly different arrangement for these overflow commits?

Instead of storing the commits in the commit-graph file as "0" (which currently means "written by a version of git that did not compute generation numbers") we could let GENERATION_NUMBER_MAX be the maximum generation of a commit in the commit-graph, and if a commit would have larger generation, we collapse it down to that value.

It slightly complicates the diagram I made in Documentation/technical/commit-graph.txt, but it was already a bit of a simplification. Here is an updated diagram, but likely we will want to limit discussion of the special-case GENERATION_NUMBER_MAX to the prose, since it is not a practical situation at the moment.

    +-----------------------------------------+
    | GENERATION_NUMBER_INFINITY = 0xFFFFFFFF |
    +-----------------------------------------+
      |    |            |      ^
      |    |            |      |
      |    |            +------+
      |    |         [gen(A) = gen(B)]
      |    V
      |  +------------------------------------+
      |  | GENERATION_NUMBER_MAX = 0x3FFFFFFF |
      |  +------------------------------------+
      |    |            |      ^
      |    |            |      |
      |    |            +------+
      |    |         [gen(A) = gen(B)]
      V    V
    +-------------------------------------+
    | 0 < commit->generation < 0x3FFFFFFF |
    +-------------------------------------+
        |            |      ^
        |            |      |
        |            +------+
        |        [gen(A) > gen(B)]
        V
    +-------------------------------------+
    | GENERATION_NUMBER_ZERO = 0          |
    +-------------------------------------+
             |      ^
             |      |
             +------+
             [gen(A) = gen(B)]

Thanks,
-Stolee

Reply via email to