Derrick Stolee <[email protected]> writes:
> The generation number of a commit is defined recursively as follows:
>
> * If a commit A has no parents, then the generation number of A is one.
> * If a commit A has parents, then the generation number of A is one
> more than the maximum generation number among the parents of A.
Very minor nitpick: it would be more readable wrapped differently:
* If a commit A has parents, then the generation number of A is
one more than the maximum generation number among parents of A.
Very minor nitpick: possibly "parents", not "the parents", but I am
not native English speaker.
>
> Add a uint32_t generation field to struct commit so we can pass this
> information to revision walks. We use three special values to signal
> the generation number is invalid:
>
> GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> GENERATION_NUMBER_MAX 0x3FFFFFFF
> GENERATION_NUMBER_ZERO 0
>
> The first (_INFINITY) means the generation number has not been loaded or
> computed. The second (_MAX) means the generation number is too large to
> store in the commit-graph file. The third (_ZERO) means the generation
> number was loaded from a commit graph file that was written by a version
> of git that did not support generation numbers.
Good explanation; I wonder if we want to have it in some shortened form
also in comments, and not only in the commit message.
>
> Signed-off-by: Derrick Stolee <[email protected]>
> ---
> alloc.c | 1 +
> commit-graph.c | 2 ++
> commit.h | 4 ++++
> 3 files changed, 7 insertions(+)
I have reordered patches to make it easier to review.
> diff --git a/commit.h b/commit.h
> index 23a3f364ed..aac3b8c56f 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -10,6 +10,9 @@
> #include "pretty.h"
>
> #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
> +#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> +#define GENERATION_NUMBER_MAX 0x3FFFFFFF
> +#define GENERATION_NUMBER_ZERO 0
I wonder if it wouldn't be good to have some short in-line comments
explaining those constants, or a block comment above them.
>
> struct commit_list {
> struct commit *item;
> @@ -30,6 +33,7 @@ struct commit {
> */
> struct tree *maybe_tree;
> uint32_t graph_pos;
> + uint32_t generation;
> };
>
> extern int save_commit_buffer;
All right, simple addition of the new field. Nothing to go wrong here.
Sidenote: With 0x7FFFFFFF being (if I am not wrong) maximum graph_pos
and maximum number of nodes in commit graph, we won't hit 0x3FFFFFFF
generation number limit for all except very, very linear histories.
>
> diff --git a/alloc.c b/alloc.c
> index cf4f8b61e1..e8ab14f4a1 100644
> --- a/alloc.c
> +++ b/alloc.c
> @@ -94,6 +94,7 @@ void *alloc_commit_node(void)
> c->object.type = OBJ_COMMIT;
> c->index = alloc_commit_index();
> c->graph_pos = COMMIT_NOT_FROM_GRAPH;
> + c->generation = GENERATION_NUMBER_INFINITY;
> return c;
> }
All right, start with initializing it with "not from commit-graph" value
after allocation.
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 70fa1b25fd..9ad21c3ffb 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -262,6 +262,8 @@ static int fill_commit_in_graph(struct commit *item,
> struct commit_graph *g, uin
> date_low = get_be32(commit_data + g->hash_len + 12);
> item->date = (timestamp_t)((date_high << 32) | date_low);
>
> + item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> +
I guess we should not worry about these "magical constants" sprinkled
here, like "+ 8" above.
Let's examine how it goes, taking a look at commit-graph-format.txt
in Documentation/technical/commit-graph-format.txt
* The first H (g->hash_len) bytes are for the OID of the root tree.
* The next 8 bytes are for the positions of the first two parents [...]
So 'commit_data + g->hash_len + 8' is our offset from the start of
commit data. All right.
* The next 8 bytes store the generation number of the commit and
the commit time in seconds since EPOCH. The generation number
uses the higher 30 bits of the first 4 bytes. [...]
The higher 30 bits of the 4 bytes, which is 32 bits, means that we need
to shift 32-bit value 2 bits right, so that we get lower 30 bits of
32-bit value. All right.
All 4-byte numbers are in network order.
Shouldn't it be ntohl() to convert from network order to host order, and
not get_be32()? I guess they are the same (network order is big-endian
order), and get_be32() is what rest of git uses...
Looks all right.
> pptr = &item->parents;
>
> edge_value = get_be32(commit_data + g->hash_len);