Derrick Stolee <dsto...@microsoft.com> writes:

> The 'verify' subcommand must compare the commit content parsed from the
> commit-graph and compare it against the content in the object database.

You have "compare" twice in the above sentence.

> Use lookup_commit() and parse_commit_in_graph_one() to parse the commits
> from the graph and compare against a commit that is loaded separately
> and parsed directly from the object database.

All right, that looks like a nice extension of what was done in previous
patch.  We want to check that cache (serialized commit graph) matches
reality (object database).

>
> Add checks for the root tree OID.

All right; isn't it that now we check almost all information from
commit-graph that hs match in object database (with exception of commit
parents, I think).

>
> Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
> ---
>  commit-graph.c          | 17 ++++++++++++++++-
>  t/t5318-commit-graph.sh |  7 +++++++
>  2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 0420ebcd87..19ea369fc6 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -880,6 +880,8 @@ int verify_commit_graph(struct commit_graph *g)
>               return verify_commit_graph_error;

NOTE: we will be checking Commit Data chunk; I think it would be good
idea to verify that size of Commit Data chunk matches (N * (H + 16) bytes)
that format gives us, so that we don't accidentally red outside of
memory if something got screwed up (like number of commits being wrong,
or file truncated).

>  
>       for (i = 0; i < g->num_commits; i++) {
> +             struct commit *graph_commit;
> +
>               hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
>  
>               if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
> @@ -897,6 +899,11 @@ int verify_commit_graph(struct commit_graph *g)
>  
>                       cur_fanout_pos++;
>               }
> +
> +             graph_commit = lookup_commit(&cur_oid);

So now I see why we add all commits to memory (to hash structure).

> +             if (!parse_commit_in_graph_one(g, graph_commit))
> +                     graph_report("failed to parse %s from commit-graph",
> +                                  oid_to_hex(&cur_oid));

All right, this verifies that commit in OID Lookup chunk has parse-able
data in Commit Data chunk, isn't it?

>       }
>  
>       while (cur_fanout_pos < 256) {
> @@ -913,16 +920,24 @@ int verify_commit_graph(struct commit_graph *g)
>               return verify_commit_graph_error;
>  
>       for (i = 0; i < g->num_commits; i++) {
> -             struct commit *odb_commit;
> +             struct commit *graph_commit, *odb_commit;
>  
>               hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
>  
> +             graph_commit = lookup_commit(&cur_oid);
>               odb_commit = (struct commit *)create_object(cur_oid.hash, 
> alloc_commit_node());

All right, so we have commit data from graph, and commit data from the
object database.

>               if (parse_commit_internal(odb_commit, 0, 0)) {
>                       graph_report("failed to parse %s from object database",
>                                    oid_to_hex(&cur_oid));
>                       continue;
>               }
> +
> +             if (oidcmp(&get_commit_tree_in_graph_one(g, 
> graph_commit)->object.oid,
> +                        get_commit_tree_oid(odb_commit)))
> +                     graph_report("root tree OID for commit %s in 
> commit-graph is %s != %s",
> +                                  oid_to_hex(&cur_oid),
> +                                  
> oid_to_hex(get_commit_tree_oid(graph_commit)),
> +                                  
> oid_to_hex(get_commit_tree_oid(odb_commit)));

Maybe explicitly say that it doesn't match the value from the object
database; OTOH this may be too verbose.

>       }
>  
>       return verify_commit_graph_error;
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 996a016239..21cc8e82f3 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -267,6 +267,8 @@ GRAPH_BYTE_FANOUT2=`expr $GRAPH_FANOUT_OFFSET + 4 \* 255`
>  GRAPH_OID_LOOKUP_OFFSET=`expr $GRAPH_FANOUT_OFFSET + 4 \* 256`
>  GRAPH_BYTE_OID_LOOKUP_ORDER=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 8`
>  GRAPH_BYTE_OID_LOOKUP_MISSING=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 
> 4 + 10`
> +GRAPH_COMMIT_DATA_OFFSET=`expr $GRAPH_OID_LOOKUP_OFFSET + $HASH_LEN \* 
> $NUM_COMMITS`
> +GRAPH_BYTE_COMMIT_TREE=$GRAPH_COMMIT_DATA_OFFSET

All right, so the first is entry into record in Commit Data chunk, and
the latter points into tree entry in this record -- which entry is first
field:

  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
    * The first H bytes are for the OID of the root tree.

>  
>  # usage: corrupt_graph_and_verify <position> <data> <string>
>  # Manipulates the commit-graph file at the position
> @@ -341,4 +343,9 @@ test_expect_success 'detect OID not in object database' '
>               "from object database"
>  '
>  
> +test_expect_success 'detect incorrect tree OID' '
> +     corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_TREE "\01" \
> +             "root tree OID for commit"
> +'

All right.

I wonder if we can create a test for first check added (that Commit
Graph data parses correctly), that is the one with the following error
message:

  "failed to parse <OID> from commit-graph file".

> +
>  test_done

Regards,
-- 
Jakub Narębski

Reply via email to