Derrick Stolee <dsto...@microsoft.com> writes:

> The commit-graph file requires the following three chunks:
>
> * OID Fanout
> * OID Lookup
> * Commit Data
>
> If any of these are missing, then the 'verify' subcommand should
> report a failure. This includes the chunk IDs malformed or the
> chunk count is truncated.

Minor nit: it should IMVHO either be "or the chunk count truncated", or
"or when the chunk count is truncated".

>
> Signed-off-by: Derrick Stolee <dsto...@microsoft.com>
> ---
>  commit-graph.c          |  9 +++++++++
>  t/t5318-commit-graph.sh | 29 +++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 55b41664ee..06e3e4f9ba 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -860,5 +860,14 @@ int verify_commit_graph(struct commit_graph *g)
>               return 1;
>       }
>  
> +     verify_commit_graph_error = 0;
> +

By the way, if the chunk count is less than 3, then by the pigeonhole
principle at least one required chunk is missing.

> +     if (!g->chunk_oid_fanout)
> +             graph_report("commit-graph is missing the OID Fanout chunk");
> +     if (!g->chunk_oid_lookup)
> +             graph_report("commit-graph is missing the OID Lookup chunk");
> +     if (!g->chunk_commit_data)
> +             graph_report("commit-graph is missing the Commit Data chunk");

Nice and simple.  Good.

> +
>       return verify_commit_graph_error;
>  }
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index bd64481c7a..4ef3fe3dc2 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -249,6 +249,15 @@ test_expect_success 'git commit-graph verify' '
>  
>  GRAPH_BYTE_VERSION=4
>  GRAPH_BYTE_HASH=5
> +GRAPH_BYTE_CHUNK_COUNT=6
> +GRAPH_CHUNK_LOOKUP_OFFSET=8
> +GRAPH_CHUNK_LOOKUP_WIDTH=12
> +GRAPH_CHUNK_LOOKUP_ROWS=5
> +GRAPH_BYTE_OID_FANOUT_ID=$GRAPH_CHUNK_LOOKUP_OFFSET
> +GRAPH_BYTE_OID_LOOKUP_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \
> +                           1 \* $GRAPH_CHUNK_LOOKUP_WIDTH`
> +GRAPH_BYTE_COMMIT_DATA_ID=`expr $GRAPH_CHUNK_LOOKUP_OFFSET + \
> +                             2 \* $GRAPH_CHUNK_LOOKUP_WIDTH`
>  
>  # usage: corrupt_graph_and_verify <position> <data> <string>
>  # Manipulates the commit-graph file at the position
> @@ -283,4 +292,24 @@ test_expect_success 'detect bad hash version' '
>               "hash version"
>  '
>  
> +test_expect_success 'detect bad chunk count' '
> +     corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\02" \
> +             "missing the Commit Data chunk"
> +'

As I wrote before, this test assumes that the last chunk (the one not
counted because of the changed / corrupted chunk count) is the Commit
Data chunk.  This may be true for the current implementation, but it is
not required by the format.

A better solution would be to check for "missing the .* chunk"; as I
understand it, you can pass a regexp to grep, not only a fixed string.
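
A minimal sketch of the idea, outside the test harness.  The 'err' file
here is only a stand-in for the captured output of a failed verify; the
message text is the one from the graph_report() calls above.

```shell
# Stand-in for the stderr captured from a failed 'git commit-graph verify'.
printf '%s\n' "commit-graph is missing the OID Lookup chunk" >err

# The pattern matches whichever required chunk is reported as missing,
# so the check does not depend on the order of chunks in the file.
if grep -q "missing the .* chunk" err
then
	result=detected
else
	result=missed
fi
echo "$result"
```

With this pattern the test would pass no matter which chunk happens to
be last in the lookup table.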


Another thing would be to check if there are gaps in the file, or if the
whole file is being used.  Changing the chunk count to a smaller number
would mean that the chunks would not cover the rest of the file.

By the way, would the following be detected:

        corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\05"

that is corrupting the chunk count to be larger than the number of
actual chunks?  Or is it left for later?
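
To illustrate the kind of cross-check I have in mind, here is a rough
sketch on a synthetic file, not on a real commit-graph.  It assumes the
layout implied by the constants above (chunk count at byte 6, lookup
table at byte 8, 12-byte rows, all-zero ID as terminator); count_rows
is a made-up helper, and the chunk offsets are dummies.

```shell
# Synthetic 8-byte header plus a four-row chunk lookup table
# (three chunk IDs and the all-zero terminator); offsets are dummies.
{
	printf 'CGPH\001\001\003\000'
	printf 'OIDF\000\000\000\000\000\000\000\000'
	printf 'OIDL\000\000\000\000\000\000\000\000'
	printf 'CDAT\000\000\000\000\000\000\000\000'
	printf '\000\000\000\000\000\000\000\000\000\000\000\000'
} >graph

# count_rows <file>: hypothetical helper; counts lookup rows that
# precede the all-zero chunk ID (or the end of the file).
count_rows () {
	rows=0
	while :
	do
		id=$(od -An -tx1 -j$((8 + rows * 12)) -N4 "$1" | tr -d ' \n')
		if test -z "$id" || test "$id" = "00000000"
		then
			break
		fi
		rows=$((rows + 1))
	done
	echo "$rows"
}

# Corrupt the chunk count byte (offset 6) to claim five chunks.
printf '\005' | dd of=graph bs=1 seek=6 conv=notrunc 2>/dev/null

declared=$(od -An -tu1 -j6 -N1 graph | tr -d ' \n')
actual=$(count_rows graph)
if test "$declared" -ne "$actual"
then
	verdict="chunk count $declared does not match $actual lookup rows"
else
	verdict="chunk count matches"
fi
echo "$verdict"
```

The same comparison would catch a chunk count that is too small as well
as one that is too large.  Whether 'verify' should do this, and what it
should report, is of course up to you.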

> +
> +test_expect_success 'detect missing OID fanout chunk' '
> +     corrupt_graph_and_verify $GRAPH_BYTE_OID_FANOUT_ID "\0" \

We could have used "X" or " " in place of "\0", but admittedly "\0" is
the better check - it also checks if there are problems with the
handling of the NUL character ("\0") in chunk names.
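
For the record, a small sketch of the difference on a four-byte
stand-in for the chunk ID.  It assumes the corruption is done by
overwriting bytes in place with printf and dd; the file names are made
up for the example.

```shell
# Stand-in for the four bytes of the OID Fanout chunk ID.
printf 'OIDF' >id_nul
cp id_nul id_x

# Overwrite the first byte in place, once with NUL and once with "X".
printf '\0' | dd of=id_nul bs=1 seek=0 conv=notrunc 2>/dev/null
printf 'X' | dd of=id_x bs=1 seek=0 conv=notrunc 2>/dev/null

nul_hex=$(od -An -tx1 id_nul | tr -d ' \n')
x_hex=$(od -An -tx1 id_x | tr -d ' \n')
echo "$nul_hex $x_hex"
```

Both variants give an ID that no longer matches, but only the NUL one
would trip up code that treats the chunk ID as a C string.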

> +             "missing the OID Fanout chunk"
> +'
> +
> +test_expect_success 'detect missing OID lookup chunk' '
> +     corrupt_graph_and_verify $GRAPH_BYTE_OID_LOOKUP_ID "\0" \
> +             "missing the OID Lookup chunk"
> +'
> +
> +test_expect_success 'detect missing commit data chunk' '
> +     corrupt_graph_and_verify $GRAPH_BYTE_COMMIT_DATA_ID "\0" \
> +             "missing the Commit Data chunk"
> +'

What happens if the terminating pseudo-chunk name "\0\0\0\0" gets
corrupted?  Would it be detected (or maybe it is handled by a later
patch in the series)?
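
If you want to test it, the position could be computed in the same way
as the other ones.  This is only a sketch: GRAPH_BYTE_TERMINATOR_ID is
a name I made up, and I am assuming that GRAPH_CHUNK_LOOKUP_ROWS counts
the terminating row too.

```shell
# Values copied from the test script in the patch.
GRAPH_CHUNK_LOOKUP_OFFSET=8
GRAPH_CHUNK_LOOKUP_WIDTH=12
GRAPH_CHUNK_LOOKUP_ROWS=5

# Hypothetical name; assumes GRAPH_CHUNK_LOOKUP_ROWS includes the
# terminating row, so the terminator is the last of the five rows.
GRAPH_BYTE_TERMINATOR_ID=$(expr $GRAPH_CHUNK_LOOKUP_OFFSET + \
			   \( $GRAPH_CHUNK_LOOKUP_ROWS - 1 \) \* $GRAPH_CHUNK_LOOKUP_WIDTH)
echo "$GRAPH_BYTE_TERMINATOR_ID"
```

The result could then be passed as the position argument to
corrupt_graph_and_verify; which message 'verify' should print in that
case is an open question.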

> +
>  test_done
