Re: [PATCH v2 08/12] commit-graph: verify commit contents against odb
On 11 May 2018 at 23:15, Derrick Stoleewrote: I finally sat down today and familiarized myself with the commit-graph code a little. My biggest surprise was when I noticed that there is a hash checksum at the end of the commit-graph-file. That in combination with the tests where you flip some bytes... It turns out, if my reading is right, that the hash value is written as the commit-graph is generated, but that it is not verified as the commit-graph is later read. I could not find any mention of your plans here -- I understand why you would not want to verify the hash in `load_commit_graph_one()`, at least not in every run. Anyway, this is just an observation. Verifying the hash would affect the tests this series adds. They might need to rewrite the hash or set some magic environment variable. :-/ But that's for another day. > + for (i = 0; i < g->num_commits; i++) { > + struct commit *graph_commit, *odb_commit; > + struct commit_list *graph_parents, *odb_parents; > + int num_parents = 0; > + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); `num_commits` was derived as the commit-graph was loaded. It was derived from offsets which were verified to be in the mmap-ed memory. So this source address is guaranteed to be so, as well. Ok. (Once brian's latest series hits master, this could use `oidread(...)`.) > + graph_commit = lookup_commit(_oid); Do we know this comes from the graph? Even with a more-or-less-messed-up commit graph? See below. > + odb_commit = (struct commit *)create_object(cur_oid.hash, > alloc_commit_node()); > + if (parse_commit_internal(odb_commit, 0, 0)) { > + graph_report("failed to parse %s from object > database", oid_to_hex(_oid)); > + continue; > + } > + > + if (oidcmp(_commit_tree_in_graph_one(g, > graph_commit)->object.oid, > + get_commit_tree_oid(odb_commit))) `get_commit_tree_in_graph_one()` will BUG rather than return NULL. So this will not dereference NULL. Good. But might it hit the BUG? That is, can we trust the commit coming out of `lookup_commit()` not to have `graph_pos == COMMIT_NOT_FROM_GRAPH`? > + graph_report("root tree object ID for commit %s in > commit-graph is %s != %s", > +oid_to_hex(_oid), > + > oid_to_hex(get_commit_tree_oid(graph_commit)), > + > oid_to_hex(get_commit_tree_oid(odb_commit))); > + > + if (graph_commit->date != odb_commit->date) > + graph_report("commit date for commit %s in > commit-graph is %"PRItime" != %"PRItime"", > +oid_to_hex(_oid), > +graph_commit->date, > +odb_commit->date); > + > + (Extra blank line?) > + graph_parents = graph_commit->parents; > + odb_parents = odb_commit->parents; > + > + while (graph_parents) { > + num_parents++; > + > + if (odb_parents == NULL) > + graph_report("commit-graph parent list for > commit %s is too long (%d)", > +oid_to_hex(_oid), > +num_parents); > + > + if (oidcmp(_parents->item->object.oid, > _parents->item->object.oid)) > + graph_report("commit-graph parent for %s is > %s != %s", > +oid_to_hex(_oid), > + > oid_to_hex(_parents->item->object.oid), > + > oid_to_hex(_parents->item->object.oid)); > + > + graph_parents = graph_parents->next; > + odb_parents = odb_parents->next; > + } > + > + if (odb_parents != NULL) > + graph_report("commit-graph parent list for commit %s > terminates early", > +oid_to_hex(_oid)); Ok, ensure the lists are equally long and compare the entries. > + > + if (graph_commit->generation) { If the commit has a generation number (not an old commit graph)... > + uint32_t max_generation = 0; > + graph_parents = graph_commit->parents; > + > + while (graph_parents) { > + if (graph_parents->item->generation == > GENERATION_NUMBER_ZERO || > + graph_parents->item->generation == > GENERATION_NUMBER_INFINITY) > + graph_report("commit-graph has valid > generation for %s but not
Re: [PATCH v2 08/12] commit-graph: verify commit contents against odb
On 5/12/2018 5:17 PM, Martin Ågren wrote: On 11 May 2018 at 23:15, Derrick Stoleewrote: When running 'git commit-graph verify', compare the contents of the commits that are loaded from the commit-graph file with commits that are loaded directly from the object database. This includes checking the root tree object ID, commit date, and parents. Parse the commit from the graph during the initial loop through the object IDs to guarantee we parse from the commit-graph file. In addition, verify the generation number calculation is correct for all commits in the commit-graph file. While testing, we discovered that mutating the integer value for a parent to be outside the accepted range causes a segmentation fault. Add a new check in insert_parent_or_die() that prevents this fault. Check for that error during the test, both in the typical parents and in the list of parents for octopus merges. This paragraph and the corresponding fix and test feel like a separate patch to me. (The commit message of it could be "To test the next patch, we threw invalid data at `git commit-graph verify, and it crashed in pre-existing code, so let's fix that first" -- there is definitely a connection.) Is this important enough to fast-track to master in time for 2.18? My guess would be no. + + if (pos >= g->num_commits) + die("invalide parent position %"PRIu64, pos); s/invalide/invalid/ @@ -875,6 +879,8 @@ int verify_commit_graph(struct commit_graph *g) return 1; for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit; + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); if (i && oidcmp(_oid, _oid) >= 0) @@ -892,6 +898,10 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + + graph_commit = lookup_commit(_oid); + if (!parse_commit_in_graph_one(g, graph_commit)) + graph_report("failed to parse %s from commit-graph", oid_to_hex(_oid)); } Could this end up giving ridiculous amounts of output? It would depend on the input, I guess. while (cur_fanout_pos < 256) { @@ -904,5 +914,95 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + if (verify_commit_graph_error) + return 1; Well, here we give up before running into *too* much problem. + for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit, *odb_commit; + struct commit_list *graph_parents, *odb_parents; + int num_parents = 0; + + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + graph_commit = lookup_commit(_oid); + odb_commit = (struct commit *)create_object(cur_oid.hash, alloc_commit_node()); + if (parse_commit_internal(odb_commit, 0, 0)) { + graph_report("failed to parse %s from object database", oid_to_hex(_oid)); + continue; + } + + if (oidcmp(_commit_tree_in_graph_one(g, graph_commit)->object.oid, + get_commit_tree_oid(odb_commit))) + graph_report("root tree object ID for commit %s in commit-graph is %s != %s", +oid_to_hex(_oid), + oid_to_hex(get_commit_tree_oid(graph_commit)), + oid_to_hex(get_commit_tree_oid(odb_commit))); + + if (graph_commit->date != odb_commit->date) + graph_report("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime"", +oid_to_hex(_oid), +graph_commit->date, +odb_commit->date); + + + graph_parents = graph_commit->parents; + odb_parents = odb_commit->parents; + + while (graph_parents) { + num_parents++; + + if (odb_parents == NULL) + graph_report("commit-graph parent list for commit %s is too long (%d)", +oid_to_hex(_oid), +num_parents); + + if (oidcmp(_parents->item->object.oid, _parents->item->object.oid)) + graph_report("commit-graph parent for %s is %s != %s", +oid_to_hex(_oid), + oid_to_hex(_parents->item->object.oid), + oid_to_hex(_parents->item->object.oid)); + + graph_parents = graph_parents->next; + odb_parents = odb_parents->next; + } +
Re: [PATCH v2 08/12] commit-graph: verify commit contents against odb
On 11 May 2018 at 23:15, Derrick Stoleewrote: > When running 'git commit-graph verify', compare the contents of the > commits that are loaded from the commit-graph file with commits that are > loaded directly from the object database. This includes checking the > root tree object ID, commit date, and parents. > > Parse the commit from the graph during the initial loop through the > object IDs to guarantee we parse from the commit-graph file. > > In addition, verify the generation number calculation is correct for all > commits in the commit-graph file. > > While testing, we discovered that mutating the integer value for a > parent to be outside the accepted range causes a segmentation fault. Add > a new check in insert_parent_or_die() that prevents this fault. Check > for that error during the test, both in the typical parents and in the > list of parents for octopus merges. This paragraph and the corresponding fix and test feel like a separate patch to me. (The commit message of it could be "To test the next patch, we threw invalid data at `git commit-graph verify, and it crashed in pre-existing code, so let's fix that first" -- there is definitely a connection.) Is this important enough to fast-track to master in time for 2.18? My guess would be no. > + > + if (pos >= g->num_commits) > + die("invalide parent position %"PRIu64, pos); s/invalide/invalid/ > @@ -875,6 +879,8 @@ int verify_commit_graph(struct commit_graph *g) > return 1; > > for (i = 0; i < g->num_commits; i++) { > + struct commit *graph_commit; > + > hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); > > if (i && oidcmp(_oid, _oid) >= 0) > @@ -892,6 +898,10 @@ int verify_commit_graph(struct commit_graph *g) > > cur_fanout_pos++; > } > + > + graph_commit = lookup_commit(_oid); > + if (!parse_commit_in_graph_one(g, graph_commit)) > + graph_report("failed to parse %s from commit-graph", > oid_to_hex(_oid)); > } Could this end up giving ridiculous amounts of output? It would depend on the input, I guess. > while (cur_fanout_pos < 256) { > @@ -904,5 +914,95 @@ int verify_commit_graph(struct commit_graph *g) > cur_fanout_pos++; > } > > + if (verify_commit_graph_error) > + return 1; Well, here we give up before running into *too* much problem. > + for (i = 0; i < g->num_commits; i++) { > + struct commit *graph_commit, *odb_commit; > + struct commit_list *graph_parents, *odb_parents; > + int num_parents = 0; > + > + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); > + > + graph_commit = lookup_commit(_oid); > + odb_commit = (struct commit *)create_object(cur_oid.hash, > alloc_commit_node()); > + if (parse_commit_internal(odb_commit, 0, 0)) { > + graph_report("failed to parse %s from object > database", oid_to_hex(_oid)); > + continue; > + } > + > + if (oidcmp(_commit_tree_in_graph_one(g, > graph_commit)->object.oid, > + get_commit_tree_oid(odb_commit))) > + graph_report("root tree object ID for commit %s in > commit-graph is %s != %s", > +oid_to_hex(_oid), > + > oid_to_hex(get_commit_tree_oid(graph_commit)), > + > oid_to_hex(get_commit_tree_oid(odb_commit))); > + > + if (graph_commit->date != odb_commit->date) > + graph_report("commit date for commit %s in > commit-graph is %"PRItime" != %"PRItime"", > +oid_to_hex(_oid), > +graph_commit->date, > +odb_commit->date); > + > + > + graph_parents = graph_commit->parents; > + odb_parents = odb_commit->parents; > + > + while (graph_parents) { > + num_parents++; > + > + if (odb_parents == NULL) > + graph_report("commit-graph parent list for > commit %s is too long (%d)", > +oid_to_hex(_oid), > +num_parents); > + > + if (oidcmp(_parents->item->object.oid, > _parents->item->object.oid)) > + graph_report("commit-graph parent for %s is > %s != %s", > +oid_to_hex(_oid), > + > oid_to_hex(_parents->item->object.oid), > + > oid_to_hex(_parents->item->object.oid)); > +
[PATCH v2 08/12] commit-graph: verify commit contents against odb
When running 'git commit-graph verify', compare the contents of the commits that are loaded from the commit-graph file with commits that are loaded directly from the object database. This includes checking the root tree object ID, commit date, and parents. Parse the commit from the graph during the initial loop through the object IDs to guarantee we parse from the commit-graph file. In addition, verify the generation number calculation is correct for all commits in the commit-graph file. While testing, we discovered that mutating the integer value for a parent to be outside the accepted range causes a segmentation fault. Add a new check in insert_parent_or_die() that prevents this fault. Check for that error during the test, both in the typical parents and in the list of parents for octopus merges. Signed-off-by: Derrick Stolee--- commit-graph.c | 100 t/t5318-commit-graph.sh | 64 +++ 2 files changed, 164 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 5bb93e533c..a15ad9710d 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -237,6 +237,10 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g, { struct commit *c; struct object_id oid; + + if (pos >= g->num_commits) + die("invalide parent position %"PRIu64, pos); + hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos); c = lookup_commit(); if (!c) @@ -875,6 +879,8 @@ int verify_commit_graph(struct commit_graph *g) return 1; for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit; + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); if (i && oidcmp(_oid, _oid) >= 0) @@ -892,6 +898,10 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + + graph_commit = lookup_commit(_oid); + if (!parse_commit_in_graph_one(g, graph_commit)) + graph_report("failed to parse %s from commit-graph", oid_to_hex(_oid)); } while (cur_fanout_pos < 256) { @@ -904,5 +914,95 @@ int verify_commit_graph(struct commit_graph *g) cur_fanout_pos++; } + if (verify_commit_graph_error) + return 1; + + for (i = 0; i < g->num_commits; i++) { + struct commit *graph_commit, *odb_commit; + struct commit_list *graph_parents, *odb_parents; + int num_parents = 0; + + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); + + graph_commit = lookup_commit(_oid); + odb_commit = (struct commit *)create_object(cur_oid.hash, alloc_commit_node()); + if (parse_commit_internal(odb_commit, 0, 0)) { + graph_report("failed to parse %s from object database", oid_to_hex(_oid)); + continue; + } + + if (oidcmp(_commit_tree_in_graph_one(g, graph_commit)->object.oid, + get_commit_tree_oid(odb_commit))) + graph_report("root tree object ID for commit %s in commit-graph is %s != %s", +oid_to_hex(_oid), + oid_to_hex(get_commit_tree_oid(graph_commit)), + oid_to_hex(get_commit_tree_oid(odb_commit))); + + if (graph_commit->date != odb_commit->date) + graph_report("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime"", +oid_to_hex(_oid), +graph_commit->date, +odb_commit->date); + + + graph_parents = graph_commit->parents; + odb_parents = odb_commit->parents; + + while (graph_parents) { + num_parents++; + + if (odb_parents == NULL) + graph_report("commit-graph parent list for commit %s is too long (%d)", +oid_to_hex(_oid), +num_parents); + + if (oidcmp(_parents->item->object.oid, _parents->item->object.oid)) + graph_report("commit-graph parent for %s is %s != %s", +oid_to_hex(_oid), + oid_to_hex(_parents->item->object.oid), + oid_to_hex(_parents->item->object.oid)); + + graph_parents = graph_parents->next; + odb_parents = odb_parents->next; + } + + if (odb_parents != NULL) +