On 11 May 2018 at 23:15, Derrick Stolee <dsto...@microsoft.com> wrote:
When running 'git commit-graph verify', compare the contents of the
commits that are loaded from the commit-graph file with commits that are
loaded directly from the object database. This includes checking the
root tree object ID, commit date, and parents.
Parse the commit from the graph during the initial loop through the
object IDs to guarantee we parse from the commit-graph file.
In addition, verify the generation number calculation is correct for all
commits in the commit-graph file.
While testing, we discovered that mutating the integer value for a
parent to be outside the accepted range causes a segmentation fault. Add
a new check in insert_parent_or_die() that prevents this fault. Check
for that error during the test, both in the typical parents and in the
list of parents for octopus merges.
This paragraph and the corresponding fix and test feel like a separate
patch to me. (The commit message of it could be "To test the next patch,
we threw invalid data at `git commit-graph verify, and it crashed in
pre-existing code, so let's fix that first" -- there is definitely a
connection.) Is this important enough to fast-track to master in time
for 2.18? My guess would be no.
+
+ if (pos >= g->num_commits)
+ die("invalide parent position %"PRIu64, pos);
s/invalide/invalid/
@@ -875,6 +879,8 @@ int verify_commit_graph(struct commit_graph *g)
return 1;
for (i = 0; i < g->num_commits; i++) {
+ struct commit *graph_commit;
+
hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
if (i && oidcmp(&prev_oid, &cur_oid) >= 0)
@@ -892,6 +898,10 @@ int verify_commit_graph(struct commit_graph *g)
cur_fanout_pos++;
}
+
+ graph_commit = lookup_commit(&cur_oid);
+ if (!parse_commit_in_graph_one(g, graph_commit))
+ graph_report("failed to parse %s from commit-graph",
oid_to_hex(&cur_oid));
}
Could this end up giving ridiculous amounts of output? It would depend
on the input, I guess.
while (cur_fanout_pos < 256) {
@@ -904,5 +914,95 @@ int verify_commit_graph(struct commit_graph *g)
cur_fanout_pos++;
}
+ if (verify_commit_graph_error)
+ return 1;
Well, here we give up before running into *too* much problem.
+ for (i = 0; i < g->num_commits; i++) {
+ struct commit *graph_commit, *odb_commit;
+ struct commit_list *graph_parents, *odb_parents;
+ int num_parents = 0;
+
+ hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
+
+ graph_commit = lookup_commit(&cur_oid);
+ odb_commit = (struct commit *)create_object(cur_oid.hash,
alloc_commit_node());
+ if (parse_commit_internal(odb_commit, 0, 0)) {
+ graph_report("failed to parse %s from object database",
oid_to_hex(&cur_oid));
+ continue;
+ }
+
+ if (oidcmp(&get_commit_tree_in_graph_one(g,
graph_commit)->object.oid,
+ get_commit_tree_oid(odb_commit)))
+ graph_report("root tree object ID for commit %s in
commit-graph is %s != %s",
+ oid_to_hex(&cur_oid),
+
oid_to_hex(get_commit_tree_oid(graph_commit)),
+
oid_to_hex(get_commit_tree_oid(odb_commit)));
+
+ if (graph_commit->date != odb_commit->date)
+ graph_report("commit date for commit %s in commit-graph is %"PRItime"
!= %"PRItime"",
+ oid_to_hex(&cur_oid),
+ graph_commit->date,
+ odb_commit->date);
+
+
+ graph_parents = graph_commit->parents;
+ odb_parents = odb_commit->parents;
+
+ while (graph_parents) {
+ num_parents++;
+
+ if (odb_parents == NULL)
+ graph_report("commit-graph parent list for commit %s
is too long (%d)",
+ oid_to_hex(&cur_oid),
+ num_parents);
+
+ if (oidcmp(&graph_parents->item->object.oid,
&odb_parents->item->object.oid))
+ graph_report("commit-graph parent for %s is %s !=
%s",
+ oid_to_hex(&cur_oid),
+
oid_to_hex(&graph_parents->item->object.oid),
+
oid_to_hex(&odb_parents->item->object.oid));
+
+ graph_parents = graph_parents->next;
+ odb_parents = odb_parents->next;
+ }
+
+ if (odb_parents != NULL)
+ graph_report("commit-graph parent list for commit %s
terminates early",
+ oid_to_hex(&cur_oid));
+
+ if (graph_commit->generation) {
+ uint32_t max_generation = 0;
+ graph_parents = graph_commit->parents;
+
+ while (graph_parents) {
+ if (graph_parents->item->generation ==
GENERATION_NUMBER_ZERO ||
+ graph_parents->item->generation ==
GENERATION_NUMBER_INFINITY)
+ graph_report("commit-graph has valid
generation for %s but not its parent, %s",
+ oid_to_hex(&cur_oid),
+
oid_to_hex(&graph_parents->item->object.oid));
+ if (graph_parents->item->generation >
max_generation)
+ max_generation =
graph_parents->item->generation;
+ graph_parents = graph_parents->next;
+ }
+
+ if (max_generation == GENERATION_NUMBER_MAX)
+ max_generation--;
I'm not too familiar with these concepts. Is this a trick in preparation
for this:
+
+ if (graph_commit->generation != max_generation + 1)
Any way that could give a false negative? (I'm not sure it would matter
much.) Maybe "if (!MAX && generation != max + 1)".
+ graph_report("commit-graph has incorrect generation
for %s",
+ oid_to_hex(&cur_oid));
+ } else {
+ graph_parents = graph_commit->parents;
+
+ while (graph_parents) {
+ if (graph_parents->item->generation)
+ graph_report("commit-graph has generation
ZERO for %s but not its parent, %s",
+ oid_to_hex(&cur_oid),
+
oid_to_hex(&graph_parents->item->object.oid));
+ graph_parents = graph_parents->next;
+ }
+ }
+ }
+
return verify_commit_graph_error;
}
At this point, I should admit that I went through the above thinking
"right, makes sense, ok, sure". I was not really going "hmm, I wonder
..." This looks like the real meat of "verify", and I'll try to look it
over with a fresh pair of eyes tomorrow.