On Wed, Oct 10 2018, SZEDER Gábor wrote:
> On Mon, Sep 17, 2018 at 03:33:35PM +0000, Ævar Arnfjörð Bjarmason wrote:
>> $ git -c gc.writeCommitGraph=true gc
>> [...]
>> Annotating commits in commit graph: 1565573, done.
>> Computing commit graph generation numbers: 100% (782484/782484), done.
>
> While poking around 'commit-graph.c' in my Bloom filter experiment, I
> saw similar numbers like above, and was confused by the much higher
> than expected number of annotated commits. It's about twice as much
> as the number of commits in the repository, or the number shown on the
> very next line.
>
>> diff --git a/commit-graph.c b/commit-graph.c
>> index 8a1bec7b8a..2c5d996194 100644
>> --- a/commit-graph.c
>> +++ b/commit-graph.c
>> -static void close_reachable(struct packed_oid_list *oids)
>> +static void close_reachable(struct packed_oid_list *oids, int
>> report_progress)
>> {
>> int i;
>> struct commit *commit;
>> + struct progress *progress = NULL;
>> + int j = 0;
>>
>> + if (report_progress)
>> + progress = start_delayed_progress(
>> + _("Annotating commits in commit graph"), 0);
>> for (i = 0; i < oids->nr; i++) {
>> + display_progress(progress, ++j);
>> commit = lookup_commit(the_repository, &oids->list[i]);
>> if (commit)
>> commit->object.flags |= UNINTERESTING;
>> @@ -604,6 +616,7 @@ static void close_reachable(struct packed_oid_list *oids)
>> * closure.
>> */
>> for (i = 0; i < oids->nr; i++) {
>> + display_progress(progress, ++j);
>> commit = lookup_commit(the_repository, &oids->list[i]);
>>
>> if (commit && !parse_commit(commit))
>> @@ -611,19 +624,28 @@ static void close_reachable(struct packed_oid_list
>> *oids)
>> }
>
> The above loops have already counted all the commits, and, more
> importantly, did all the hard work that takes time and makes the
> progress indicator useful.
>
>> for (i = 0; i < oids->nr; i++) {
>> + display_progress(progress, ++j);
[...]
> This display_progress() call, however, doesn't seem to be necessary.
> First, it counts all commits for a second time, resulting in the ~2x
> difference compared to the actual number of commits, and then causing
> my confusion. Second, all what this loop is doing is setting a flag
> in commits that were already looked up and parsed in the above loops.
> IOW this loop is very fast, and the progress indicator jumps from
> ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a
> progress indicator at all.
You're right, I tried this patch on top:
diff --git a/commit-graph.c b/commit-graph.c
index a686758603..cccd83de72 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -655,12 +655,16 @@ static void close_reachable(struct packed_oid_list
*oids, int report_progress)
if (commit)
commit->object.flags |= UNINTERESTING;
}
+ stop_progress(&progress); j = 0;
/*
* As this loop runs, oids->nr may grow, but not more
* than the number of missing commits in the reachable
* closure.
*/
+ if (report_progress)
+ progress = start_delayed_progress(
+ _("Annotating commits in commit graph 2"), 0);
for (i = 0; i < oids->nr; i++) {
display_progress(progress, ++j);
commit = lookup_commit(the_repository, &oids->list[i]);
@@ -668,7 +672,11 @@ static void close_reachable(struct packed_oid_list
*oids, int report_progress)
if (commit && !parse_commit(commit))
add_missing_parents(oids, commit);
}
+ stop_progress(&progress); j = 0;
+ if (report_progress)
+ progress = start_delayed_progress(
+ _("Annotating commits in commit graph 3"), 0);
for (i = 0; i < oids->nr; i++) {
display_progress(progress, ++j);
commit = lookup_commit(the_repository, &oids->list[i]);
And on a large repo with around 3 million commits the 3rd progress bar
doesn't kick in.
But if I apply this on top:
diff --git a/progress.c b/progress.c
index 5a99c9fbf0..89cc705bf7 100644
--- a/progress.c
+++ b/progress.c
@@ -58,8 +58,8 @@ static void set_progress_signal(void)
sa.sa_flags = SA_RESTART;
sigaction(SIGALRM, &sa, NULL);
- v.it_interval.tv_sec = 1;
- v.it_interval.tv_usec = 0;
+ v.it_interval.tv_sec = 0;
+ v.it_interval.tv_usec = 250000;
v.it_value = v.it_interval;
setitimer(ITIMER_REAL, &v, NULL);
}
I.e. start the timer after 1/4 of a second instead of 1 second, I get
that progress bar.
So I'm inclined to keep it. It just needs to be 4x the size before it's
noticeably hanging for 1 second.
That repo isn't all that big compared to what we've heard about out
there, and inner loops like this have a tendency to accumulate some more
code over time without a re-visit of why we weren't monitoring progress
there.
But maybe we can fix the message. We say "Annotating commits in commit
grap", not "Counting" or whatever. I was trying to find something that
didn't imply that we were doing this once. One can annotate a thing more
than once, but maybe ther's a better way to explain this...
We had some more accurate progress reporting in close_reachable(),
discussed in
https://public-inbox.org/git/[email protected]/ I still
think the *main* use-case for these things is to just report that we're
not hanging, so maybe the proper solution is to pick up Duy's patch to
display a spinner insted of a numeric progress.
>> commit = lookup_commit(the_repository, &oids->list[i]);
>>
>> if (commit)
>> commit->object.flags &= ~UNINTERESTING;
>> }
>> + stop_progress(&progress);
>> }