Before this change the "git reflog expire" command didn't report any
progress. This is the second command (after "pack-refs --all --prune")
that the "gc" command will run.

On small repositories like this command won't take long to run, my
test system it takes just under 1 second to run on git.git, but just
around 8 seconds on linux.git, and much longer on the
2015-04-03-1M-git.git[1] large test repository.

Taking so long means that "gc" will appear to hang at the beginning of
its run. It might still do so after this change if the earlier
"pack-refs" command takes a really long time to run, but that'll only
impact repositories with a really large set of refs to pack, and can
be addressed in some future change.

One thing that's bad about this change is that we might *in theory*
print a "Marking unreachable commits in reflog for expiry" message for
each ref with a reflog. This is because the abbreviated callstack
looks like this:

    0  mark_reachable at builtin/reflog.c:227
    1  in unreachable at builtin/reflog.c:290
    2  in should_expire_reflog_ent at builtin/reflog.c:317
    3  in expire_reflog_ent at refs/files-backend.c:2956
    4  in show_one_reflog_ent at refs/files-backend.c:1879
    # This is the last function that has the refname (e.g. "HEAD") available
    5  in files_for_each_reflog_ent at refs/files-backend.c:2025
    6  in refs_for_each_reflog_ent at refs.c:2066
    7  in files_reflog_expire at refs/files-backend.c:3043
    8  in refs_reflog_expire at refs.c:2117
    9  in reflog_expire at refs.c:2129
    # Here's where we collect reflogs to expire, and expire each one
    10 in cmd_reflog_expire  at builtin/reflog.c:595

I.e. this progress is being reported for each expired reflog. So if
start_progress() were used instead of start_delayed_progress() we'd
print (e.g. on my git.git) hundreds of these lines.

In practice I haven't been able to make it print anything except one
line. This is because validating the reflogs for these other
branches (not "HEAD") takes such a short amount of time.

That may just be some artifact of the repositories I've tested, but I
suspect It'll be true in general. As the callstack above shows, in
order to guarantee that we don't do that we'd need to pass some
"progress" variable through 10 levels of functions, many of which are
"for_each" callback functions with void* cb_data.

1. https://github.com/avar/2015-04-03-1M-git

Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
---
 builtin/reflog.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acef5a0ab..d3075ee75a 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -10,6 +10,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "reachable.h"
+#include "progress.h"
 
 /* NEEDSWORK: switch to using parse_options */
 static const char reflog_expire_usage[] =
@@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb 
*cb)
        struct commit_list *pending;
        timestamp_t expire_limit = cb->mark_limit;
        struct commit_list *leftover = NULL;
+       struct progress *progress = NULL;
+       int i = 0;
 
        for (pending = cb->mark_list; pending; pending = pending->next)
                pending->item->object.flags &= ~REACHABLE;
 
        pending = cb->mark_list;
+       progress = start_delayed_progress(
+               _("Marking unreachable commits in reflog for expiry"), 0);
        while (pending) {
                struct commit_list *parent;
                struct commit *commit = pop_commit(&pending);
+
+               display_progress(progress, ++i);
                if (commit->object.flags & REACHABLE)
                        continue;
                if (parse_commit(commit))
@@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb 
*cb)
                }
        }
        cb->mark_list = leftover;
+       stop_progress(&progress);
 }
 
 static int unreachable(struct expire_reflog_policy_cb *cb, struct commit 
*commit, struct object_id *oid)
-- 
2.19.0.444.g18242da7ef

Reply via email to