On Wed, Oct 10 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <ava...@gmail.com> writes:
>
>>  - We use this warning as a proxy for "let's not run for a day",
>>    otherwise we'll just grind on gc --auto trying to consolidate
>>    possibly many hundreds of K of loose objects only to find none of
>>    them can be pruned because the run into the expiry policy. With the
>>    warning we retry that once per day, which sucks less.
>>
>>  - This conflation of the user-visible warning and the policy is an
>>    emergent effect of how the different gc pieces interact, which as I
>>    note in the linked thread(s) sucks.
>>
>>    But we can't just yank one piece away (as Jonathan's patch does)
>>    without throwing the baby out with the bathwater.
>>
>>    It will mean that e.g. if you have 10k loose objects in your git.git,
>>    and created them just now, that every time you run anything that runs
>>    "gc --auto" we'll fork to the background, peg a core at 100% CPU for
>>    2-3 minutes or whatever it is, only do get nowhere and do the same
>>    thing again in ~3 minutes when you run your next command.
>
> We probably can keep the "let's not run for a day" safety while
> pretending that "git gc -auto" succeeded for callers like "git svn"
> so that these callers do not hae to do "eval { ... }" to hide our
> exit code, no?
>
> I think that is what Jonathan's patch (jn/gc-auto) does.

Yeah we could take that patch to skip the eval {} suggested upthread.

As noted when it was discussed I'm *mildly* negative on hiding a IMO
meaningful exit code like that, but maybe sprinkling eval {} or other
"run but ignore exit code" in stuff running "gc --auto" is worth it, and
we could just document that you may want to check gc.log.

> From: Jonathan Nieder <jrnie...@gmail.com>
> Date: Mon, 16 Jul 2018 23:57:40 -0700
> Subject: [PATCH] gc: do not return error for prior errors in daemonized mode
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 95c8afd07b..ce8a663a01 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -438,9 +438,15 @@ static const char *lock_repo_for_gc(int force, pid_t* 
> ret_pid)
>       return NULL;
>  }
>
> -static void report_last_gc_error(void)
> +/*
> + * Returns 0 if there was no previous error and gc can proceed, 1 if
> + * gc should not proceed due to an error in the last run. Prints a
> + * message and returns -1 if an error occured while reading gc.log
> + */
> +static int report_last_gc_error(void)
>  {
>       struct strbuf sb = STRBUF_INIT;
> +     int ret = 0;
> ...
>       if (len < 0)
> +             ret = error_errno(_("cannot read '%s'"), gc_log_path);
> +     else if (len > 0) {
> +             /*
> +              * A previous gc failed.  Report the error, and don't
> +              * bother with an automatic gc run since it is likely
> +              * to fail in the same way.
> +              */
> +             warning(_("The last gc run reported the following. "
>                              "Please correct the root cause\n"
>                              "and remove %s.\n"
>                              "Automatic cleanup will not be performed "
>                              "until the file is removed.\n\n"
>                              "%s"),
>                           gc_log_path, sb.buf);
> +             ret = 1;
> +     }
>       strbuf_release(&sb);
>  done:
>       free(gc_log_path);
> +     return ret;
>  }
>
> I.e. report_last_gc_error() returns 1 when finds that the previous
> attempt to "gc --auto" failed.  And then
>
> @@ -561,7 +576,13 @@ int cmd_gc(int argc, const char **argv, const char 
> *prefix)
>                       fprintf(stderr, _("See \"git help gc\" for manual 
> housekeeping.\n"));
>               }
>               if (detach_auto) {
> -                     report_last_gc_error(); /* dies on error */
> +                     int ret = report_last_gc_error();
> +                     if (ret < 0)
> +                             /* an I/O error occured, already reported */
> +                             exit(128);
> +                     if (ret == 1)
> +                             /* Last gc --auto failed. Skip this one. */
> +                             return 0;
>
> ... it exits with 0 without bothering to rerun "gc".
>
> So it won't get stuck for 3 minutes; the repository after "gc
> --auto" punts will stay to be suboptimal for a day, and the user
> kill not get an "actionable" error notice (due to this hiding of
> previous error), hence cannot make changes that may help like
> shortening expiry period, though.

Right, because it still writes the gc.log, but we'll still be yelling at
the user on every commit/fetch etc. that we discovered such-and-such an
issue on the last gc for that full day.

That 3 minute comment was in reference to if we'd apply Jonathan Tan's
"[PATCH] gc: do not warn about too many loose objects without any other
changes. Then we'd just keep returning true on too_many_loose_objects()
even though gc wouldn't help to resolve it.

Reply via email to