Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jeff King
On Wed, Oct 10, 2018 at 10:59:45PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Callers who _are_ prepared to act on the exit code probably ought to
> > just use --auto-exit-code in their invocation.
> >
> > That said, I'm not entirely opposed to the matching config. There's
> > enough history here that somebody might want a sledgehammer setting to
> > go back to the old behavior.
> 
> If it's not a config option then as git is upgraded I'll need to change
> my across-server invocation to be some variant of checking git version,
> then etiher using the --auto-exit-code option or not (which'll error on
> older gits). Easier to be able to just drop in a config setting before
> the upgrade.

Yeah, that's the "there's enough history here" that I was referring to,
but I hadn't quite thought through a concrete example. That makes sense.

(Though I also think the other part of the thread is reasonable, too,
where we'd just have a command to abstract away "cat .git/gc.log" into
"git gc --show-detached-log" or something).

-Peff


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> Which is what I'm doing by running "gc --auto" across a set of servers
> and looking at the exit code. If it's been failing I get an error, if
> there's no need to gc nothing happens, and if it hasn't been failing and
> it just so happens that it's time to GC then fine, now was as good a
> time as any.

For this, a simple "git gc --detached-status" would work.  It sounds
like the bonus gc --auto run was a side effect instead of being a
requirement.

> So if we assume that for the sake of argument there's no point in a
> --detached-status either. My only reason for ever caring about that
> status is when I run "gc --auto" and it says it can't fork() itself so
> it fails. Since I'm using "gc --auto" I have zero reason to even ask
> that question unless I'm OK with kicking off a gc run as a side-effect,
> so why split up the two? It just introduces a race condition for no
> benefit.

What I am trying to do is design an interface which is simple to
explain in the manual.  The existing "git gc --auto" interface since
detaching was introduced is super confusing to me, so I'm going
through the thought exercise of "If we were starting over, what would
we build instead?"

Part of the answer to that question might include a

--report-from-the-last-time-you-detached

option.  I'm still failing to come up of a case where the answer to
that question would include a

--report-from-the-last-time-you-detached-and-if-it-went-okay\
-then-run-another-detached-gc

option.

In other words, I think our disconnect is that you are describing
things in terms of "I have been happy with the existing git gc --auto
detaching behavior, so how can we maintain something as close as
possible to that"?  And I am trying to describe things in terms of
"What is the simplest, most maintainable, and easiest to explain way
to keep Ævar's servers working well"?

Jonathan


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Ævar Arnfjörð Bjarmason


On Wed, Oct 10 2018, Jonathan Nieder wrote:

> Junio C Hamano wrote:
>> Jonathan Nieder  writes:
>
>>> Perhaps this reporting could also print the message from a previous
>>> run, so you could write:
>>>
>>> git gc --detached-status || exit
>>> git gc --auto; # perhaps also passing --detach
>>>
>>> (Names still open for bikeshedding.)
>>
>> When the command is given --detached-exit-code/status option, what
>> does it do?  Does it perform the "did an earlier run left gc.log?"
>> and report the result and nothing else?  In other words, is it a
>> pure replacement for "test -e .git/gc.log"?
>
> My intent was the latter.  In other words, in the idiom
>
>   do_something_async &
>   ... a lot of time passes ...
>   wait
>
> it is something like the replacement for "wait".
>
> More precisely,
>
>   git gc --detached-status || exit
>
> would mean something like
>
>   if test -e .git/gc.log  # Error from previous gc --detach?
>   then
>   cat >&2 .git/gc.log # Report the error.
>   exit 1
>   fi
>
>>  Or does it do some of
>> the "auto-gc" prep logic like guestimating loose object count and
>> have that also in its exit status (e.g. "from the gc.log left
>> behind, we know that we failed to reduce loose object count down
>> sufficiently after finding there are more than 6700 earlier, but now
>> we do not have that many loose object, so there is nothing to
>> complain about the presence of gc.log")?
>
> Depending on the use case, a user might want to avoid losing
> information about the results of a previous "git gc --detach" run,
> even if they no longer apply.  For example, a user might want to
> collect the error message for monitoring or later log analysis, to
> track down intermittent gc errors that go away on their own.
>
> A separate possible use case might be a
>
>   git gc --needs-auto-gc
>
> command that detects whether an auto gc is needed.  With that, a
> caller that only wants to learn about errors if auto gc is needed
> could run
>
>   if git gc --needs-auto-gc
>   then
>   git gc --detached-status || exit
>   fi
>
>> I am bad at naming myself, but worse at guessing what others meant
>> with a new thing that was given a new name whose name is fuzzy,
>> so... ;-)
>
> No problem.  I'm mostly trying to tease out more details about the use
> case.

Likewise, so don't take the following as an assertion of fact, but more
of a fact-finding mission:

We could add something like this --detached-status / --needs-auto-gc,
but I don't need it, and frankly I can't think of a reason for why
anyone would want to use these.

The entire point of having gc --auto in the first place is that you
don't care when exactly GC happens, you're happy with whenever git
decides it's needed.

So why would anyone need a --needs-auto-gc? If your criteria for doing
GC exactly matches that of gc --auto then ... you just run gc --auto, if
it isn't (e.g. if you're using Microsoft's Windows repo) you're not
using gc --auto in the first place, and neither --needs-auto-gc nor
--auto is useful to you.

So maybe I'm missing something here, but a --needs-auto-gc just seems
like a gratuitous exposure of an internal implementation detail whose
only actionable result is doing what we're doing with "gc --auto" now,
i.e. just run gc.

Which is what I'm doing by running "gc --auto" across a set of servers
and looking at the exit code. If it's been failing I get an error, if
there's no need to gc nothing happens, and if it hasn't been failing and
it just so happens that it's time to GC then fine, now was as good a
time as any.

So if we assume that for the sake of argument there's no point in a
--detached-status either. My only reason for ever caring about that
status is when I run "gc --auto" and it says it can't fork() itself so
it fails. Since I'm using "gc --auto" I have zero reason to even ask
that question unless I'm OK with kicking off a gc run as a side-effect,
so why split up the two? It just introduces a race condition for no
benefit.


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:

>> Perhaps this reporting could also print the message from a previous
>> run, so you could write:
>>
>>  git gc --detached-status || exit
>>  git gc --auto; # perhaps also passing --detach
>>
>> (Names still open for bikeshedding.)
>
> When the command is given --detached-exit-code/status option, what
> does it do?  Does it perform the "did an earlier run left gc.log?"
> and report the result and nothing else?  In other words, is it a
> pure replacement for "test -e .git/gc.log"?

My intent was the latter.  In other words, in the idiom

do_something_async &
... a lot of time passes ...
wait

it is something like the replacement for "wait".

More precisely,

git gc --detached-status || exit

would mean something like

if test -e .git/gc.log  # Error from previous gc --detach?
then
cat >&2 .git/gc.log # Report the error.
exit 1
fi

>  Or does it do some of
> the "auto-gc" prep logic like guestimating loose object count and
> have that also in its exit status (e.g. "from the gc.log left
> behind, we know that we failed to reduce loose object count down
> sufficiently after finding there are more than 6700 earlier, but now
> we do not have that many loose object, so there is nothing to
> complain about the presence of gc.log")?

Depending on the use case, a user might want to avoid losing
information about the results of a previous "git gc --detach" run,
even if they no longer apply.  For example, a user might want to
collect the error message for monitoring or later log analysis, to
track down intermittent gc errors that go away on their own.

A separate possible use case might be a

git gc --needs-auto-gc

command that detects whether an auto gc is needed.  With that, a
caller that only wants to learn about errors if auto gc is needed
could run

if git gc --needs-auto-gc
then
git gc --detached-status || exit
fi

> I am bad at naming myself, but worse at guessing what others meant
> with a new thing that was given a new name whose name is fuzzy,
> so... ;-)

No problem.  I'm mostly trying to tease out more details about the use
case.

Thanks,
Jonathan


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Junio C Hamano
Jonathan Nieder  writes:

> Perhaps this reporting could also print the message from a previous
> run, so you could write:
>
>   git gc --detached-status || exit
>   git gc --auto; # perhaps also passing --detach
>
> (Names still open for bikeshedding.)

When the command is given --detached-exit-code/status option, what
does it do?  Does it perform the "did an earlier run left gc.log?"
and report the result and nothing else?  In other words, is it a
pure replacement for "test -e .git/gc.log"?  Or does it do some of
the "auto-gc" prep logic like guestimating loose object count and
have that also in its exit status (e.g. "from the gc.log left
behind, we know that we failed to reduce loose object count down
sufficiently after finding there are more than 6700 earlier, but now
we do not have that many loose object, so there is nothing to
complain about the presence of gc.log")?

I am bad at naming myself, but worse at guessing what others meant
with a new thing that was given a new name whose name is fuzzy,
so... ;-)


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> Right. I know. What I mean is now I can (and do) use it to run 'git gc
> --auto' across our server fleet and see whether I have any of #3, or
> whether it's all #1 or #2. If there's nothing to do in #1 that's fine,
> and it just so happens that I'll run gc due to #2 that's also fine, but
> I'd like to see if gc really is stuck.
>
> This of course relies on them having other users / scripts doing normal
> git commands which would trigger previous 'git gc --auto' runs.
>
> I.e. with your change that command:
>
> git gc --auto
>
> Would change to something like:
>
> git gc --auto && ! test -e .git/gc.log
>
> Which, as noted is a bit of a nasty breaker of the encapsulation

That helps.  What if we package up the "test -e .git/gc.log" bit
*without* having the side effect of running git gc --auto, so that you
can run

if ! git gc --detached-exit-code
then
... handle the error ...
fi
git gc --auto; # perhaps also with --detach

?

I'm not great at naming options, so the --detached-exit-code name is
bikesheddable.  What I really mean to ask about is: what if the status
reporting goes in a separate command from running gc --auto?

Perhaps this reporting could also print the message from a previous
run, so you could write:

git gc --detached-status || exit
git gc --auto; # perhaps also passing --detach

(Names still open for bikeshedding.)

Thanks and hope that helps,
Jonathan


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Ævar Arnfjörð Bjarmason


On Wed, Oct 10 2018, Jonathan Nieder wrote:

> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>
>> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode'
>> configuration option to optionally bring back the 'git gc --auto' exit
>> code behavior as it existed between 2.6.3..2.19.0 (inclusive).
>
> Hm.  Can you tell me more about the use case where this would be
> helpful to you?  That would help us come up with a better name for it.

>From the E-Mail linked from the commit message[1] (I opted not to put
this in, because it was getting a bit long:

Right. I know. What I mean is now I can (and do) use it to run 'git gc
--auto' across our server fleet and see whether I have any of #3, or
whether it's all #1 or #2. If there's nothing to do in #1 that's fine,
and it just so happens that I'll run gc due to #2 that's also fine, but
I'd like to see if gc really is stuck.

This of course relies on them having other users / scripts doing normal
git commands which would trigger previous 'git gc --auto' runs.

I.e. with your change that command:

git gc --auto

Would change to something like:

git gc --auto && ! test -e .git/gc.log

Which, as noted is a bit of a nasty breaker of the encapsulation, so
now:

git gc --auto --auto-exit-code

Or just a variant of that which will have dropped the config in-place in
/etc/gitconfig, and then as before:

git gc --auto

1. https://public-inbox.org/git/878t69dgvx@evledraar.gmail.com/


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Ævar Arnfjörð Bjarmason


On Wed, Oct 10 2018, Jeff King wrote:

> On Wed, Oct 10, 2018 at 07:27:32PM +, Ævar Arnfjörð Bjarmason wrote:
>
>> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode'
>> configuration option to optionally bring back the 'git gc --auto' exit
>> code behavior as it existed between 2.6.3..2.19.0 (inclusive).
>>
>> This was changed in 3029970275 ("gc: do not return error for prior
>> errors in daemonized mode", 2018-07-16). The motivation for that patch
>> was to appease 3rd party tools whose treatment of the 'git gc --auto'
>> exit code is different from that of git core where it has always been
>> ignored.
>
> OK. I wouldn't want to use this myself, but I think you've made clear
> why you find it useful. So I don't mind making it an optional behavior
> (and it probably beats you trying to poke at the logfile yourself).

[...]

> I'm not sure if the config is going to actually help that much, though.
> The callers within Git will generally ignore the exit code anyway. So
> for those cases, setting it will at best do nothing, and at worst it may
> confuse the few stragglers (e.g., the git-svn one under recent
> discussion).

Yeah git internals don't care, but we've never advertised the
combination of --auto and gc.autoDetach=true as being something
internal-only, so e.g. I wrote stuff expecting errors, and one might run
"git gc --auto" in a repo whose .git/objects state is uncertain to see
if it needed repack (and have a shell integration that reports
failures...).

> Callers who _are_ prepared to act on the exit code probably ought to
> just use --auto-exit-code in their invocation.
>
> That said, I'm not entirely opposed to the matching config. There's
> enough history here that somebody might want a sledgehammer setting to
> go back to the old behavior.

If it's not a config option then as git is upgraded I'll need to change
my across-server invocation to be some variant of checking git version,
then etiher using the --auto-exit-code option or not (which'll error on
older gits). Easier to be able to just drop in a config setting before
the upgrade.


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jonathan Nieder
Hi,

Ævar Arnfjörð Bjarmason wrote:

> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode'
> configuration option to optionally bring back the 'git gc --auto' exit
> code behavior as it existed between 2.6.3..2.19.0 (inclusive).

Hm.  Can you tell me more about the use case where this would be
helpful to you?  That would help us come up with a better name for it.

Thanks,
Jonathan


Re: [PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Jeff King
On Wed, Oct 10, 2018 at 07:27:32PM +, Ævar Arnfjörð Bjarmason wrote:

> Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode'
> configuration option to optionally bring back the 'git gc --auto' exit
> code behavior as it existed between 2.6.3..2.19.0 (inclusive).
> 
> This was changed in 3029970275 ("gc: do not return error for prior
> errors in daemonized mode", 2018-07-16). The motivation for that patch
> was to appease 3rd party tools whose treatment of the 'git gc --auto'
> exit code is different from that of git core where it has always been
> ignored.

OK. I wouldn't want to use this myself, but I think you've made clear
why you find it useful. So I don't mind making it an optional behavior
(and it probably beats you trying to poke at the logfile yourself).

I'm not sure if the config is going to actually help that much, though.
The callers within Git will generally ignore the exit code anyway. So
for those cases, setting it will at best do nothing, and at worst it may
confuse the few stragglers (e.g., the git-svn one under recent
discussion).

Callers who _are_ prepared to act on the exit code probably ought to
just use --auto-exit-code in their invocation.

That said, I'm not entirely opposed to the matching config. There's
enough history here that somebody might want a sledgehammer setting to
go back to the old behavior.

-Peff


[PATCH] gc: introduce an --auto-exit-code option for undoing 3029970275

2018-10-10 Thread Ævar Arnfjörð Bjarmason
Add an --auto-exit-code variable and a corresponding 'gc.autoExitCode'
configuration option to optionally bring back the 'git gc --auto' exit
code behavior as it existed between 2.6.3..2.19.0 (inclusive).

This was changed in 3029970275 ("gc: do not return error for prior
errors in daemonized mode", 2018-07-16). The motivation for that patch
was to appease 3rd party tools whose treatment of the 'git gc --auto'
exit code is different from that of git core where it has always been
ignored.

That means that out of the three modes gc --auto will operate in:

 1. gc --auto has nothing to do
 2. gc --auto has something to do, will fork and try to do it
 3. gc --auto has something to do, but notices that gc has been failing
before when forked and can't do anything now.

We started exiting with zero in the case of #3, instead of
non-zero (see [1] for more details). As noted by the docs being added
here the #3 case is relatively rare, so I think it's fine to change
this as the default with the assumption that the use-case for tools
like the "repo" tool noted in commit 3029970275 above are more common
than not.

But it left us without any option to have "git gc --auto" tell us
about this failure except by starting to either parse its output, or
for the caller to start breaking the encapsulation and starting to
check for .git/gc.log themselves.

Let's instead provide an option to exit with non-zero when we have
errors to tell the user about, and provide a configuration option so
that it can be dropped in-place in anticipation of upgrading to Git
version 2.20 without having to make using --auto-exit-code conditional
on the git version in use.

1. https://public-inbox.org/git/878t69dgvx@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

> On Wed, Oct 10, 2018 at 09:51:52AM -0700, Jonathan Nieder wrote:
>
>> Ævar Arnfjörð Bjarmason wrote:
>> 
>> > I'm just saying it's hard in this case to remove one piece without the
>> > whole Jenga tower collapsing, and it's probably a good idea in some of
>> > these cases to pester the user about what he wants, but probably not via
>> > gc --auto emitting the same warning every time, e.g. in one of these
>> > threads I suggested maybe "git status" should emit this.
>> 
>> I have to say, I don't have a lot of sympathy for this.
>> 
>> I've been running with the patches I sent before for a while now, and
>> the behavior that they create is great.  I think we can make further
>> refinements on top.  To put it another way, I haven't actually
>> experienced any bad knock-on effects, and I think other feature
>> requests can be addressed separately.
>
> I think there may be some miscommunication here. The Jenga tower above
> is referring (I think) to Jonathan Tan's original patch to drop the
> warning entirely, which does have some unwanted side effects.
>
> Your patches are much less controversial, I think, and are in next and
> marked as "will merge to master" in the last "what's cooking".

[Junio: This goes on top of gitster/jn/gc-auto]

I thought the jn/gc-auto topic was still in "pu", my fault for not
paying attention. It seems the general consensus is against my notion
of what should be the default (which is fine), but since (as noted in
the patch) it seems yucky to start breaking the encapsulation of
gc.log, especially as it's looking more and more likely that it'll be
an implementation detail we might drop, here's a patch on top of
jn/gc-auto to optionally restore the old behavior.

 Documentation/config.txt | 28 
 Documentation/git-gc.txt |  7 +++
 builtin/gc.c |  7 ++-
 t/t6500-gc.sh|  2 ++
 4 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5b72684999..e37a463bf8 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1635,6 +1635,34 @@ gc.autoDetach::
Make `git gc --auto` return immediately and run in background
if the system supports it. Default is true.
 
+gc.autoExitCode::
+   Make `git gc --auto` return non-zero if it would have
+   demonized itself (see `gc.autoDetach`) due to a needed gc, but
+   a 'gc.log' is found within the `gc.logExpiry` with an error
+   from a previous run.
++
+When 'git gc' is run with the default of `gc.autoDetach=true` a
+failure might have been noted in the 'gc.log' from a previously
+detached `--auto` run. If the failure is within the time configured in
+`gc.logExpiry` the `--auto` run will abort early and report the error
+in the 'gc.log'.
++
+From version 2.6.3 to version 2.19 of Git encountering this error
+would cause 'git gc' to exit with non-zero, but this was deemed to be
+a hassle for third-party tools to handle since it rarely happens, and
+they usually don't assume that 'git gc --auto' can fail. Therefore
+since version 2.20 of Git 'git gc --auto' will always exit with zero
+if it would have demonized itself, even when encountering suc