Re: [PATCH] builtin/describe: introduce --broken flag

2017-03-21 Thread Stefan Beller
On Tue, Mar 21, 2017 at 3:41 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> AFAICT, someone is (was?) using a version of Git that doesn't contain
>> f8eaa0ba98 (submodule--helper, module_clone: always operate
>> on absolute paths, 2016-03-31). So then the submodule paths were
>> made absolute paths on creation of the Gerrit repo.
>>
>> And then someone moved the repo and the absolute paths broke.
>>
>> Even after an upgrade of Git to its latest and greatest version, the 
>> underlying
>> issue of having broken submodule paths remains in that case.
>>
>> So there are a couple of ways forward
>> 0) as an immediate fix, manually fix the absolute path or make them relative
>>
>> 1A) have more error resilient tools in Git
>> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
>> the .git file link and the core.worktree setting to be relative and 
>> correct.
>>
>> I think we should do both A and B, I decided to go with A first, specifically
>> "git-describe" as that was reported to not work well in this situation with 
>> the
>> given broken data
>
> Sounds sensible.  I would say that we should do less 1A (i.e. hiding
> problems under the rug) and more 1B.  Of course, making the problem
> less likely to happen in the first place would be more important ;-)

Agreed on not introducing bugs as often. For (B), I wonder what the
right place is.

Maybe "reset --hard", in combination with "--recurse-submodules" would also
fix these submodule path issues. The made up "git submodule fsck-setup"
sounds like it would only detect (and not fix) the problems. But these absolute
paths issues are known how to fix, so a "fsck"-y tool may convey a wrong
impression.


Re: [PATCH] builtin/describe: introduce --broken flag

2017-03-21 Thread Junio C Hamano
Stefan Beller  writes:

> AFAICT, someone is (was?) using a version of Git that doesn't contain
> f8eaa0ba98 (submodule--helper, module_clone: always operate
> on absolute paths, 2016-03-31). So then the submodule paths were
> made absolute paths on creation of the Gerrit repo.
>
> And then someone moved the repo and the absolute paths broke.
>
> Even after an upgrade of Git to its latest and greatest version, the 
> underlying
> issue of having broken submodule paths remains in that case.
>
> So there are a couple of ways forward
> 0) as an immediate fix, manually fix the absolute path or make them relative
>
> 1A) have more error resilient tools in Git
> 1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
> the .git file link and the core.worktree setting to be relative and 
> correct.
>
> I think we should do both A and B, I decided to go with A first, specifically
> "git-describe" as that was reported to not work well in this situation with 
> the
> given broken data

Sounds sensible.  I would say that we should do less 1A (i.e. hiding
problems under the rug) and more 1B.  Of course, making the problem
less likely to happen in the first place would be more important ;-)

Thanks.


Re: [PATCH] builtin/describe: introduce --broken flag

2017-03-21 Thread Stefan Beller
On Tue, Mar 21, 2017 at 2:51 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> This patch helps to fix the root cause in [1], which tries to work around
>> this situation.
>
> I do not necessarily think it is reasonable to give $version-dirty
> and proceed when a repository corruption is detected; if there is a
> breakage in the repository, "git describe" is correct to die when a
> populated submodule is broken.  IOW, I do not agree that [1] below
> is working with a sensible expectation.

ok, so I won't quote it in the commit message

>
> This is a tangent, but how does the Gerrit repository get corrupted
> in the way described in [1] in the first place?  That might be what
> needs to be corrected, perhaps?

AFAICT, someone is (was?) using a version of Git that doesn't contain
f8eaa0ba98 (submodule--helper, module_clone: always operate
on absolute paths, 2016-03-31). So then the submodule paths were
made absolute paths on creation of the Gerrit repo.

And then someone moved the repo and the absolute paths broke.

Even after an upgrade of Git to its latest and greatest version, the underlying
issue of having broken submodule paths remains in that case.

So there are a couple of ways forward
0) as an immediate fix, manually fix the absolute path or make them relative

1A) have more error resilient tools in Git
1B) have a tool in git (e.g. "git submodule fsck-setup") that rewrites
the .git file link and the core.worktree setting to be relative and correct.

I think we should do both A and B, I decided to go with A first, specifically
"git-describe" as that was reported to not work well in this situation with the
given broken data

> The wording for the "--dirty" is already awkward, but this one is
> even more so ("Describe the working tree. It means" conveys no
> useful information).  I however cannot come up with something much
> better.  This is the best I could come up with:
>
> Describe the state of the working tree.  When the working
> tree matches HEAD, the output is the same as "git describe
> HEAD" and "-dirty" is appended to it if the working tree has
> local modification.  When a repository is corrupt and Git
> cannot determine if there is local modification, instead of
> dying, append "-broken" instead.

ok, I'll reuse parts of it.

>> +static const char *append, *dirty, *broken;
>
> Perhaps call it "suffix" or something?

done


>> + argv_array_pushl(&args, "diff-index", "--quiet", 
>> "HEAD", "--", NULL);
..
>  Wouldn't argv_array_pushv() into these two different args
> array from the same template work better?

yes, looks much saner. Thanks


Re: [PATCH] builtin/describe: introduce --broken flag

2017-03-21 Thread Junio C Hamano
Stefan Beller  writes:

> This patch helps to fix the root cause in [1], which tries to work around
> this situation.

I do not necessarily think it is reasonable to give $version-dirty
and proceed when a repository corruption is detected; if there is a
breakage in the repository, "git describe" is correct to die when a
populated submodule is broken.  IOW, I do not agree that [1] below
is working with a sensible expectation.

This is a tangent, but how does the Gerrit repository get corrupted
in the way described in [1] in the first place?  That might be what
needs to be corrected, perhaps?

>  Documentation/git-describe.txt |  7 +
>  builtin/describe.c | 59 
> ++
>  t/t6120-describe.sh| 20 ++
>  3 files changed, 70 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt
> index 8755f3af7b..b71fa7a4ad 100644
> --- a/Documentation/git-describe.txt
> +++ b/Documentation/git-describe.txt
> @@ -34,6 +34,13 @@ OPTIONS
>   It means describe HEAD and appends  (`-dirty` by
>   default) if the working tree is dirty.
>  
> +--broken[=]::
> + Describe the working tree.
> + It means describe HEAD and appends  (`-broken` by
> + default) if the working tree cannot be examined for dirtiness.
> + This implies `--dirty`, which is the fallback behavior when
> + the working tree examination works.

The wording for the "--dirty" is already awkward, but this one is
even more so ("Describe the working tree. It means" conveys no
useful information).  I however cannot come up with something much
better.  This is the best I could come up with:

Describe the state of the working tree.  When the working
tree matches HEAD, the output is the same as "git describe
HEAD" and "-dirty" is appended to it if the working tree has
local modification.  When a repository is corrupt and Git
cannot determine if there is local modification, instead of
dying, append "-broken" instead.

The last sentence can be tweaked to replace the description for
"--dirty".

> diff --git a/builtin/describe.c b/builtin/describe.c
> index 76c18059bf..37a83520c9 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -9,6 +9,7 @@
>  #include "diff.h"
>  #include "hashmap.h"
>  #include "argv-array.h"
> +#include "run-command.h"
>  
>  #define SEEN (1u << 0)
>  #define MAX_TAGS (FLAG_BITS - 1)
> @@ -31,12 +32,7 @@ static int have_util;
>  static struct string_list patterns = STRING_LIST_INIT_NODUP;
>  static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
>  static int always;
> -static const char *dirty;
> -
> -/* diff-index command arguments to check if working tree is dirty. */
> -static const char *diff_index_args[] = {
> - "diff-index", "--quiet", "HEAD", "--", NULL
> -};
> +static const char *append, *dirty, *broken;

Perhaps call it "suffix" or something?

> + if (broken) {
> + struct child_process cp = CHILD_PROCESS_INIT;
> + argv_array_pushl(&cp.args, "diff-index", "--quiet", 
> "HEAD", "--", NULL);
> ...
> + }
> + } else if (dirty) {
> + struct argv_array args = ARGV_ARRAY_INIT;
>   static struct lock_file index_lock;
>   int fd;
>  
> + argv_array_pushl(&args, "diff-index", "--quiet", 
> "HEAD", "--", NULL);


Somehow it looks like the patch is making the code a lot worse by
losing diff_index_args[] and duplicating the same command line twice
in the code.  Wouldn't argv_array_pushv() into these two different args
array from the same template work better?