Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-17 Thread Brandon Williams
On 11/15, Stefan Beller wrote:
> On Fri, Nov 11, 2016 at 3:51 PM, Brandon Williams  wrote:
> > +   /*
> > +* Add basename of parent project
> > +* When performing grep on a  object the filename is prefixed
> > +* with the object's name: ':filename'.
> 
> This comment is hard to read as it's unclear what the  mean.
> (Are the supposed to indicate a variable? If so why is file name not marked 
> up?)

Yeah you're right, the angle brackets don't really add anything to the
comment.  I'll drop them.

> >  In order to
> > +* provide uniformity of output we want to pass the name of the
> > +* parent project's object name to the submodule so the submodule 
> > can
> > +* prefix its output with the parent's name and not its own SHA1.
> > +*/
> > +   if (end_of_base)
> > +   argv_array_pushf(, "--parent-basename=%.*s",
> > +(int) (end_of_base - gs->name),
> > +gs->name);
> 
> Do we pass this only with the tree-ish?
> What if we are grepping the working tree and the file name contains a colon?

Actually you're right, this would only happen if we are passing a
tree-ish, which has a tree-name prefixed to the filename.  I'll add that
as an additional check to ensure that this handles file names with a
colon correctlythough why you have a colon in a filename is beyond
me :P

> > +test_expect_success 'grep tree HEAD^' '
> > +   cat >expect <<-\EOF &&
> > +   HEAD^:a:foobar
> > +   HEAD^:b/b:bar
> > +   HEAD^:submodule/a:foobar
> > +   EOF
> > +
> > +   git grep -e "bar" --recurse-submodules HEAD^ > actual &&
> > +   test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'grep tree HEAD^^' '
> > +   cat >expect <<-\EOF &&
> > +   HEAD^^:a:foobar
> > +   HEAD^^:b/b:bar
> > +   EOF
> > +
> > +   git grep -e "bar" --recurse-submodules HEAD^^ > actual &&
> > +   test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'grep tree and pathspecs' '
> > +   cat >expect <<-\EOF &&
> > +   HEAD:submodule/a:foobar
> > +   HEAD:submodule/sub/a:foobar
> > +   EOF
> > +
> > +   git grep -e "bar" --recurse-submodules HEAD -- submodule > actual &&
> > +   test_cmp expect actual
> > +'
> 
> Mind to add tests for
> * recursive submodules (say 2 levels), preferrably not having the
>   gitlink at the root each, i.e. root has a sub1 at path subs/sub1 and
> sub1 has a sub2
>   at path subs/sub2, such that recursing would produce a path like
>   HEAD:subs/sub1/subs/sub2/dir/file ?
> * file names with a colon in it
> * instead of just HEAD referencing trees, maybe a sha1 referenced test as well
>   (though it is not immediately clear what the benefit would be)
> * what if the submodule doesn't have the commit referenced in the given sha1

I'll add more tests too!

-- 
Brandon Williams


Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-15 Thread Stefan Beller
On Fri, Nov 11, 2016 at 3:51 PM, Brandon Williams  wrote:

> to:
> HEAD:file
> HEAD:sub/file

  Maybe indent this ;)

>  static struct argv_array submodule_options = ARGV_ARRAY_INIT;
> +static const char *parent_basename;
>
>  static int grep_submodule_launch(struct grep_opt *opt,
>  const struct grep_source *gs);
> @@ -535,19 +537,53 @@ static int grep_submodule_launch(struct grep_opt *opt,
>  {
> struct child_process cp = CHILD_PROCESS_INIT;
> int status, i;
> +   const char *end_of_base;
> +   const char *name;
> struct work_item *w = opt->output_priv;
>
> +   end_of_base = strchr(gs->name, ':');
> +   if (end_of_base)
> +   name = end_of_base + 1;
> +   else
> +   name = gs->name;
> +
> prepare_submodule_repo_env(_array);
>
> /* Add super prefix */
> argv_array_pushf(, "--super-prefix=%s%s/",
>  super_prefix ? super_prefix : "",
> -gs->name);
> +name);
> argv_array_push(, "grep");
>
> +   /*
> +* Add basename of parent project
> +* When performing grep on a  object the filename is prefixed
> +* with the object's name: ':filename'.

This comment is hard to read as it's unclear what the  mean.
(Are the supposed to indicate a variable? If so why is file name not marked up?)

>  In order to
> +* provide uniformity of output we want to pass the name of the
> +* parent project's object name to the submodule so the submodule can
> +* prefix its output with the parent's name and not its own SHA1.
> +*/
> +   if (end_of_base)
> +   argv_array_pushf(, "--parent-basename=%.*s",
> +(int) (end_of_base - gs->name),
> +gs->name);

Do we pass this only with the tree-ish?
What if we are grepping the working tree and the file name contains a colon?

> +test_expect_success 'grep tree HEAD^' '
> +   cat >expect <<-\EOF &&
> +   HEAD^:a:foobar
> +   HEAD^:b/b:bar
> +   HEAD^:submodule/a:foobar
> +   EOF
> +
> +   git grep -e "bar" --recurse-submodules HEAD^ > actual &&
> +   test_cmp expect actual
> +'
> +
> +test_expect_success 'grep tree HEAD^^' '
> +   cat >expect <<-\EOF &&
> +   HEAD^^:a:foobar
> +   HEAD^^:b/b:bar
> +   EOF
> +
> +   git grep -e "bar" --recurse-submodules HEAD^^ > actual &&
> +   test_cmp expect actual
> +'
> +
> +test_expect_success 'grep tree and pathspecs' '
> +   cat >expect <<-\EOF &&
> +   HEAD:submodule/a:foobar
> +   HEAD:submodule/sub/a:foobar
> +   EOF
> +
> +   git grep -e "bar" --recurse-submodules HEAD -- submodule > actual &&
> +   test_cmp expect actual
> +'

Mind to add tests for
* recursive submodules (say 2 levels), preferrably not having the
  gitlink at the root each, i.e. root has a sub1 at path subs/sub1 and
sub1 has a sub2
  at path subs/sub2, such that recursing would produce a path like
  HEAD:subs/sub1/subs/sub2/dir/file ?
* file names with a colon in it
* instead of just HEAD referencing trees, maybe a sha1 referenced test as well
  (though it is not immediately clear what the benefit would be)
* what if the submodule doesn't have the commit referenced in the given sha1

Thanks,
Stefan


Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-14 Thread Brandon Williams
On 11/14, Jonathan Tan wrote:
> On 11/14/2016 10:56 AM, Junio C Hamano wrote:
> >Jonathan Tan  writes:
> >
> to:
> HEAD:file
> HEAD:sub/file
> 
> Signed-off-by: Brandon Williams 
> ---
> >>>
> >>>Unrelated tangent, but this makes readers wonder what the updated
> >>>trailer code would do to the last paragraph ;-).  Does it behave
> >>>sensibly (with some sane definition of sensibleness)?
> >>>
> >>>I am guessing that it would, because neither To: or HEAD: is what we
> >>>normally recognize as a known trailer block element.
> >>
> >>Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a
> >>blank line, so the trailer block consists only of that line.
> >
> >Oh, that was not what I was wondering.  Imagine Brandon writing his
> >message that ends in these three questionable lines and then running
> >"commit -s --amend" to add his sign-off---that was the case I was
> >wondering.
> 
> Ah, I see. In that case, it would consider the last block as a
> trailer block and attach it directly:
> 
>   to:
>   HEAD:file
>   HEAD:sub/file
>   Signed-off-by: ...
> 
> It is true that neither to: nor HEAD: are known trailers, but my
> patch set accepts trailer blocks that are 100% well-formed
> regardless of whether the trailers are known (to provide backwards
> compatibility with git-interpret-trailers, and to satisfy the
> certain use cases that I brought up). The "known trailer" check is
> used when the trailer block is not 100% well-formed.
> 
> This issue can be avoided if those lines were indented with at least
> one space or at least one tab.

Who would have thought my simple example would cause this kind of
discussion!  I can update the commit message and indent the output so
that it looks like the following:

to:
  HEAD:file
  HEAD:sub/file

-- 
Brandon Williams


Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-14 Thread Jonathan Tan

On 11/14/2016 10:56 AM, Junio C Hamano wrote:

Jonathan Tan  writes:


to:
HEAD:file
HEAD:sub/file

Signed-off-by: Brandon Williams 
---


Unrelated tangent, but this makes readers wonder what the updated
trailer code would do to the last paragraph ;-).  Does it behave
sensibly (with some sane definition of sensibleness)?

I am guessing that it would, because neither To: or HEAD: is what we
normally recognize as a known trailer block element.


Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a
blank line, so the trailer block consists only of that line.


Oh, that was not what I was wondering.  Imagine Brandon writing his
message that ends in these three questionable lines and then running
"commit -s --amend" to add his sign-off---that was the case I was
wondering.


Ah, I see. In that case, it would consider the last block as a trailer 
block and attach it directly:


  to:
  HEAD:file
  HEAD:sub/file
  Signed-off-by: ...

It is true that neither to: nor HEAD: are known trailers, but my patch 
set accepts trailer blocks that are 100% well-formed regardless of 
whether the trailers are known (to provide backwards compatibility with 
git-interpret-trailers, and to satisfy the certain use cases that I 
brought up). The "known trailer" check is used when the trailer block is 
not 100% well-formed.


This issue can be avoided if those lines were indented with at least one 
space or at least one tab.


Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-14 Thread Junio C Hamano
Jonathan Tan  writes:

>>> to:
>>> HEAD:file
>>> HEAD:sub/file
>>>
>>> Signed-off-by: Brandon Williams 
>>> ---
>>
>> Unrelated tangent, but this makes readers wonder what the updated
>> trailer code would do to the last paragraph ;-).  Does it behave
>> sensibly (with some sane definition of sensibleness)?
>>
>> I am guessing that it would, because neither To: or HEAD: is what we
>> normally recognize as a known trailer block element.
>
> Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a
> blank line, so the trailer block consists only of that line.

Oh, that was not what I was wondering.  Imagine Brandon writing his
message that ends in these three questionable lines and then running
"commit -s --amend" to add his sign-off---that was the case I was
wondering.



Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-14 Thread Jonathan Tan

On 11/14/2016 10:10 AM, Junio C Hamano wrote:

Brandon Williams  writes:


Teach grep to recursively search in submodules when provided with a
 object. This allows grep to search a submodule based on the state
of the submodule that is present in a commit of the super project.

When grep is provided with a  object, the name of the object is
prefixed to all output.  In order to provide uniformity of output
between the parent and child processes the option `--parent-basename`
has been added so that the child can preface all of it's output with the
name of the parent's object instead of the name of the commit SHA1 of
the submodule. This changes output from the command
`git grep -e. -l --recurse-submodules HEAD` from:
HEAD:file
:sub/file

to:
HEAD:file
HEAD:sub/file

Signed-off-by: Brandon Williams 
---


Unrelated tangent, but this makes readers wonder what the updated
trailer code would do to the last paragraph ;-).  Does it behave
sensibly (with some sane definition of sensibleness)?

I am guessing that it would, because neither To: or HEAD: is what we
normally recognize as a known trailer block element.


Yes, it behaves sensibly :-) because "Signed-off-by:" is preceded by a 
blank line, so the trailer block consists only of that line.


Having said that, it is probably better to indent those examples in the 
commit message (by at least one space or one tab) - then they will never 
be confused with trailers (once my patch set is in).


Re: [PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-14 Thread Junio C Hamano
Brandon Williams  writes:

> Teach grep to recursively search in submodules when provided with a
>  object. This allows grep to search a submodule based on the state
> of the submodule that is present in a commit of the super project.
>
> When grep is provided with a  object, the name of the object is
> prefixed to all output.  In order to provide uniformity of output
> between the parent and child processes the option `--parent-basename`
> has been added so that the child can preface all of it's output with the
> name of the parent's object instead of the name of the commit SHA1 of
> the submodule. This changes output from the command
> `git grep -e. -l --recurse-submodules HEAD` from:
> HEAD:file
> :sub/file
>
> to:
> HEAD:file
> HEAD:sub/file
>
> Signed-off-by: Brandon Williams 
> ---

Unrelated tangent, but this makes readers wonder what the updated
trailer code would do to the last paragraph ;-).  Does it behave
sensibly (with some sane definition of sensibleness)?

I am guessing that it would, because neither To: or HEAD: is what we
normally recognize as a known trailer block element.




[PATCH v3 5/6] grep: enable recurse-submodules to work on objects

2016-11-11 Thread Brandon Williams
Teach grep to recursively search in submodules when provided with a
 object. This allows grep to search a submodule based on the state
of the submodule that is present in a commit of the super project.

When grep is provided with a  object, the name of the object is
prefixed to all output.  In order to provide uniformity of output
between the parent and child processes the option `--parent-basename`
has been added so that the child can preface all of it's output with the
name of the parent's object instead of the name of the commit SHA1 of
the submodule. This changes output from the command
`git grep -e. -l --recurse-submodules HEAD` from:
HEAD:file
:sub/file

to:
HEAD:file
HEAD:sub/file

Signed-off-by: Brandon Williams 
---
 Documentation/git-grep.txt | 13 +-
 builtin/grep.c | 83 +++---
 t/t7814-grep-recurse-submodules.sh | 44 +++-
 3 files changed, 131 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 17aa1ba..71f32f3 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -26,7 +26,7 @@ SYNOPSIS
   [--threads ]
   [-f ] [-e] 
   [--and|--or|--not|(|)|-e ...]
-  [--recurse-submodules]
+  [--recurse-submodules] [--parent-basename ]
   [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | 
...]
   [--] [...]
 
@@ -91,7 +91,16 @@ OPTIONS
 
 --recurse-submodules::
Recursively search in each submodule that has been initialized and
-   checked out in the repository.
+   checked out in the repository.  When used in combination with the
+option the prefix of all submodule output will be the name of
+   the parent project's  object.
+
+--parent-basename ::
+   For internal use only.  In order to produce uniform output with the
+   --recurse-submodules option, this option can be used to provide the
+   basename of a parent's  object to a submodule so the submodule
+   can prefix its output with the parent's name rather than the SHA1 of
+   the submodule.
 
 -a::
 --text::
diff --git a/builtin/grep.c b/builtin/grep.c
index 1fd292f..93e5405 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -19,6 +19,7 @@
 #include "dir.h"
 #include "pathspec.h"
 #include "submodule.h"
+#include "submodule-config.h"
 
 static char const * const grep_usage[] = {
N_("git grep [] [-e]  [...] [[--] ...]"),
@@ -28,6 +29,7 @@ static char const * const grep_usage[] = {
 static const char *super_prefix;
 static int recurse_submodules;
 static struct argv_array submodule_options = ARGV_ARRAY_INIT;
+static const char *parent_basename;
 
 static int grep_submodule_launch(struct grep_opt *opt,
 const struct grep_source *gs);
@@ -535,19 +537,53 @@ static int grep_submodule_launch(struct grep_opt *opt,
 {
struct child_process cp = CHILD_PROCESS_INIT;
int status, i;
+   const char *end_of_base;
+   const char *name;
struct work_item *w = opt->output_priv;
 
+   end_of_base = strchr(gs->name, ':');
+   if (end_of_base)
+   name = end_of_base + 1;
+   else
+   name = gs->name;
+
prepare_submodule_repo_env(_array);
 
/* Add super prefix */
argv_array_pushf(, "--super-prefix=%s%s/",
 super_prefix ? super_prefix : "",
-gs->name);
+name);
argv_array_push(, "grep");
 
+   /*
+* Add basename of parent project
+* When performing grep on a  object the filename is prefixed
+* with the object's name: ':filename'.  In order to
+* provide uniformity of output we want to pass the name of the
+* parent project's object name to the submodule so the submodule can
+* prefix its output with the parent's name and not its own SHA1.
+*/
+   if (end_of_base)
+   argv_array_pushf(, "--parent-basename=%.*s",
+(int) (end_of_base - gs->name),
+gs->name);
+
/* Add options */
-   for (i = 0; i < submodule_options.argc; i++)
+   for (i = 0; i < submodule_options.argc; i++) {
+   /*
+* If there is a  identifier for the submodule, add the
+* rev after adding the submodule options but before the
+* pathspecs.  To do this we listen for the '--' and insert the
+* sha1 before pushing the '--' onto the child process argv
+* array.
+*/
+   if (gs->identifier &&
+   !strcmp("--", submodule_options.argv[i])) {
+   argv_array_push(, sha1_to_hex(gs->identifier));
+   }
+
argv_array_push(, submodule_options.argv[i]);
+   }
 
cp.git_cmd = 1;