Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-14 Thread Michael J Gruber
Oh, my gosh, it's not as early as this indicates, rather coffein
depletion already:

> Message-ID: <5052e1c2.1050...@pobox.com>
> Date: Fri, 14 Sep 2012 09:50:26 +0200
> From: Junio C Hamano 
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120827 
> Thunderbird/15.0

[no, that Thunderbird user isn't Junio]

> MIME-Version: 1.0
> To: Michael J Gruber 
> CC: git@vger.kernel.org
> Subject: Re: [PATCH] rev-list/log: document logic with several limiting 
> options

> Michael J Gruber venit, vidit, dixit 14.09.2012 09:46:
> [snipped, just adding]
> 
> ...and maybe the meaning of "(or ...)" and "*or*" isn't what I think it
> is either?

[and that confused persion isn't Junio either]

I didn't mean to imposter Junio. Something went wrong with "virtual
identity" choosing the From: address. Terribly sorry.

Michael (really...)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-14 Thread Junio C Hamano
Michael J Gruber venit, vidit, dixit 14.09.2012 09:46:
[snipped, just adding]

...and maybe the meaning of "(or ...)" and "*or*" isn't what I think it
is either?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-14 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 13.09.2012 23:21:
> Michael J Gruber  writes:
> 
>> Thanks for "this" ;)
> 
> Here is a replacement to "this", that adds the "--debug" option to
> "git grep" and an equivalent "--debug-grep" to "git log" family.
> 
> -- >8 --
> Subject: [PATCH] grep: teach --debug option to dump the parse tree
> 
> Our "grep" allows complex boolean expressions to be formed to match
> each individual line with operators like --and, '(', ')' and --not.
> Introduce the "--debug" option to show the parse tree to help people
> who want to debug and enhance it.
> 
> Also "log" learns "--debug-grep" option to do the same.  The command
> line parser to the log family is a lot more limited than the general
> "git grep" parser, but it has special handling for header matching
> (e.g. "--author"), and a parse tree is valuable when working on it.
> 
> Note that "--all-match" is *not* any individual node in the parse
> tree.  It is an instruction to the evaluator to check all the nodes
> in the top-level backbone have matched and reject a document as
> non-matching otherwise.

I haven't read all your responses yet, but the/my main confusion comes
from all-match. My understanding is:

--all-match == turn all top-level ORs into ANDs

(unless it's all --author at the top-level; and I'm referring to user
supplied ORs, not those added by the implementation)

Seing (OR foo true) being evaluated to false when foo is false and
all-match is in effect is terribly confusing, me thinks (debug output).

In fact, I think the behavior described above (if it's correct) is a
product of the evolution of grep.c and the current implementation
(turning former unions into intersections in some cases, inserting that
TRUE node), but not necessarily the intended or preferrable outcome in
all corner cases.

We AND different types of limit options in any case (this used to be
affected by --all-match[?]), and we OR --author options in any case. So
having --all-match turn "--grep=foo --grep=bar" from OR into AND would
make the most sense at least to me (and I mean: independent of the
presence of an --author option).

Michael
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-13 Thread Junio C Hamano
Michael J Gruber  writes:

> Thanks for "this" ;)

Here is a replacement to "this", that adds the "--debug" option to
"git grep" and an equivalent "--debug-grep" to "git log" family.

-- >8 --
Subject: [PATCH] grep: teach --debug option to dump the parse tree

Our "grep" allows complex boolean expressions to be formed to match
each individual line with operators like --and, '(', ')' and --not.
Introduce the "--debug" option to show the parse tree to help people
who want to debug and enhance it.

Also "log" learns "--debug-grep" option to do the same.  The command
line parser to the log family is a lot more limited than the general
"git grep" parser, but it has special handling for header matching
(e.g. "--author"), and a parse tree is valuable when working on it.

Note that "--all-match" is *not* any individual node in the parse
tree.  It is an instruction to the evaluator to check all the nodes
in the top-level backbone have matched and reject a document as
non-matching otherwise.

Signed-off-by: Junio C Hamano 
---
 builtin/grep.c |  3 ++
 grep.c | 93 --
 grep.h |  1 +
 revision.c |  2 ++
 4 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 09ca4c9..1c6b95a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -817,6 +817,9 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
   N_("indicate hit with exit status without output")),
OPT_BOOLEAN(0, "all-match", &opt.all_match,
N_("show only matches from files that match all 
patterns")),
+   { OPTION_SET_INT, 0, "debug", &opt.debug, NULL,
+ N_("show parse tree for grep expression"),
+ PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, NULL, 1 },
OPT_GROUP(""),
{ OPTION_STRING, 'O', "open-files-in-pager", &show_in_pager,
N_("pager"), N_("show matching files in the pager"),
diff --git a/grep.c b/grep.c
index 04e3ec6..f896d68 100644
--- a/grep.c
+++ b/grep.c
@@ -332,6 +332,87 @@ static struct grep_expr *compile_pattern_expr(struct 
grep_pat **list)
return compile_pattern_or(list);
 }
 
+static void indent(int in)
+{
+   while (in-- > 0)
+   fputc(' ', stderr);
+}
+
+static void dump_grep_pat(struct grep_pat *p)
+{
+   switch (p->token) {
+   case GREP_AND: fprintf(stderr, "*and*"); break;
+   case GREP_OPEN_PAREN: fprintf(stderr, "*(*"); break;
+   case GREP_CLOSE_PAREN: fprintf(stderr, "*)*"); break;
+   case GREP_NOT: fprintf(stderr, "*not*"); break;
+   case GREP_OR: fprintf(stderr, "*or*"); break;
+
+   case GREP_PATTERN: fprintf(stderr, "pattern"); break;
+   case GREP_PATTERN_HEAD: fprintf(stderr, "pattern_head"); break;
+   case GREP_PATTERN_BODY: fprintf(stderr, "pattern_body"); break;
+   }
+
+   switch (p->token) {
+   default: break;
+   case GREP_PATTERN_HEAD:
+   fprintf(stderr, "", p->field); break;
+   case GREP_PATTERN_BODY:
+   fprintf(stderr, ""); break;
+   }
+   switch (p->token) {
+   default: break;
+   case GREP_PATTERN_HEAD:
+   case GREP_PATTERN_BODY:
+   case GREP_PATTERN:
+   fprintf(stderr, "%.*s", (int)p->patternlen, p->pattern);
+   break;
+   }
+   fputc('\n', stderr);
+}
+
+static void dump_grep_expression_1(struct grep_expr *x, int in)
+{
+   indent(in);
+   switch (x->node) {
+   case GREP_NODE_TRUE:
+   fprintf(stderr, "true\n");
+   break;
+   case GREP_NODE_ATOM:
+   dump_grep_pat(x->u.atom);
+   break;
+   case GREP_NODE_NOT:
+   fprintf(stderr, "(not\n");
+   dump_grep_expression_1(x->u.unary, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   case GREP_NODE_AND:
+   fprintf(stderr, "(and\n");
+   dump_grep_expression_1(x->u.binary.left, in+1);
+   dump_grep_expression_1(x->u.binary.right, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   case GREP_NODE_OR:
+   fprintf(stderr, "(or\n");
+   dump_grep_expression_1(x->u.binary.left, in+1);
+   dump_grep_expression_1(x->u.binary.right, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   }
+}
+
+void dump_grep_expression(struct grep_opt *opt)
+{
+   struct grep_expr *x = opt->pattern_expression;
+
+   if (opt->all_match)
+   fprintf(stderr, "[all-match]\n");
+   dump_grep_expression_1(x, 0);
+   fflush(NULL);
+}
+
 static struct grep_expr *grep_true_expr(void)
 {
struct grep_expr *z = xcalloc(1, sizeof(*z));
@@ -395,7 +476,7 @@ static struct grep_expr *prep_header_patterns(struct 

Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-13 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 12.09.2012 19:25:
> Michael J Gruber  writes:
> 
>> It was introduced in 0ab7befa with a clear meaning (AND everything),
>> then the general logic (without --all-match) was modified in 80235ba7
>> (to take headermatch AND (all greps ORed)), and 5aaeb733 finally made
>> multiple authors resp. committers get ORed among each other. All of this
>> in an attempt to make the standard usage most useful, of course. As a
>> consequence, --all-match does not influence multiple --author options at
>> all (contrary to the doc), e.g.
>>
>> I don't see any of this reflected in the doc, though. I noticed only by
>> reading t/t7810-grep.sh. Before that, I had only gone by my own testing
>> which didn't reveal the multiple author multiple special casing effect.
>>
>> I guess I'll have to wrap my head around the current implementation a
>> few more times before trying to describe the status quo in the
>> documentation...
> 
> This is what I used to use when adding these generalized grep
> boolean expressions.
> 
> With this applied, you can try things like these:
> 
> $ git log --grep=regexp --grep=nosuch --all-match >/dev/null
> $ git log --grep=regexp --grep=nosuch --author=Michael >/dev/null
> 
> For example, "--all-match --grep=regexp --author=Michael --author=Linus" turns
> into
> 
> [all-match]
> (or
>  pattern_bodyregexp
>  (or
>   (or
>pattern_headLinus
>pattern_headMichael
>   )
>   true
>  )
> )
> 
> that says "body must have 'regexp' in it, and authored by either
> Linus or Michael".
> 
> The semantics of "--all-match" is different from "--and" (which,
> together with "--or", ")", and "(", is not availble to rev-list
> command line parser primarily because "--not" is not available).
> After applying all the "or"-ed terms, it checks the top-level nodes
> that are "or"-ed and makes sure all of them fired (possibly and
> usually on different lines).

Thanks for "this" ;)

I'll recheck my understanding and update the doc then. Maybe even
putting the above in-tree with a "--debug" option seems inline with
things such as "git describe --debug" (and maybe a preparation for
exposing a richer interface).

Michael
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-12 Thread Junio C Hamano
Junio C Hamano  writes:

> This is what I used to use when adding these generalized grep
> boolean expressions.
>
> With this applied,...

And this is the "this" X-<.

 grep.c | 90 +-
 1 file changed, 89 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 04e3ec6..566c4cc 100644
--- a/grep.c
+++ b/grep.c
@@ -332,6 +332,87 @@ static struct grep_expr *compile_pattern_expr(struct 
grep_pat **list)
return compile_pattern_or(list);
 }
 
+static void indent(int in)
+{
+   while (in-- > 0)
+   fputc(' ', stderr);
+}
+
+static void dump_grep_pat(struct grep_pat *p)
+{
+   switch (p->token) {
+   case GREP_AND: fprintf(stderr, "*and*"); break;
+   case GREP_OPEN_PAREN: fprintf(stderr, "*(*"); break;
+   case GREP_CLOSE_PAREN: fprintf(stderr, "*)*"); break;
+   case GREP_NOT: fprintf(stderr, "*not*"); break;
+   case GREP_OR: fprintf(stderr, "*or*"); break;
+
+   case GREP_PATTERN: fprintf(stderr, "pattern"); break;
+   case GREP_PATTERN_HEAD: fprintf(stderr, "pattern_head"); break;
+   case GREP_PATTERN_BODY: fprintf(stderr, "pattern_body"); break;
+   }
+
+   switch (p->token) {
+   default: break;
+   case GREP_PATTERN_HEAD:
+   fprintf(stderr, "", p->field); break;
+   case GREP_PATTERN_BODY:
+   fprintf(stderr, ""); break;
+   }
+   switch (p->token) {
+   default: break;
+   case GREP_PATTERN_HEAD:
+   case GREP_PATTERN_BODY:
+   case GREP_PATTERN:
+   fprintf(stderr, "%.*s", (int)p->patternlen, p->pattern);
+   break;
+   }
+   fputc('\n', stderr);
+}
+
+static void dump_grep_expression_1(struct grep_expr *x, int in)
+{
+   indent(in);
+   switch (x->node) {
+   case GREP_NODE_TRUE:
+   fprintf(stderr, "true\n");
+   break;
+   case GREP_NODE_ATOM:
+   dump_grep_pat(x->u.atom);
+   break;
+   case GREP_NODE_NOT:
+   fprintf(stderr, "(not\n");
+   dump_grep_expression_1(x->u.unary, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   case GREP_NODE_AND:
+   fprintf(stderr, "(and\n");
+   dump_grep_expression_1(x->u.binary.left, in+1);
+   dump_grep_expression_1(x->u.binary.right, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   case GREP_NODE_OR:
+   fprintf(stderr, "(or\n");
+   dump_grep_expression_1(x->u.binary.left, in+1);
+   dump_grep_expression_1(x->u.binary.right, in+1);
+   indent(in);
+   fprintf(stderr, ")\n");
+   break;
+   }
+}
+
+void dump_grep_expression(struct grep_opt *opt)
+{
+   struct grep_expr *x = opt->pattern_expression;
+
+   if (opt->all_match)
+   fprintf(stderr, "[all-match]\n");
+   dump_grep_expression_1(x, 0);
+   fflush(NULL);
+}
+
 static struct grep_expr *grep_true_expr(void)
 {
struct grep_expr *z = xcalloc(1, sizeof(*z));
@@ -395,7 +476,7 @@ static struct grep_expr *prep_header_patterns(struct 
grep_opt *opt)
return header_expr;
 }
 
-void compile_grep_patterns(struct grep_opt *opt)
+static void compile_grep_patterns_real(struct grep_opt *opt)
 {
struct grep_pat *p;
struct grep_expr *header_expr = prep_header_patterns(opt);
@@ -432,9 +513,16 @@ void compile_grep_patterns(struct grep_opt *opt)
else
opt->pattern_expression = grep_or_expr(opt->pattern_expression,
   header_expr);
+
opt->all_match = 1;
 }
 
+void compile_grep_patterns(struct grep_opt *opt)
+{
+   compile_grep_patterns_real(opt);
+   dump_grep_expression(opt);
+}
+
 static void free_pattern_expr(struct grep_expr *x)
 {
switch (x->node) {
-- 
1.7.12.414.g1a62b7a

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-12 Thread Junio C Hamano
Michael J Gruber  writes:

> It was introduced in 0ab7befa with a clear meaning (AND everything),
> then the general logic (without --all-match) was modified in 80235ba7
> (to take headermatch AND (all greps ORed)), and 5aaeb733 finally made
> multiple authors resp. committers get ORed among each other. All of this
> in an attempt to make the standard usage most useful, of course. As a
> consequence, --all-match does not influence multiple --author options at
> all (contrary to the doc), e.g.
>
> I don't see any of this reflected in the doc, though. I noticed only by
> reading t/t7810-grep.sh. Before that, I had only gone by my own testing
> which didn't reveal the multiple author multiple special casing effect.
>
> I guess I'll have to wrap my head around the current implementation a
> few more times before trying to describe the status quo in the
> documentation...

This is what I used to use when adding these generalized grep
boolean expressions.

With this applied, you can try things like these:

$ git log --grep=regexp --grep=nosuch --all-match >/dev/null
$ git log --grep=regexp --grep=nosuch --author=Michael >/dev/null

For example, "--all-match --grep=regexp --author=Michael --author=Linus" turns
into

[all-match]
(or
 pattern_bodyregexp
 (or
  (or
   pattern_headLinus
   pattern_headMichael
  )
  true
 )
)

that says "body must have 'regexp' in it, and authored by either
Linus or Michael".

The semantics of "--all-match" is different from "--and" (which,
together with "--or", ")", and "(", is not availble to rev-list
command line parser primarily because "--not" is not available).
After applying all the "or"-ed terms, it checks the top-level nodes
that are "or"-ed and makes sure all of them fired (possibly and
usually on different lines).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-12 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 11.09.2012 18:22:
> Michael J Gruber  writes:
> 
>> The current behavior is probably as useful as it is confusing. In any
>> case it is going to stay. So, document it.
>>
>> Signed-off-by: Michael J Gruber 
>> ---
>> I would have written a test but don't really know where to stick it in.
>> rev-list has many small tests where it doesn't fit.
> 
> "git log" would be the more natural place, I would say.

Well, to me, "git log" is the ui to "git rev-list", so everything which
tests the revision walker should be tested using "git rev-list". But I
don't care as long as it fits in - and can be found later.

That being said, I now see lots of "log --grep" and "log --author" tests
in the grep-test. So I guess there it fits best (despite the misnomer).

>>
>>  Documentation/rev-list-options.txt | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Documentation/rev-list-options.txt 
>> b/Documentation/rev-list-options.txt
>> index 5436eba..9c13df3 100644
>> --- a/Documentation/rev-list-options.txt
>> +++ b/Documentation/rev-list-options.txt
>> @@ -6,6 +6,12 @@ special notations explained in the description, additional 
>> commit
>>  limiting may be applied. Note that they are applied before commit
>>  ordering and formatting options, such as '--reverse'.
>>  
>> +All occurrences of the same option are ORed: '--grep=foo --grep=bar'
>> +limits to commits which match at least one of these conditions.
> 
> How would this interact with "--all-match"?

Badly.

It was introduced in 0ab7befa with a clear meaning (AND everything),
then the general logic (without --all-match) was modified in 80235ba7
(to take headermatch AND (all greps ORed)), and 5aaeb733 finally made
multiple authors resp. committers get ORed among each other. All of this
in an attempt to make the standard usage most useful, of course. As a
consequence, --all-match does not influence multiple --author options at
all (contrary to the doc), e.g.

I don't see any of this reflected in the doc, though. I noticed only by
reading t/t7810-grep.sh. Before that, I had only gone by my own testing
which didn't reveal the multiple author multiple special casing effect.

I guess I'll have to wrap my head around the current implementation a
few more times before trying to describe the status quo in the
documentation...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rev-list/log: document logic with several limiting options

2012-09-11 Thread Junio C Hamano
Michael J Gruber  writes:

> The current behavior is probably as useful as it is confusing. In any
> case it is going to stay. So, document it.
>
> Signed-off-by: Michael J Gruber 
> ---
> I would have written a test but don't really know where to stick it in.
> rev-list has many small tests where it doesn't fit.

"git log" would be the more natural place, I would say.

>
>  Documentation/rev-list-options.txt | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/rev-list-options.txt 
> b/Documentation/rev-list-options.txt
> index 5436eba..9c13df3 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -6,6 +6,12 @@ special notations explained in the description, additional 
> commit
>  limiting may be applied. Note that they are applied before commit
>  ordering and formatting options, such as '--reverse'.
>  
> +All occurrences of the same option are ORed: '--grep=foo --grep=bar'
> +limits to commits which match at least one of these conditions.

How would this interact with "--all-match"?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] rev-list/log: document logic with several limiting options

2012-09-11 Thread Michael J Gruber
The current behavior is probably as useful as it is confusing. In any
case it is going to stay. So, document it.

Signed-off-by: Michael J Gruber 
---
I would have written a test but don't really know where to stick it in.
rev-list has many small tests where it doesn't fit.

 Documentation/rev-list-options.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 5436eba..9c13df3 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -6,6 +6,12 @@ special notations explained in the description, additional 
commit
 limiting may be applied. Note that they are applied before commit
 ordering and formatting options, such as '--reverse'.
 
+All occurrences of the same option are ORed: '--grep=foo --grep=bar'
+limits to commits which match at least one of these conditions.
+
+Different options are ANDed: '--author=bar --grep=foo'
+limits to commits which match both conditions.
+
 --
 
 -n 'number'::
-- 
1.7.12.463.gbd9d638

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html