Re: [PATCH] rev-list/log: document logic with several limiting options
Oh, my gosh, it's not as early as this indicates, rather coffein depletion already: > Message-ID: <5052e1c2.1050...@pobox.com> > Date: Fri, 14 Sep 2012 09:50:26 +0200 > From: Junio C Hamano > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120827 > Thunderbird/15.0 [no, that Thunderbird user isn't Junio] > MIME-Version: 1.0 > To: Michael J Gruber > CC: git@vger.kernel.org > Subject: Re: [PATCH] rev-list/log: document logic with several limiting > options > Michael J Gruber venit, vidit, dixit 14.09.2012 09:46: > [snipped, just adding] > > ...and maybe the meaning of "(or ...)" and "*or*" isn't what I think it > is either? [and that confused persion isn't Junio either] I didn't mean to imposter Junio. Something went wrong with "virtual identity" choosing the From: address. Terribly sorry. Michael (really...) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Michael J Gruber venit, vidit, dixit 14.09.2012 09:46: [snipped, just adding] ...and maybe the meaning of "(or ...)" and "*or*" isn't what I think it is either? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Junio C Hamano venit, vidit, dixit 13.09.2012 23:21: > Michael J Gruber writes: > >> Thanks for "this" ;) > > Here is a replacement to "this", that adds the "--debug" option to > "git grep" and an equivalent "--debug-grep" to "git log" family. > > -- >8 -- > Subject: [PATCH] grep: teach --debug option to dump the parse tree > > Our "grep" allows complex boolean expressions to be formed to match > each individual line with operators like --and, '(', ')' and --not. > Introduce the "--debug" option to show the parse tree to help people > who want to debug and enhance it. > > Also "log" learns "--debug-grep" option to do the same. The command > line parser to the log family is a lot more limited than the general > "git grep" parser, but it has special handling for header matching > (e.g. "--author"), and a parse tree is valuable when working on it. > > Note that "--all-match" is *not* any individual node in the parse > tree. It is an instruction to the evaluator to check all the nodes > in the top-level backbone have matched and reject a document as > non-matching otherwise. I haven't read all your responses yet, but the/my main confusion comes from all-match. My understanding is: --all-match == turn all top-level ORs into ANDs (unless it's all --author at the top-level; and I'm referring to user supplied ORs, not those added by the implementation) Seing (OR foo true) being evaluated to false when foo is false and all-match is in effect is terribly confusing, me thinks (debug output). In fact, I think the behavior described above (if it's correct) is a product of the evolution of grep.c and the current implementation (turning former unions into intersections in some cases, inserting that TRUE node), but not necessarily the intended or preferrable outcome in all corner cases. We AND different types of limit options in any case (this used to be affected by --all-match[?]), and we OR --author options in any case. So having --all-match turn "--grep=foo --grep=bar" from OR into AND would make the most sense at least to me (and I mean: independent of the presence of an --author option). Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Michael J Gruber writes: > Thanks for "this" ;) Here is a replacement to "this", that adds the "--debug" option to "git grep" and an equivalent "--debug-grep" to "git log" family. -- >8 -- Subject: [PATCH] grep: teach --debug option to dump the parse tree Our "grep" allows complex boolean expressions to be formed to match each individual line with operators like --and, '(', ')' and --not. Introduce the "--debug" option to show the parse tree to help people who want to debug and enhance it. Also "log" learns "--debug-grep" option to do the same. The command line parser to the log family is a lot more limited than the general "git grep" parser, but it has special handling for header matching (e.g. "--author"), and a parse tree is valuable when working on it. Note that "--all-match" is *not* any individual node in the parse tree. It is an instruction to the evaluator to check all the nodes in the top-level backbone have matched and reject a document as non-matching otherwise. Signed-off-by: Junio C Hamano --- builtin/grep.c | 3 ++ grep.c | 93 -- grep.h | 1 + revision.c | 2 ++ 4 files changed, 97 insertions(+), 2 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 09ca4c9..1c6b95a 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -817,6 +817,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) N_("indicate hit with exit status without output")), OPT_BOOLEAN(0, "all-match", &opt.all_match, N_("show only matches from files that match all patterns")), + { OPTION_SET_INT, 0, "debug", &opt.debug, NULL, + N_("show parse tree for grep expression"), + PARSE_OPT_NOARG | PARSE_OPT_HIDDEN, NULL, 1 }, OPT_GROUP(""), { OPTION_STRING, 'O', "open-files-in-pager", &show_in_pager, N_("pager"), N_("show matching files in the pager"), diff --git a/grep.c b/grep.c index 04e3ec6..f896d68 100644 --- a/grep.c +++ b/grep.c @@ -332,6 +332,87 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list) return compile_pattern_or(list); } +static void indent(int in) +{ + while (in-- > 0) + fputc(' ', stderr); +} + +static void dump_grep_pat(struct grep_pat *p) +{ + switch (p->token) { + case GREP_AND: fprintf(stderr, "*and*"); break; + case GREP_OPEN_PAREN: fprintf(stderr, "*(*"); break; + case GREP_CLOSE_PAREN: fprintf(stderr, "*)*"); break; + case GREP_NOT: fprintf(stderr, "*not*"); break; + case GREP_OR: fprintf(stderr, "*or*"); break; + + case GREP_PATTERN: fprintf(stderr, "pattern"); break; + case GREP_PATTERN_HEAD: fprintf(stderr, "pattern_head"); break; + case GREP_PATTERN_BODY: fprintf(stderr, "pattern_body"); break; + } + + switch (p->token) { + default: break; + case GREP_PATTERN_HEAD: + fprintf(stderr, "", p->field); break; + case GREP_PATTERN_BODY: + fprintf(stderr, ""); break; + } + switch (p->token) { + default: break; + case GREP_PATTERN_HEAD: + case GREP_PATTERN_BODY: + case GREP_PATTERN: + fprintf(stderr, "%.*s", (int)p->patternlen, p->pattern); + break; + } + fputc('\n', stderr); +} + +static void dump_grep_expression_1(struct grep_expr *x, int in) +{ + indent(in); + switch (x->node) { + case GREP_NODE_TRUE: + fprintf(stderr, "true\n"); + break; + case GREP_NODE_ATOM: + dump_grep_pat(x->u.atom); + break; + case GREP_NODE_NOT: + fprintf(stderr, "(not\n"); + dump_grep_expression_1(x->u.unary, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + case GREP_NODE_AND: + fprintf(stderr, "(and\n"); + dump_grep_expression_1(x->u.binary.left, in+1); + dump_grep_expression_1(x->u.binary.right, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + case GREP_NODE_OR: + fprintf(stderr, "(or\n"); + dump_grep_expression_1(x->u.binary.left, in+1); + dump_grep_expression_1(x->u.binary.right, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + } +} + +void dump_grep_expression(struct grep_opt *opt) +{ + struct grep_expr *x = opt->pattern_expression; + + if (opt->all_match) + fprintf(stderr, "[all-match]\n"); + dump_grep_expression_1(x, 0); + fflush(NULL); +} + static struct grep_expr *grep_true_expr(void) { struct grep_expr *z = xcalloc(1, sizeof(*z)); @@ -395,7 +476,7 @@ static struct grep_expr *prep_header_patterns(struct
Re: [PATCH] rev-list/log: document logic with several limiting options
Junio C Hamano venit, vidit, dixit 12.09.2012 19:25: > Michael J Gruber writes: > >> It was introduced in 0ab7befa with a clear meaning (AND everything), >> then the general logic (without --all-match) was modified in 80235ba7 >> (to take headermatch AND (all greps ORed)), and 5aaeb733 finally made >> multiple authors resp. committers get ORed among each other. All of this >> in an attempt to make the standard usage most useful, of course. As a >> consequence, --all-match does not influence multiple --author options at >> all (contrary to the doc), e.g. >> >> I don't see any of this reflected in the doc, though. I noticed only by >> reading t/t7810-grep.sh. Before that, I had only gone by my own testing >> which didn't reveal the multiple author multiple special casing effect. >> >> I guess I'll have to wrap my head around the current implementation a >> few more times before trying to describe the status quo in the >> documentation... > > This is what I used to use when adding these generalized grep > boolean expressions. > > With this applied, you can try things like these: > > $ git log --grep=regexp --grep=nosuch --all-match >/dev/null > $ git log --grep=regexp --grep=nosuch --author=Michael >/dev/null > > For example, "--all-match --grep=regexp --author=Michael --author=Linus" turns > into > > [all-match] > (or > pattern_bodyregexp > (or > (or >pattern_headLinus >pattern_headMichael > ) > true > ) > ) > > that says "body must have 'regexp' in it, and authored by either > Linus or Michael". > > The semantics of "--all-match" is different from "--and" (which, > together with "--or", ")", and "(", is not availble to rev-list > command line parser primarily because "--not" is not available). > After applying all the "or"-ed terms, it checks the top-level nodes > that are "or"-ed and makes sure all of them fired (possibly and > usually on different lines). Thanks for "this" ;) I'll recheck my understanding and update the doc then. Maybe even putting the above in-tree with a "--debug" option seems inline with things such as "git describe --debug" (and maybe a preparation for exposing a richer interface). Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Junio C Hamano writes: > This is what I used to use when adding these generalized grep > boolean expressions. > > With this applied,... And this is the "this" X-<. grep.c | 90 +- 1 file changed, 89 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index 04e3ec6..566c4cc 100644 --- a/grep.c +++ b/grep.c @@ -332,6 +332,87 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list) return compile_pattern_or(list); } +static void indent(int in) +{ + while (in-- > 0) + fputc(' ', stderr); +} + +static void dump_grep_pat(struct grep_pat *p) +{ + switch (p->token) { + case GREP_AND: fprintf(stderr, "*and*"); break; + case GREP_OPEN_PAREN: fprintf(stderr, "*(*"); break; + case GREP_CLOSE_PAREN: fprintf(stderr, "*)*"); break; + case GREP_NOT: fprintf(stderr, "*not*"); break; + case GREP_OR: fprintf(stderr, "*or*"); break; + + case GREP_PATTERN: fprintf(stderr, "pattern"); break; + case GREP_PATTERN_HEAD: fprintf(stderr, "pattern_head"); break; + case GREP_PATTERN_BODY: fprintf(stderr, "pattern_body"); break; + } + + switch (p->token) { + default: break; + case GREP_PATTERN_HEAD: + fprintf(stderr, "", p->field); break; + case GREP_PATTERN_BODY: + fprintf(stderr, ""); break; + } + switch (p->token) { + default: break; + case GREP_PATTERN_HEAD: + case GREP_PATTERN_BODY: + case GREP_PATTERN: + fprintf(stderr, "%.*s", (int)p->patternlen, p->pattern); + break; + } + fputc('\n', stderr); +} + +static void dump_grep_expression_1(struct grep_expr *x, int in) +{ + indent(in); + switch (x->node) { + case GREP_NODE_TRUE: + fprintf(stderr, "true\n"); + break; + case GREP_NODE_ATOM: + dump_grep_pat(x->u.atom); + break; + case GREP_NODE_NOT: + fprintf(stderr, "(not\n"); + dump_grep_expression_1(x->u.unary, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + case GREP_NODE_AND: + fprintf(stderr, "(and\n"); + dump_grep_expression_1(x->u.binary.left, in+1); + dump_grep_expression_1(x->u.binary.right, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + case GREP_NODE_OR: + fprintf(stderr, "(or\n"); + dump_grep_expression_1(x->u.binary.left, in+1); + dump_grep_expression_1(x->u.binary.right, in+1); + indent(in); + fprintf(stderr, ")\n"); + break; + } +} + +void dump_grep_expression(struct grep_opt *opt) +{ + struct grep_expr *x = opt->pattern_expression; + + if (opt->all_match) + fprintf(stderr, "[all-match]\n"); + dump_grep_expression_1(x, 0); + fflush(NULL); +} + static struct grep_expr *grep_true_expr(void) { struct grep_expr *z = xcalloc(1, sizeof(*z)); @@ -395,7 +476,7 @@ static struct grep_expr *prep_header_patterns(struct grep_opt *opt) return header_expr; } -void compile_grep_patterns(struct grep_opt *opt) +static void compile_grep_patterns_real(struct grep_opt *opt) { struct grep_pat *p; struct grep_expr *header_expr = prep_header_patterns(opt); @@ -432,9 +513,16 @@ void compile_grep_patterns(struct grep_opt *opt) else opt->pattern_expression = grep_or_expr(opt->pattern_expression, header_expr); + opt->all_match = 1; } +void compile_grep_patterns(struct grep_opt *opt) +{ + compile_grep_patterns_real(opt); + dump_grep_expression(opt); +} + static void free_pattern_expr(struct grep_expr *x) { switch (x->node) { -- 1.7.12.414.g1a62b7a -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Michael J Gruber writes: > It was introduced in 0ab7befa with a clear meaning (AND everything), > then the general logic (without --all-match) was modified in 80235ba7 > (to take headermatch AND (all greps ORed)), and 5aaeb733 finally made > multiple authors resp. committers get ORed among each other. All of this > in an attempt to make the standard usage most useful, of course. As a > consequence, --all-match does not influence multiple --author options at > all (contrary to the doc), e.g. > > I don't see any of this reflected in the doc, though. I noticed only by > reading t/t7810-grep.sh. Before that, I had only gone by my own testing > which didn't reveal the multiple author multiple special casing effect. > > I guess I'll have to wrap my head around the current implementation a > few more times before trying to describe the status quo in the > documentation... This is what I used to use when adding these generalized grep boolean expressions. With this applied, you can try things like these: $ git log --grep=regexp --grep=nosuch --all-match >/dev/null $ git log --grep=regexp --grep=nosuch --author=Michael >/dev/null For example, "--all-match --grep=regexp --author=Michael --author=Linus" turns into [all-match] (or pattern_bodyregexp (or (or pattern_headLinus pattern_headMichael ) true ) ) that says "body must have 'regexp' in it, and authored by either Linus or Michael". The semantics of "--all-match" is different from "--and" (which, together with "--or", ")", and "(", is not availble to rev-list command line parser primarily because "--not" is not available). After applying all the "or"-ed terms, it checks the top-level nodes that are "or"-ed and makes sure all of them fired (possibly and usually on different lines). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Junio C Hamano venit, vidit, dixit 11.09.2012 18:22: > Michael J Gruber writes: > >> The current behavior is probably as useful as it is confusing. In any >> case it is going to stay. So, document it. >> >> Signed-off-by: Michael J Gruber >> --- >> I would have written a test but don't really know where to stick it in. >> rev-list has many small tests where it doesn't fit. > > "git log" would be the more natural place, I would say. Well, to me, "git log" is the ui to "git rev-list", so everything which tests the revision walker should be tested using "git rev-list". But I don't care as long as it fits in - and can be found later. That being said, I now see lots of "log --grep" and "log --author" tests in the grep-test. So I guess there it fits best (despite the misnomer). >> >> Documentation/rev-list-options.txt | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/Documentation/rev-list-options.txt >> b/Documentation/rev-list-options.txt >> index 5436eba..9c13df3 100644 >> --- a/Documentation/rev-list-options.txt >> +++ b/Documentation/rev-list-options.txt >> @@ -6,6 +6,12 @@ special notations explained in the description, additional >> commit >> limiting may be applied. Note that they are applied before commit >> ordering and formatting options, such as '--reverse'. >> >> +All occurrences of the same option are ORed: '--grep=foo --grep=bar' >> +limits to commits which match at least one of these conditions. > > How would this interact with "--all-match"? Badly. It was introduced in 0ab7befa with a clear meaning (AND everything), then the general logic (without --all-match) was modified in 80235ba7 (to take headermatch AND (all greps ORed)), and 5aaeb733 finally made multiple authors resp. committers get ORed among each other. All of this in an attempt to make the standard usage most useful, of course. As a consequence, --all-match does not influence multiple --author options at all (contrary to the doc), e.g. I don't see any of this reflected in the doc, though. I noticed only by reading t/t7810-grep.sh. Before that, I had only gone by my own testing which didn't reveal the multiple author multiple special casing effect. I guess I'll have to wrap my head around the current implementation a few more times before trying to describe the status quo in the documentation... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rev-list/log: document logic with several limiting options
Michael J Gruber writes: > The current behavior is probably as useful as it is confusing. In any > case it is going to stay. So, document it. > > Signed-off-by: Michael J Gruber > --- > I would have written a test but don't really know where to stick it in. > rev-list has many small tests where it doesn't fit. "git log" would be the more natural place, I would say. > > Documentation/rev-list-options.txt | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/Documentation/rev-list-options.txt > b/Documentation/rev-list-options.txt > index 5436eba..9c13df3 100644 > --- a/Documentation/rev-list-options.txt > +++ b/Documentation/rev-list-options.txt > @@ -6,6 +6,12 @@ special notations explained in the description, additional > commit > limiting may be applied. Note that they are applied before commit > ordering and formatting options, such as '--reverse'. > > +All occurrences of the same option are ORed: '--grep=foo --grep=bar' > +limits to commits which match at least one of these conditions. How would this interact with "--all-match"? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] rev-list/log: document logic with several limiting options
The current behavior is probably as useful as it is confusing. In any case it is going to stay. So, document it. Signed-off-by: Michael J Gruber --- I would have written a test but don't really know where to stick it in. rev-list has many small tests where it doesn't fit. Documentation/rev-list-options.txt | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 5436eba..9c13df3 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -6,6 +6,12 @@ special notations explained in the description, additional commit limiting may be applied. Note that they are applied before commit ordering and formatting options, such as '--reverse'. +All occurrences of the same option are ORed: '--grep=foo --grep=bar' +limits to commits which match at least one of these conditions. + +Different options are ANDed: '--author=bar --grep=foo' +limits to commits which match both conditions. + -- -n 'number':: -- 1.7.12.463.gbd9d638 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html