Re: [PATCH] grep: Add option --max-line-len
[second try, now with text format] Thanks a lot for the reviews. Replying to both. If I send a follow up, I'll fix the commit description and the help string, remove the shorthand -M, write a more sensible test. 2017-11-23 14:24 GMT-05:00 Eric Sunshine <sunsh...@sunshineco.com>: >> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt >> @@ -127,6 +128,10 @@ OPTIONS >> +-M:: >> +--max-line-len:: >> + Match the pattern only for line shorter or equal to this length. > > This documentation doesn't explain what it means by a line's length. > This implementation seems to take into consideration only the line's > byte count, however, given that displayed lines are normally > tab-expanded, people might intuitively expect that expansion to count > toward the length. A similar question arises for Unicode characters. > > Should this option take tab-expansion and Unicode into account? > Regardless of the answer to that question, the documentation should > make it clear what "line length" means. Excellent question. I don't have an immediate answer. >> diff --git a/grep.c b/grep.c >> @@ -1273,6 +1275,8 @@ static int match_line(struct grep_opt *opt, char *bol, >> char *eol, >> + if (opt->max_line_length > 0 && eol-bol > opt->max_line_length) >> + return 0; > > If the user specifies "-M0", should that error out or at least warn > the user that the value is non-sensical? What about -1, etc.? (These > are UX-related questions; the implementation obviously doesn't care > one way or the other.) Precedent with -A is to ignore the negative value. I don't have a strong opinion. 2017-11-23 20:44 GMT-05:00 Junio C Hamano <gits...@pobox.com>: > > Marc-Antoine Ruel <mar...@chromium.org> writes: > > > This tells git grep to skip files longer than a specified length, > > which is often the result of generators and not actual source files. > > > > ... > > +-M:: > > +--max-line-len:: > > + Match the pattern only for line shorter or equal to this length. > > + > > All the excellent review comments from Eric I agree with. > > With the name of the option and the above end-user facing > description, it is very clear that the only thing this feature does > is to declare that an overlong line does _not_ match when trying to > check against any pattern. > > That is a much clearer definition and description than random new > features people propose here (and kicked back by reviewers, telling > them to make the specification clearer), and I'd commend you for that. > > But it still leaves at least one thing unclear. How should it > interact with "-v"? If we consider an overlong line never matches, > would "git grep -v " should include the line in its output? Ah! No idea. :/ > Speaking of the output, it also makes me wonder if the feature > really wants to include an overlong line as a context line when > showing a near-by line that matches the pattern when -A/-B/-C/-W > options are in use. Even though it is clear that it does from the > above description, is it really the best thing the feature can do to > help the end users? > > Which leads me to suspect that this "feature" might not be the ideal > you wanted to achive, but is an approximate substitution that you > found is "good enough" to simulate what the real thing you wanted to > do, especially when I go back and read the justfication in the > proposed log message that talks about "result of generators". > > Isn't it a property of the entire file, not individual lines, if you > find it uninteresting to see reported by "git grep"? I cannot shake > the suspicion that this feature happened to have ended up in this > shape, instead of "ignore a file with a line this long", only > because your starting point was to use "has overlong lines" as the > heuristic for "not interesting", and because "git grep" code is not > structured to first scan the entire file to decide if it is worth > working on it, and it is extra work to restructure the codeflow to > make it so (which you avoided). > > If your real motivation was either > > (1) whether the file has or does not have the pattern for certain > class of files are uninteresting; do not even run "grep" > processing for them; or > > (2) hits or no-hits may be intereseting but output of overlong > lines from certain class of files I do not wish to see; > > then I can think of two alternatives. > > For (1), can't we tell "result of generators" and other files with > pathspec? If so, perha
[PATCH] grep: Add option --max-line-len
This tells git grep to skip files longer than a specified length, which is often the result of generators and not actual source files. Signed-off-by: Marc-Antoine Ruel <mar...@chromium.org> --- Documentation/git-grep.txt | 5 + builtin/grep.c | 2 ++ grep.c | 4 grep.h | 1 + t/t7810-grep.sh| 5 + 5 files changed, 17 insertions(+) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 18b494731..75081defb 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -10,6 +10,7 @@ SYNOPSIS [verse] 'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp] + [-M | --max-line-len ] [-v | --invert-match] [-h|-H] [--full-name] [-E | --extended-regexp] [-G | --basic-regexp] [-P | --perl-regexp] @@ -127,6 +128,10 @@ OPTIONS beginning of a line, or preceded by a non-word character; end at the end of a line or followed by a non-word character). +-M:: +--max-line-len:: + Match the pattern only for line shorter or equal to this length. + -v:: --invert-match:: Select non-matching lines. diff --git a/builtin/grep.c b/builtin/grep.c index 5a6cfe6b4..cc5c70be5 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -796,6 +796,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) N_("case insensitive matching")), OPT_BOOL('w', "word-regexp", _regexp, N_("match patterns only at word boundaries")), + OPT_INTEGER('M', "max-line-len", _line_length, + N_("ignore lines longer than ")), OPT_SET_INT('a', "text", , N_("process binary files as text"), GREP_BINARY_TEXT), OPT_SET_INT('I', NULL, , diff --git a/grep.c b/grep.c index d0b9b6cdf..881078b82 100644 --- a/grep.c +++ b/grep.c @@ -36,6 +36,7 @@ void init_grep_defaults(void) opt->relative = 1; opt->pathname = 1; opt->max_depth = -1; + opt->max_line_length = -1; opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED; color_set(opt->color_context, ""); color_set(opt->color_filename, ""); @@ -151,6 +152,7 @@ void grep_init(struct grep_opt *opt, const char *prefix) opt->pattern_type_option = def->pattern_type_option; opt->linenum = def->linenum; opt->max_depth = def->max_depth; + opt->max_line_length = def->max_line_length; opt->pathname = def->pathname; opt->relative = def->relative; opt->output = def->output; @@ -1273,6 +1275,8 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol, struct grep_pat *p; regmatch_t match; + if (opt->max_line_length > 0 && eol-bol > opt->max_line_length) + return 0; if (opt->extended) return match_expr(opt, bol, eol, ctx, collect_hits); diff --git a/grep.h b/grep.h index 399381c90..0e76c0a19 100644 --- a/grep.h +++ b/grep.h @@ -151,6 +151,7 @@ struct grep_opt { int null_following_name; int color; int max_depth; + int max_line_length; int funcname; int funcbody; int extended_regexp_option; diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 2a6679c2f..c514bd388 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -766,6 +766,11 @@ test_expect_success 'grep -W shows no trailing empty lines' ' test_cmp expected actual ' +test_expect_success 'grep skips long lines' ' + git grep -M18 -W include >actual && + test_cmp expected actual +' + cat >expected <
git-svn fails to initialize with submodule setup, when '.git' is a file and not a directory
Repro: 1. git clone --recursive a repository with submodules. 2. cd checkout/submoduleA 3. git svn init svn://host/repo Expected: Works as usual Actual: /path/to/checkout/submoduleA/.git/refs: Not a directory init: command returned error: 1 Why: submoduleA/.git is a file, not a directory. $ cat /path/to/checkout/submoduleA/.git gitdir: ../.git/modules/submoduleA $ git --version git version 1.8.4 Thanks, M-A -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html