On Mon, Nov 06, 2017 at 12:50:45PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Some replies to the thread in general, didn't want to spread this out
> into different replies.
> 
>  * Yes this sucks.
> 
>  * Just emitting a warning without an appropriate exit code would suck
>    more, would break batch jobs & whatnot that expcept certain results
>    from grep.

That was my first thought, too, but something that does:

  git grep foo && echo found

would behave basically the same. Do you mean here scripts that actually
do:

  git grep foo
  case "$?" in
  0) echo found ;;
  1) echo not found ;;
  *) echo wtf? ;;
  esac

I agree it would be nice to at least have _some_ way to distinguish
between those final two cases.

Though something like "git log --grep" is even more complicated. We
perhaps _would_ want to distinguish between:

  - match (in which case we print the commit)

  - no match (in which case we do not)

  - error (in which case we do not print, but also mark the exit code as
    failing)

>  * As you point out it would be nice to print out the file name we
>    didn't match on, we'd need to pass the grep_source struct down
>    further, it goes as far as grep_source_1 but stops there and isn't
>    passed to e.g. look_ahead(), which calls patmatch() which calls the
>    engine-specific matcher and would need to report the error. We could
>    just do this, would slow down things a bit (probably trivally) but we
>    could emit better error messages in genreal.

I'm not sure if the grep_source has enough information for all cases.
E.g., if you hit an error while grepping in commit headers, you'd
probably want to mention the oid of the commit. There's an "identifier"
field in the grep_source, but it's opaque.

The caller may also want to do more things than just print an error
(like the exit code adjustment I mentioned above). Which implies to me
we should be passing the error information up, not trying to bring the
context down.

>  * You can adjust these limts in PCRE in Git, although it doesn't help
>    in this case, you just add (*LIMIT_MATCH=NUM) or
>    (*LIMIT_RECURSION=NUM) (or both) to the start of the pattern. See
>    pcresyntax(3) or pcre2syntax(3) man pages depending on what version
>    you have installed.

I saw that in the pcre manual, but I had the impression you can't
actually raise the limits above the defaults with that, only lower them.

>  * While regexec() won't return an error its version of dealing with
>    this is (at least under glibc) to balloon CPU/memory use until the
>    OOMkiller kills git (although not on this particular pattern).

So in a sense our current behavior with pcre is the same. We just have
to provoke the death ourselves. ;)

-Peff

Reply via email to