Eric Sunshine <[email protected]> wrote:
> On Mon, May 30, 2016 at 7:21 PM, Eric Wong <[email protected]> wrote:
> > diff --git a/pretty.c b/pretty.c
> > @@ -1697,12 +1699,34 @@ static void pp_handle_indent(struct
> > pretty_print_context *pp,
> > +static regex_t *mboxrd_prepare(void)
> > +{
> > + static regex_t preg;
> > + const char re[] = "^>*From ";
> > + int err = regcomp(&preg, re, REG_NOSUB | REG_EXTENDED);
> > +[...]
> > + return &preg;
> > +}
> > +
> > void pp_remainder(struct pretty_print_context *pp,
> > const char **msg_p,
> > struct strbuf *sb,
> > int indent)
> > {
> > + static regex_t *mboxrd_from;
> > +
> > + if (pp->fmt == CMIT_FMT_MBOXRD && !mboxrd_from)
> > + mboxrd_from = mboxrd_prepare();
> > +
> > @@ -1725,8 +1749,13 @@ void pp_remainder(struct pretty_print_context *pp,
> > else if (pp->expand_tabs_in_log)
> > strbuf_add_tabexpand(sb, pp->expand_tabs_in_log,
> > line, linelen);
> > - else
> > + else {
> > + if (pp->fmt == CMIT_FMT_MBOXRD &&
> > + !regexec(mboxrd_from, line, 0, 0,
> > 0))
> > + strbuf_addch(sb, '>');
>
> At first glance, this seems dangerous since it's handing 'line' to
> regexec() without paying attention to 'linelen'. For an arbitrary
> regex, this could result in erroneous matches on subsequent "lines",
> however, since this expression is anchored with '^', it's not a
> problem. But, it is a bit subtle.
Maybe having more context of the pp_remainder function and
seeing the get_one_line call would've helped in the diff;
I didn't think of this issue once I figured out where to
place the change.
On the other hand, not being too familiar with git C APIs, I was
more worried strbuf was not NUL-terminated for regexec, but it
seems to be.
> I wonder if hand-coding, rather than using a regex, could be an improvement:
>
> static int is_mboxrd_from(const char *s, size_t n)
> {
> size_t f = strlen("From ");
> const char *t = s + n;
>
> while (s < t && *s == '>')
> s++;
> return t - s >= f && !memcmp(s, "From ", f);
> }
>
> or something.
Yikes. I mostly work in high-level languages and do my best to
avoid string parsing in C; so that scares me. A lot.
I admit a regex isn't necessary, but I'm wondering if the above
could be made less frightening to someone like me.
Or maybe I'm just easily frightened :x
Maybe extra test cases + valgrind can quell my fears :)
> > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> > @@ -1565,4 +1565,31 @@ test_expect_success 'format-patch --base overrides
> > format.useAutoBase' '
> > +test_expect_success 'format-patch --pretty=mboxrd' '
> > + cat >msg <<-INPUT_END &&
>
> Maybe use <<-\INPUT_END to indicate that no variable interpolation is
> expected. Ditto below.
Noted, though I may add variable interpolation to test a line
with trailing whitespace (just "From ") to ensure we don't
overrun a buffer.
> Hmm, -A is not POSIX and is otherwise not used in Git tests. Perhaps
> you could use 'git grep --no-index -A' instead or something?
Noted and will fix for v2. Will wait a day or two for
further comments on this series.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html