On Sun, Apr 22, 2018 at 07:14:58PM -0400, Eric Sunshine wrote:
> On Sun, Apr 22, 2018 at 4:47 PM, Taylor Blau <[email protected]> wrote:
> > In a subsequent patch, we teach show_line() to optionally include the
> > column number of the first match on each matched line.
> >
> > The regmatch_t involved in match_line() and match_one_pattern() both
> > contain this information (via regmatch_t->rm_so), but their current
> > implementation throws this stack variable away at the end of the call.
> >
> > Instead, let's teach match_line() to take in a 'regmatch_t *' so that
> > callers can inspect the result of their calls. This will prove useful in
> > a subsequent commit when a caller will forward on information from the
> > regmatch_t into show_line (as described above).
> >
> > The return condition remains unchanged, therefore the only change
> > required of callers is the addition of a single argument.
>
> Is 'rm_so' the only piece of information which callers will ever want
> to extract from 'regmatch_t'? If so, this patch's approach might be
> overly broad, unnecessarily exposing too much of match_lines()'s
> internal implementation. An alternative would be to narrow the
> interface and limit exposure by passing in an 'int *matched_col' or
> some such.
>
> (Not necessarily a big deal, but something to consider.)
Thanks; I do not think that this is overly broad, although I suppose
that it does bind us to the implementation of regmatch_t. The definition
of regmatch_t, however, is fairly minimal:
typedef struct
{
regoff_t rm_so; /* Byte offset from string's start to
substring's start. */
regoff_t rm_eo; /* Byte offset from string's start to
substring's end. */
} regmatch_t;
I can imagine that other callers would be interested in the other piece
of information, 'rm_eo', as well. Therefore, I think it makes sense to
provide access to it along with 'rm_so'.
I think that if we are concerned with being bound to regmatch_t, then we
could typedef a new struct that contains a 'start' and 'end' unsigned,
but I think that this extra effort would not be fruitful, since the
majority of grep.c assumes regmatch_t (and friends).
> > Signed-off-by: Taylor Blau <[email protected]>
> > ---
> > diff --git a/grep.c b/grep.c
> > @@ -1299,17 +1299,17 @@ static int match_expr(struct grep_opt *opt, char
> > *bol, char *eol,
> > static int match_line(struct grep_opt *opt, char *bol, char *eol,
> > - enum grep_context ctx, int collect_hits)
> > + regmatch_t *match, enum grep_context ctx,
> > + int collect_hits)
> > {
> > struct grep_pat *p;
> > - regmatch_t match;
> >
> > if (opt->extended)
> > return match_expr(opt, bol, eol, ctx, collect_hits);
>
> The new 'match' argument has no impact in the 'opt->extended' case.
> Perhaps this deserves calling out in the commit message.
I have noted this in the commit message, which I'll include in the next
re-roll.
Thanks,
Taylor