On Mon, Feb 17, 2014 at 6:18 AM, Paolo Bonzini <[email protected]> wrote:
> The correct course of action for grep is to defer range interpretation
> to regex, because otherwise you can get mismatches between regexes with
> backreferences and those without.
>
> For example, [A-Z]. will use RRI but ([A-Z])\1 won't, with the confusing
> result that the first regex won't match a superset of the language
> described by the second regex.
>
> The source of the confusion is that, even though grep's dfa.c was changed
> to use range checking instead of strcoll, that code is only invoked if
> dfaexec is called with backref = NULL, and that never happens for grep!
>
> In the end, all that's needed for RRI is compiling --with-included-regex,
> and in that case the patch is almost a no-op.  Almost, because there
> are corner cases that aren't handled correctly (e.g. [a-[.e.]], or
> regular expressions that include a NUL character), but this can be
> handled separately.
>
> * NEWS: Revert paragraph introduced by commit 1078b64302.
> * src/dfa.c (parse_bracket_exp): Revert back to regcomp/regexec.
>
> Signed-off-by: Paolo Bonzini <[email protected]>

Thanks. I have applied that (and pushed) with two log message changes: I removed the Signed-off-by line (redundant when same as "Author:"), and replaced 1078b64302 with v2.16-7-g1078b64.
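
For anyone who wants to probe the superset property described in the quoted message, here is a rough sketch. It is not grep's code: it calls the system's POSIX regcomp/regexec directly instead of going through dfa.c, and the helper names bre_matches and superset_holds are made up for illustration. It never calls setlocale, so it runs in the "C" locale, where [A-Z] is a plain codepoint range; results in other locales depend on the regex implementation in use.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the POSIX basic regular expression
   PATTERN matches anywhere in TEXT, 0 if it does not match, and -1 if
   the pattern fails to compile.  */
static int
bre_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);
  if (regcomp (&re, pattern, 0) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Hypothetical helper: return 1 if the superset property holds for
   TEXT, i.e. whenever the backreference form \([A-Z]\)\1 matches,
   the plain form [A-Z]. matches too.  Return 0 if it is violated.  */
static int
superset_holds (const char *text)
{
  int with_backref = bre_matches ("\\([A-Z]\\)\\1", text);
  int without_backref = bre_matches ("[A-Z].", text);

  return with_backref != 1 || without_backref == 1;
}
```

Calling superset_holds on a handful of sample strings under different locales (after adding a setlocale call) is one way to check that both patterns agree on what [A-Z] means. Since both patterns go through the same matcher here, a violation would point at the regex library itself and not at the dfa.c/regex split discussed above.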
