[email protected] wrote:
> The bugaboo here is the "---"; it's
> a range expression consisting of minus through minus, and apparently long
> ago was how one got a minus into a bracket expression.
Actually, long ago expressions like '[^0-9-]' worked just as they do now, and it
was never necessary to use a trailing "---". That being said, it is true that
in 7th Edition Unix '[^0-9---]' meant the same thing as '[^0-9-]', so in that
sense we have an incompatibility with 7th Edition Unix here.
> $ ./src/grep '[^0-9---]' /dev/null
> ./src/grep: Invalid range end
> The underlying regex and, I believe, dfa routines don't accept this.
Yes, that's correct. It's not a bug, though, as the regexp is ambiguous and
does not conform to POSIX, which says the following about RE bracket
expressions: "To use a <hyphen> as the starting range point, it shall either
come first in the bracket expression or be specified as a collating symbol; for
example, "[][.-.]-0]", which matches either a <right-square-bracket> or any
character or collating element that collates between <hyphen> and 0, inclusive."
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05>
In your correspondent's example, the hyphen is a starting range point but is
neither first in the bracket expression nor is specified as a collating symbol,
so the regexp doesn't conform to POSIX.
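For comparison, here is a small sketch of two spellings that should be unambiguous, since a hyphen that comes first or last in a bracket expression is taken literally. The sample input and the variable names are made up for illustration, and plain "grep" stands in for whichever grep is on your PATH:

```shell
# Hypothetical sample input: one line of only digits and hyphens,
# and one line that also contains a letter.
input=$(printf '12-34\n12a34\n')

# Hyphen last in the bracket expression: treated as a literal hyphen.
last=$(printf '%s\n' "$input" | grep '[^0-9-]')

# Hyphen first (right after the negating ^): also a literal hyphen.
first=$(printf '%s\n' "$input" | grep '[^-0-9]')

printf 'hyphen last:  %s\n' "$last"
printf 'hyphen first: %s\n' "$first"
```

Both commands should select only the line containing a character that is neither a digit nor a hyphen, without relying on the "---" range.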
Even though it's not a bug, I suppose it wouldn't hurt to make the GNU matchers
compatible with 7th Edition Unix here, if someone really wants to take that task
on; it's not urgent, though.