On Sun, Jul 17, 2011 at 04:48:59PM -0400, Ted Unangst wrote:
> On Sun, Jul 17, 2011, Matthias Kilian wrote:
>
> > Then those ports should be fixed. There seem to be more GNUisms in
> > (recent?) GNU grep that are picked up by projects, for example the
> > use of \s and \S in pxltoraster (currently a disabled part of
> > ghostscript, for which I've got a diff and waiting for some more
> > test results).
> >
> > I understand that \<...\> is quite convenient, but where's the line
> > between convenience and feature bloat?
>
> ooo, maybe I can add \s too. :)
>
> I don't know that there's a good answer to give here. I even think a
> little about putting such things in the libc regcomp, but that seems
> somewhat riskier. Then again, to quote the re_format man page, "The
> syntax for word boundaries is incredibly ugly."
>
> I don't think we really want to emulate all of pcre necessarily, but
> that is what people think of when they here "you can enter a regular
> expression here" because all the extra \escapes are what's offered by
> pcre/perl/python/ruby/javascript/you name it. And they are mostly
> backwards compatible with extended REs.
>
> posix does say "The interpretation of an ordinary character preceded by
> a backslash ( '\' ) is undefined." for both BREs and EREs, so adding
> additional \escapes cannot cause trouble for a properly written regex.
>
> Fun fact about posix: It doesn't specify [[:<:]] or -w. So a 100%
> posix grep is incapable of matching word boundaries at all. I can hear
> the screaming now if somebody proposed being strictly conformant.
>
> Regular expressions are a serious shortcoming in posix. EREs don't even
> have backrefs, you have to use dinosaur syntax BREs. How silly is that?

none of this reads like a good reason to add it though. a mechanism for
detecting word boundaries already exists, and the closest thing there is
to a law (posix me bad!) doesn't give a flying fuck.

regarding your fun fact: just because posix doesn't support something,
it doesn't make an attempt to adhere to posix inherently wrong.

i'm inclined to say that if we support it we should document it. and in
this case that means adding (and documenting) this in the libc stuff,
not grep.

jmc