Heavy grep users may also be interested in "ack".

http://search.cpan.org/dist/ack/ack

Adam K

Malcolm Johnston wrote:
Most of us know that, way back when Ken Thompson still had a black beard, there were three basic versions of "grep", selected by a prefix or by flags, but no fully integrated version of the tool that would work as quickly as the three separate ones. That distinction, I think, went by the board some time ago, what with faster processors, DFA-type algorithms and the like. Now we seem to have mostly one binary, copied to its various destinations by the squanderers, or symlinked by the thrifty. What the hell! It's all gotten so much bigger and faster, so why bother: the toolbox approach was all right for tradesmen, who actually had toolboxes, but for the rest....

I discovered this a decade or so ago, when an out-of-the-box distribution's "grep" ran very significantly more slowly than equivalent pattern-matchers in "awk" and "perl". The problem was easy enough to fix: it just involved resetting the "$LANG" variable in the shell to "C" or "POSIX". The current "en_US" setting produces a much more attenuated version of the problem described above, and isn't worth worrying about unless, as I do (I'm a linguist), you use "*grep" repetitively, where it surges once more into prominence. The actual culprit is the as-shipped "fgrep", which has a very curious conception of what a word is unless it is operating in the right locale. I haven't bothered to localize this exactly, but I know from "strace" that many processes do a fair bit of locale-checking on their way to execution. Given that English as a mother tongue is the fourth-most spoken language on the planet, and as a second language (in many cases in a semi-bilingual setting) is spoken by more than 1 billion people, a great many of whom do not speak or write American dialects of English, maybe the developers of "*grep" should take this into account.
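
For anyone who wants to try the locale workaround without changing "$LANG" for the whole session, here is a minimal sketch. The wrapper name "cgrep" and the file name "big.txt" are made up for illustration; "LC_ALL" is used because it overrides "$LANG" and the other "LC_*" variables for that one command. Whether it makes any measurable difference depends on your grep version and data.

```shell
# Hypothetical wrapper: run grep under the plain "C" locale for one command
# only, leaving the locale of the surrounding shell session untouched.
cgrep() {
    LC_ALL=C grep "$@"
}

# Rough comparison on a large file of your own (big.txt is a placeholder):
#   time grep  -c 'pattern' big.txt
#   time cgrep -c 'pattern' big.txt
#
# The same wrapper works for fixed-string search lists, e.g.:
#   cgrep -F -f wordlist.txt big.txt
```

Note that the "C" locale treats input as plain bytes, so case-insensitive matching and character classes may behave differently on non-ASCII text.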

I personally solved the problem by replacing my symlinked "fgrep" with a far older (yet fully functional) version. Maybe I should forward this as a bug report, although it's been a bug for years. Maybe we should all talk POSIX (I have certain professional doubts about that). Search lists (the "-f" option) are not, I think, behaving nicely.

Cheers,
Malcolm Johnston
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
