On Sat, 3 Nov 2001, Eli Zaretskii wrote:

> > ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO-4.html
>
> This is still silent about Grep, Sort, and tr, which are
> the utilities where the non-ASCII support should be a non-trivial
> change.
>
> Basically, even after reading that page (which told me something I
> didn't know in some cases), Unicode support in basic development
> tools is still very much rudimentary.

In practice, Perl long ago replaced grep, sort, tr, and awk for all but
sentimental reasons. Most of these silly little tools were written as
inefficient separate C processes before 1975 for the sole reason that
the PDP-11 that Ritchie and Thompson used had only 64 kB of RAM and
couldn't handle any larger multi-function tools:

  http://www.bell-labs.com/history/unix/
  http://www.bell-labs.com/history/unix/firstport.html

Today, these tiny tools mostly lead people to write extremely
inefficient shell scripts that spend 90% of their time in fork().

UTF-8 support for Perl is in an advanced state, and for some more
experienced UTF-8 users, "grep", "sort", "tr", etc. are merely
convenient and nostalgic shell functions or scripts that call perl to
do the job.

[I sometimes wish we could give up the classic Bourne-style shell with
its baroque Algol-inspired syntax entirely, and that perl had the few
facilities (e.g., prompts, readline history, compact
command-invocation/argv/piping/redirecting notation, etc.) that are
still missing before we can turn it into the main command-line shell.]

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>