None of those issues are specific to AWK; they apply just as well to
sed(1) or any other program dealing with regexps. I think the plan9
tools demonstrate that it is not so hard to find a 'good enough'
solution, and the lunix locale debacle demonstrates that if you want to
get it 'right' you will end up with a nightmare.
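
To make 'good enough' concrete, here is a rough sketch of the idea as I
understand it: work on code points instead of bytes and never consult a
locale. It is written in Python purely for illustration and is not the
plan9 code; the function names are made up for this example.

```python
# Illustrative sketch only. The function names are hypothetical and this
# is not how any existing awk or plan9 tool is implemented.

def toupper_bytes(s: str) -> str:
    """Uppercase byte by byte, touching only ASCII a-z.

    This mimics a byte-oriented toupper(): bytes outside ASCII are left
    alone, so accented letters come out unchanged.
    """
    raw = bytearray(s.encode("utf-8"))
    for i, b in enumerate(raw):
        if 0x61 <= b <= 0x7A:        # ASCII 'a'..'z'
            raw[i] = b - 0x20        # shift to 'A'..'Z'
    return raw.decode("utf-8")


def toupper_runes(s: str) -> str:
    """Uppercase one code point at a time, with no locale lookup."""
    return "".join(ch.upper() for ch in s)


def in_range(ch: str, lo: str, hi: str) -> bool:
    """Treat [lo-hi] as a plain code point range, the same everywhere."""
    return ord(lo) <= ord(ch) <= ord(hi)
```

With the string from the toupper() example quoted below,
toupper_bytes("a\u00ed") leaves the í alone while
toupper_runes("a\u00ed") uppercases both letters, and
in_range("\u00ed", "a", "z") is false no matter which country you run
it in. It is not 'right' for every language, but it is predictable.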
The problem with awk is that it is not a native plan9 app, and its
simian nature shows in too many places. For example, system() and | are
badly broken; both run their commands under ksh rather than rc:
% echo |awk '{print |"echo $KSH_VERSION"}'
@(#)PD KSH v5.2.14 99/07/13.2
Boyd made a native port of awk that fixed most (all?) of these issues.
It can be found somewhere in his contrib dir, but I don't think it is
production-ready.
uriel
On Wed, Feb 27, 2008 at 4:54 PM, Sape Mullender
<[EMAIL PROTECTED]> wrote:
> > There is split and other functions,
> > for example:
> >
> > toupper("aí")
> > gives
> > Aí
> >
> > My guess is that there are many more little (or not) corners where it
> > doesn't work.
>
> Yes, and then there is locale: does [a-z] include ij when you run it
> in Holland (it should)? Does it include á, è, ô in France (it should)?
> Does it include ø, å in Norway (it should not)? And what happens when
> you evaluate "è" < "o" (it depends)?
>
> Fixing awk is much harder than anyone thinks. I had a chat about it with
> Brian Kernighan and he says he's been thinking about fixing awk for a
> long time, but that it really is a hard problem.
>
> Sape