None of those issues are specific to awk; they apply just as well to sed(1) or any program dealing with regexps. I think the plan9 tools demonstrate that it is not so hard to find a 'good enough' solution, and the lunix locale debacle demonstrates that if you want to get it 'right' you will end up with a nightmare.

The problem with awk is that it is not a native plan9 app, and its simian nature shows in too many places. For example, system() and | are badly broken; they hand the command to ksh rather than rc:

    % echo |awk '{print |"echo $KSH_VERSION"}'
    @(#)PD KSH v5.2.14 99/07/13.2

Boyd made a native port of awk that fixed most (all?) of these issues. It can be found somewhere in his contrib dir, but I don't think it is production-ready.

uriel

On Wed, Feb 27, 2008 at 4:54 PM, Sape Mullender <[EMAIL PROTECTED]> wrote:
> > There is split and other functions,
> > for example:
> >
> > toupper("aí")
> > gives
> > Aí
> >
> > My guess is that there are many more little (or not) corners where it
> > doesn't work.
>
> Yes, and then there is locale: does [a-z] include ij when you run it
> in Holland (it should)? Does it include á, è, ô in France (it should)?
> Does it include ø, å in Norway (it should not)? And what happens when
> you evaluate "è" < "o" (it depends)?
>
> Fixing awk is much harder than anyone thinks. I had a chat about it with
> Brian Kernighan and he says he's been thinking about fixing awk for a
> long time, but that it really is a hard problem.
>
> Sape
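
For what it's worth, the toupper("aí") result quoted above is what you would expect from any case conversion that walks the UTF-8 bytes and only knows ASCII. Here is a minimal Python sketch of that, assuming this is roughly what happens inside awk (it is not awk's actual code, and both function names are made up for illustration). It also shows the plain code point comparison behind the "è" < "o" question.

```python
# Illustration only: these helpers are hypothetical and are not taken
# from any awk source. They contrast two ways of uppercasing a string.

def toupper_bytes(s: str) -> str:
    """Uppercase the UTF-8 encoding byte by byte, touching only ASCII a-z."""
    raw = s.encode("utf-8")
    out = bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in raw)
    return out.decode("utf-8")


def toupper_runes(s: str) -> str:
    """Uppercase code point by code point using Unicode case mappings."""
    return s.upper()


if __name__ == "__main__":
    word = "a\u00ed"              # "aí" with a precomposed í (U+00ED)
    print(toupper_bytes(word))    # Aí  (the í is left alone)
    print(toupper_runes(word))    # AÍ
    # Python compares strings by code point, with no locale involved:
    print("\u00e8" < "o")         # False, since è is U+00E8 and o is U+006F
```

A locale-aware collation would most likely sort è next to e and give the opposite answer for the last line, which is the "it depends" in the quoted message.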