On Tue, Feb 26, 2008 at 9:24 PM, erik quanstrom <[EMAIL PROTECTED]> wrote:
>
>  i think the comments about this problem are missing the point
>  a bit.  utf8 should be transparent to awk unless the situation demands

No. It is not transparent at all. It is only semi-transparent, because someone
did the job partway, and because of that I have been bitten hard by it in
several situations (I am not complaining, just saying that this may not be the
right approach to take in the future).

What someone did is make it so:
/a.j/
matches
a☺j
Because someone fixed the regexp part of awk, it already understands this,
which made me think (falsely) at first that it works, and conned me
into the bug.
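
To make the difference concrete, here is a rough sketch in Python (not awk,
and not awk's actual regexp code) of how the same pattern behaves when the
input is treated as characters versus as raw UTF-8 bytes. The helper names
matches_as_chars and matches_as_bytes are made up for this illustration.

```python
import re

# The string a☺j. The smiley U+263A is one character but three UTF-8 bytes.
SMILEY = "a\u263aj"

def matches_as_chars(pattern: str, text: str) -> bool:
    """Match with '.' meaning one character (rune)."""
    return re.search(pattern, text) is not None

def matches_as_bytes(pattern: str, text: str) -> bool:
    """Match with '.' meaning one byte of the UTF-8 encoding."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None

print(matches_as_chars("a.j", SMILEY))    # True: '.' covers the whole smiley
print(matches_as_bytes("a.j", SMILEY))    # False: one byte cannot cover three
print(matches_as_bytes("a...j", SMILEY))  # True: three '.' for three bytes
```

A regexp engine that matches per character gives the first result, which is
what makes everything look fine until a byte-oriented function is involved.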

But there are split and other functions that do not.
For example:

toupper("aí")
gives
Aí
with the í left in lower case.
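
A minimal sketch of what I assume is going on, again in Python and not taken
from the awk source: a toupper that walks the string one byte at a time and
only touches ASCII a-z never sees the í, because in UTF-8 it is the two bytes
0xC3 0xAD. Both function names below are hypothetical.

```python
def bytewise_upper(s: str) -> str:
    """Upper-case only the ASCII bytes a-z, as a byte-oriented toupper might."""
    raw = s.encode("utf-8")
    out = bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in raw)
    return out.decode("utf-8")

def rune_upper(s: str) -> str:
    """Upper-case per character, so non-ASCII letters are handled too."""
    return s.upper()

word = "a\u00ed"  # the string aí, with í as the single code point U+00ED

print(bytewise_upper(word))  # Aí
print(rune_upper(word))      # AÍ
```

The first line of output is the behaviour described above; the second is what
a rune-aware toupper would be expected to give.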

My guess is that there are many more corners, little or not, where it
doesn't work.
We can go on and on looking for crevices, hiding the bugs further under the
rug so that they are not evident and catch everyone completely unaware; we
can leave awk as it is now; or we can really fix the problem. The first
approach doesn't work. I am going to take the second until I have time to
take the third, which means using runes, or at least revising all the code
so that it is uniformly aware of the existence of non-ASCII characters.
-- 
- curiosity sKilled the cat
