2013/1/22 Peter A. Shevtsov <petr.shevt...@gmail.com>:
> On 22/01/13 at 02:32pm, Peter A. Shevtsov wrote:
>
>> It seems that it counts every Cyrillic letter as two, i.e. it counts
>> bytes rather than letters (or runes).
>
> Indeed,
>
> echo latin кириллица | /usr/local/plan9/bin/awk '{printf("%d %d\n", length($1), length($2))}'
>
> 5 18
>

Also, awk can't know beforehand whether the input string is UTF-8
encoded, so the only thing it can do is count bytes.
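
To make the byte/character difference concrete, here is a minimal sketch
in Python (not awk, and not how any awk implements length()). It assumes
the input is UTF-8; the helper name byte_and_char_lengths is made up for
illustration.

```python
def byte_and_char_lengths(word: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, code point count) for a word.

    Hypothetical helper for illustration only.
    """
    return len(word.encode("utf-8")), len(word)


for word in "latin кириллица".split():
    n_bytes, n_chars = byte_and_char_lengths(word)
    print(word, n_bytes, n_chars)

# The same bytes give a different "character" count under another
# encoding, so the count depends on which encoding is assumed.
raw = "кириллица".encode("utf-8")
print(len(raw.decode("utf-8")), len(raw.decode("latin-1")))
```

Each Cyrillic letter here takes two bytes in UTF-8, which matches the
"5 18" output quoted above. Decoding the same bytes as Latin-1 yields one
character per byte, so a tool that is not told the encoding has no
reliable way to pick between the two counts.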
