I've seen several implementations that use the mbtowc functions to test for
some special characters; in my opinion this is not correct for UTF-8.

Counting the number of UTF-8 characters is really simple: just count all bytes except those with a value in the range 0x80 to 0xBF, which are the continuation bytes of multi-byte sequences. This has two exceptions, 0xFE and 0xFF, which are not valid UTF-8 bytes, but I think it's not wrong to count them and treat them as characters.


Counting can be done with one logical and one compare instruction, provided c holds the byte as an unsigned value (0 to 255):

if ((c ^ 0x40) < 0xC0) n++;
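
For illustration, here is a minimal sketch of that test wrapped in a counting loop. The function name utf8_count is hypothetical and not taken from busybox, and the sketch assumes a NUL-terminated input string. The cast to unsigned char matters: where plain char is signed, bytes above 0x7F would be promoted to negative ints and every one of them would pass the comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from busybox): counts UTF-8 characters in a
 * NUL-terminated string by counting every byte that is not a
 * continuation byte (0x80..0xBF). It does not validate the input. */
static size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Force the byte into the range 0..255 before testing it. */
        unsigned char c = (unsigned char)*s;

        /* XOR with 0x40 maps 0x80..0xBF onto 0xC0..0xFF and every
         * other byte below 0xC0, so one compare is enough. */
        if ((c ^ 0x40) < 0xC0)
            n++;
    }
    return n;
}
```

The same set of bytes is selected by the more common form (c & 0xC0) != 0x80, which also costs one logical and one compare instruction.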


_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
