In this case, yes indeed: my mblen() function, posted some days ago, could be used to keep cut from displaying truncated character sequences.
The real problem with Unicode is UTF-16, whose encoded text contains \0 bytes (but that is another, uncommon problem).

2014-08-13 15:17 GMT+02:00 Harald Becker <ra...@gmx.de>:

> Hi !
>
> > if cut fields supports strings bigger than a single char, there
> > should be no problem, the serie is found in input text.
>
> $ echo -n äöü | hd
> 00000000 c3 a4 c3 b6 c3 bc
>
> $ echo -n äöü | cut -c1 | hd
> 00000000 c3 0a
>
> $ echo -n äöü | cut -c2 | hd
> 00000000 a4 0a
>
> This shows the position given with cut -c does not pick the correct
> character. BB same as upstream.
>
> cut has a -b option to specify the byte position, but -c is called to use
> character positions. So I expect either -c1 (when counted from zero) or -c2
> (when counted from one) to omit the "ö" (oumlaut) from the input text.
>
> --
> Harald
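To make the difference between byte positions and character positions concrete, here is a minimal sketch of selecting the n-th character of a UTF-8 string. It assumes the input is UTF-8. The names utf8_seq_len() and utf8_nth_char() are hypothetical, chosen only for illustration: this is neither the mblen() function mentioned above nor BusyBox code. It trusts the lead byte and does not validate continuation bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): byte length of the UTF-8
 * sequence that starts at s, judged from the lead byte alone.
 * ASCII and invalid lead bytes count as 1 so a scan always advances. */
static size_t utf8_seq_len(const unsigned char *s)
{
	if (*s < 0x80)
		return 1;               /* 0xxxxxxx: ASCII */
	if ((*s & 0xE0) == 0xC0)
		return 2;               /* 110xxxxx */
	if ((*s & 0xF0) == 0xE0)
		return 3;               /* 1110xxxx */
	if ((*s & 0xF8) == 0xF0)
		return 4;               /* 11110xxx */
	return 1;                       /* stray continuation or invalid byte */
}

/* Copy the n-th character of s (counted from one, like cut -c) into out
 * and NUL-terminate it. Returns the number of bytes copied, or 0 if s
 * has fewer than n characters or out is too small. */
static size_t utf8_nth_char(const char *s, size_t n, char *out, size_t outsz)
{
	const unsigned char *p;
	size_t remaining, idx = 1;

	assert(s != NULL && out != NULL);
	p = (const unsigned char *)s;
	remaining = strlen(s);

	while (remaining > 0) {
		size_t len = utf8_seq_len(p);

		if (len > remaining)
			len = remaining;    /* truncated sequence at end of input */
		if (idx == n) {
			if (len + 1 > outsz)
				return 0;
			memcpy(out, p, len);
			out[len] = '\0';
			return len;
		}
		p += len;
		remaining -= len;
		idx++;
	}
	return 0;
}
```

Called on the bytes c3 a4 c3 b6 c3 bc with n = 2, this copies the two bytes c3 b6 (the "ö") rather than the lone a4 byte that cut -c2 printed in the quoted example.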
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox