-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Patrik Hirvinen on 7/9/2006 8:48 AM:
> Hi,
>
> This bug was found on an Ubuntu 5.10 GNU/Linux x86 using cut version
> 5.2.1. Locale used was en_US.UTF-8.
>
> When fed text that includes multi-byte characters, cut makes the
> assumption that one byte corresponds to one character, even though the
> locale would clearly suggest otherwise.
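
To make the reported behavior concrete, here is a minimal sketch in Python of the difference between selecting by byte and selecting by character in a UTF-8 locale. It is not coreutils code: the helper names cut_bytes and cut_chars are hypothetical and exist only for this illustration, and "character" here means a Unicode code point, which is an assumption that ignores combining sequences.

```python
# Hypothetical helpers for illustration only; not taken from coreutils.

def cut_bytes(line: str, start: int, end: int, encoding: str = "utf-8") -> bytes:
    """Select positions start..end (1-based, inclusive), counting bytes."""
    return line.encode(encoding)[start - 1:end]


def cut_chars(line: str, start: int, end: int) -> str:
    """Select positions start..end (1-based, inclusive), counting code points."""
    return line[start - 1:end]


# Three characters (a-umlaut, o-umlaut, u-umlaut), six bytes in UTF-8.
sample = "\u00e4\u00f6\u00fc"

by_byte = cut_bytes(sample, 1, 1)  # first byte only: half of a two-byte sequence
by_char = cut_chars(sample, 1, 1)  # first character, whole

print(len(sample), len(sample.encode("utf-8")))
print(by_byte)
print(by_char)
```

Selecting "position 1" by byte returns a lone lead byte that is not valid UTF-8 on its own, while selecting by character returns the complete first letter. A tool that equates the two will split multi-byte characters in exactly the way the report describes.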

Unfortunately, no one has yet submitted a clean implementation of
multi-byte handling to upstream coreutils, so it is a known deficiency
that the bulk of coreutils' text utilities do not understand multibyte
characters. Would you care to help by writing a patch? If so, use this
list as a springboard for discussion; Jim already has several
requirements for what a multibyte implementation must do before he will
incorporate it.

- --
Life is short - so eat dessert first!

Eric Blake [EMAIL PROTECTED]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEskZH84KuGfSFAYARAiVdAJ9dzP+3EBD/e8Ng03+RyBrLnUGjQQCfXlKA
fjBSqZvJwrIc99Bu2wAYkI0=
=CmWn
-----END PGP SIGNATURE-----

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils