Pavel Volkov wrote:
> Brian wrote:
> > The call for release goals has finished and we have received the
> > following proposals:
> >
> > * UTF-8
>
> What's wrong with UTF-8 currently?

fmt: incorrect formatting of UTF-8 text
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=650381

tr: fails to replace umlauts
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=388689

tr fails with UTF-8
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=431231

_CTYPE with UTF-8 doesn't work correctly
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=139861

tr cannot handle unicode
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=613155

uniq: merges obscure Cyrillic characters
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649729

I am sure there are more. Coreutils is probably the worst off because
all of the patches to address the problem have been deemed
unmaintainable and so have been rejected. For some reason tr seems to
get most of the attention, but all of the coreutils basically have the
same issue in that they handle only single-byte characters.

And along with coreutils there are bound to be other programs that are
similarly designed for single-byte characters.

Bob
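
To illustrate the single-byte issue described above, here is a minimal
sketch in Python. It is not coreutils code: the helper names tr_bytes
and tr_chars are made up for this example, and the padding of the
second set with its last element is an assumption modelled on how GNU
tr treats a shorter SET2. The byte-oriented version sees the two UTF-8
bytes of "ü" (0xC3 0xBC) as two separate characters, while the
character-oriented version sees one.

```python
def _pad(set1, set2):
    """Extend set2 to the length of set1 by repeating its last element
    (assumption: mirrors GNU tr when SET2 is shorter than SET1)."""
    padded = set2 + set2[-1:] * (len(set1) - len(set2))
    return padded[:len(set1)]


def tr_bytes(data: bytes, set1: bytes, set2: bytes) -> bytes:
    """Byte-oriented translate: every byte is treated as one character."""
    return data.translate(bytes.maketrans(set1, _pad(set1, set2)))


def tr_chars(text: str, set1: str, set2: str) -> str:
    """Character-oriented translate: works on decoded code points."""
    return text.translate(str.maketrans(set1, _pad(set1, set2)))


if __name__ == "__main__":
    word = "M\u00fcller"    # "Mueller" spelled with u-umlaut
    umlaut = "\u00fc"       # one code point, two bytes in UTF-8

    # Byte-wise: both bytes of the umlaut are mapped to "u" separately.
    print(tr_bytes(word.encode("utf-8"), umlaut.encode("utf-8"), b"u"))
    # prints b'Muuller'

    # Character-wise: the umlaut is replaced as a single unit.
    print(tr_chars(word, umlaut, "u"))
    # prints Muller
```

The same mismatch explains why a byte-oriented tool can split or
miscount multibyte characters when it wraps, compares, or replaces
text.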