Bug#555922: libc6: UTF-8 decoding is not conforming to the Unicode standard

2009-11-12 Thread Dmitri Gribenko
Hi,

Here's another example, a rather nasty one. find(1) also uses libc's regexps:

$ mkdir test
$ cd test
$ ls
$ touch $(printf 'aaa\x80bbb') aaa1bbb
$ ls
aaa1bbb aaa?bbb
$ find . -regex '^.+bbb'
./aaa1bbb

'aaa\x80bbb' not found.

Best regards,
Dmitri Gribenko

--
main(i,j){for(i=2;;i++){for(j

Bug#555922: libc6: UTF-8 decoding is not conforming to the Unicode standard

2009-11-12 Thread Dmitri Gribenko
Package: libc6
Version: 2.10.1-5
Severity: normal

libc's decoding of UTF-8 does not conform to the Unicode standard. In particular, it processes:
* 5- and 6-byte sequences, which are not defined in the Unicode standard.
* 4-byte sequences that decode to code points above U+10FFFF.
* surrogates
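
For comparison, here is a minimal sketch of what a conforming check could look like, following the well-formedness rules of the Unicode standard (and RFC 3629). It is not glibc's code; the function name is_well_formed_utf8 is made up for this illustration. It rejects the cases listed above (5- and 6-byte sequences, code points above U+10FFFF, surrogates U+D800..U+DFFF) as well as overlong forms and stray continuation bytes such as the \x80 in the find(1) example.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (hypothetical helper, not a libc API)."""
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 <= 0x7F:
            # 1-byte sequence: U+0000..U+007F
            i += 1
            continue
        if 0xC2 <= b0 <= 0xDF:
            # 2-byte lead; 0xC0 and 0xC1 could only start overlong forms
            length, cp, minimum = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            # 3-byte lead
            length, cp, minimum = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF4:
            # 4-byte lead; 0xF5..0xF7 would always exceed U+10FFFF
            length, cp, minimum = 4, b0 & 0x07, 0x10000
        else:
            # Stray continuation byte (0x80..0xBF), 0xC0/0xC1,
            # or 0xF5..0xFF, which includes the 5- and 6-byte leads
            return False
        if i + length > n:
            return False  # sequence truncated at end of input
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                return False  # not a continuation byte
            cp = (cp << 6) | (b & 0x3F)
        if cp < minimum:
            return False  # overlong encoding
        if 0xD800 <= cp <= 0xDFFF:
            return False  # surrogate code point
        if cp > 0x10FFFF:
            return False  # beyond the Unicode code space
        i += length
    return True
```

With such a check, the file name from the find(1) example, b'aaa\x80bbb', is reported as ill-formed, while the 4-byte encoding of U+10FFFF (F4 8F BF BF) is accepted. How a regex engine should then treat an ill-formed byte (for example, as a single unmatched unit) is a separate design decision that this sketch does not address.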