I want to point out that nobody should be using non-validating UTF-8 converters; not only are they not conformant to the Unicode standard, they also represent a security risk.
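
To make the difference concrete, here is a minimal sketch in Python using two of the byte sequences from the quoted message. `lax_decode` is a hypothetical non-validating decoder written only for illustration (it handles just ASCII and two-byte sequences), and `strict_ok` is a hypothetical helper wrapping Python's built-in strict UTF-8 codec. The `\xc0\xaf` case is not from the thread; it is the well-known overlong encoding of "/", included as one example of why skipping validation can be a security problem.

```python
def lax_decode(data: bytes) -> str:
    """Hypothetical non-validating decoder (ASCII and 2-byte sequences only).

    It masks the payload bits without checking that the second byte
    really is a continuation byte (10xxxxxx) or that the result is
    the shortest possible encoding.
    """
    out = []
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            out.append(chr(b0))
            i += 1
        elif 0xC0 <= b0 < 0xE0 and i + 1 < len(data):
            out.append(chr(((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise NotImplementedError("sketch covers 1- and 2-byte forms only")
    return "".join(out)


def strict_ok(data: bytes) -> bool:
    """Return True if Python's validating UTF-8 codec accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Correct encodings are accepted by the validating decoder.
print(strict_ok(b"\xc3\x9f"), strict_ok(b"\xc7\xb0"))

# The hand-converted sequences are rejected by the validating decoder,
# yet the non-validating one still yields U+00DF and U+01F0.
print(strict_ok(b"\xc3\xdf"), hex(ord(lax_decode(b"\xc3\xdf"))))
print(strict_ok(b"\xc7\xf0"), hex(ord(lax_decode(b"\xc7\xf0"))))

# Overlong form of "/": rejected when validating, accepted when not.
print(strict_ok(b"\xc0\xaf"), lax_decode(b"\xc0\xaf"))
```

A filter that scans the raw bytes for "/" before decoding would miss `\xc0\xaf`, while a non-validating decoder later in the pipeline would turn it into "/"; a validating decoder rejects the input outright.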

Mark (ᛗᚪᚱᚳ)
________
[EMAIL PROTECTED]
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message -----
From: "Simon Josefsson" <[EMAIL PROTECTED]>
To: "Martin v. Löwis" <[EMAIL PROTECTED]>
Cc: "Paul Hoffman / IMC" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, March 28, 2003 21:06
Subject: [idn] Re: Fwd: I-D ACTION:draft-josefsson-idn-test-vectors-00.txt

> [EMAIL PROTECTED] (Martin v. Löwis) writes:
>
> > Paul Hoffman / IMC <[EMAIL PROTECTED]> writes:
> >
> >> There is a new draft that is of very direct interest to this mailing
> >> list. It would be great if people who have implemented IDNA could
> >> check their results against this draft.
> >
> > Please correct me if I'm wrong: I believe the UTF-8 strings are wrong
> > in a number of tests:
> >
> > 4.3: U+00DF should be \xc3\x9f, not \xc3\xdf
> > 4.9: U+01F0 should be \xc7\xb0, not \xc7\xf0
> > 4.44: U+00DF should be \xc3\x9f, not \xc3\xdf
> > 4.45: Likewise.
>
> You are right. I thought I could convert simple strings by hand, but
> obviously I didn't select the proper UTF-8 encoding (in some cases).
> I believe you catched all errors. Since those examples test whether
> the application uses a validating UTF-8 decoder, I will keep those
> examples modified to result in an UTF-8 decoding error.
> Non-validating UTF-8 decoders should produce the correct Unicode code
> point for those UTF-8 encodings though.
