2011/5/16 Boris Zbarsky <bzbar...@mit.edu>:
> On 5/16/11 4:37 PM, Mike Samuel wrote:
>>
>> You might have.  If you reject my assertion about option 2 above, then
>> to clarify,
>> The UTF-16 representation of codepoint U+10000 is the code-unit pair
>> U+D8000 U+DC000.
>
> No.  The UTF-16 representation of codepoint U+10000 is the code-unit pair
> 0xD800 0xDC00.  These are 16-bit unsigned integers, NOT Unicode characters
> (which is what the U+NNNNN notation means).

My apologies for abusing notation.

>> The UTF-16 representation of codepoint U+D8000 is the single code-unit
>> U+D8000 and similarly for U+DC00.
>
> I'm assuming you meant U+D800 in the first two code-units there.

yes

> There is no Unicode codepoint U+D800 or U+DC00.  See
> http://www.unicode.org/charts/PDF/UD800.pdf and
> http://www.unicode.org/charts/PDF/UDC00.pdf which clearly say that there are
> no Unicode characters with those codepoints.

Correct.  The strawman says "The String type is the set of all finite
ordered sequences of zero or more 21-bit unsigned integer values
(“elements”)."  There is no exclusion for invalid code-points, so I
was assuming when Allen talked about an encodeUTF16 function that he
was purposely fuzzing the term "codepoint" to include the entire
range, and that encodeUTF16(oneSupplemental).charCodeAt(0) === 0xd800.

>> How can the codepoints U+D800 U+DC00 be distinguished in a DOMString
>> implementation that uses UTF-16 under the hood from the codepoint
>> U+10000?
>
> They don't have to be; if 0xD800 0xDC00 are present (in that order) then
> they encode U+10000.  If they're present on their own, it's not a valid
> UTF-16 string, hence not a valid DOMString and some sort of error-handling
> behavior (which presumably needs defining) needs to take place.
>
> That said, defining JS strings and DOMString differently seems like a recipe
> for serious author confusion (e.g. actually using JS strings as the
> DOMString binding in ES might be lossy, assigning from JS strings to
> DOMString might be lossy, etc).  It's a minefield.

Agreed.  It is a minefield and one that could benefit from treatment
in the strawman.

> -Boris
_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
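
For reference, the surrogate arithmetic discussed in this thread can be sketched in a few lines of JavaScript. This is only an illustration: encodeCodePointUTF16 and isWellFormedUTF16 are hypothetical names made up here, not the encodeUTF16 function from the strawman, and passing lone surrogate values through unchanged is an assumption matching the "entire range" reading above rather than anything the strawman specifies.

```javascript
// Hypothetical helper (NOT the strawman's encodeUTF16): maps one code point
// to an array of 16-bit UTF-16 code units.
function encodeCodePointUTF16(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError('not a code point: ' + codePoint);
  }
  if (codePoint < 0x10000) {
    // Assumption: values below 0x10000, including the surrogate range
    // 0xD800-0xDFFF, pass through as a single code unit.
    return [codePoint];
  }
  const offset = codePoint - 0x10000;    // 20-bit value
  const high = 0xD800 + (offset >> 10);  // top 10 bits -> high surrogate
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits -> low surrogate
  return [high, low];
}

// Hypothetical check: true if the code units contain no unpaired surrogate.
function isWellFormedUTF16(units) {
  for (let i = 0; i < units.length; i++) {
    const u = units[i];
    if (u >= 0xD800 && u <= 0xDBFF) {
      const next = units[i + 1];
      if (next === undefined || next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (u >= 0xDC00 && u <= 0xDFFF) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

const pair = encodeCodePointUTF16(0x10000);
console.log(pair.map(u => '0x' + u.toString(16).toUpperCase())); // 0xD800, 0xDC00

// The two "codepoints" 0xD800 and 0xDC00, encoded one at a time, yield the
// same code units as U+10000, which is the ambiguity raised above.
const separate = encodeCodePointUTF16(0xD800).concat(encodeCodePointUTF16(0xDC00));
console.log(separate[0] === pair[0] && separate[1] === pair[1]); // true
console.log(isWellFormedUTF16([0xD800])); // false: lone surrogate
```

Under these assumptions a UTF-16 backed string cannot tell the sequence 0xD800, 0xDC00 apart from U+10000, while a lone 0xD800 is the case that would need the error-handling behavior Boris mentions.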