1/ According to ECMA-404, 1st edition / October 2013, a JSON text is a sequence of Unicode code points. The code points that can appear in a JSON string include all code points except the control characters (the text says U+0000 to U+001F, but the syntax diagram just says "control character", which in Unicode 6.3 also includes U+007F to U+009F). Therefore, the code point sequence <0022, DEAD, 0022> is a valid JSON text.
However, this code point sequence cannot be represented in UTF-8, UTF-16, or UTF-32, as it is not a sequence of Unicode scalar values, and Unicode encoding forms are only defined on Unicode scalar values.

2/ The unescaping of strings in JSON is ill-defined, as there are quoted JSON strings that are the escaped version of two different sequences of Unicode code points. For example, both <D834, DD1E> and <1D11E> can be represented as "\uD834\uDD1E".

Both of these appear to be bugs that should be fixed.

Peter F. Patel-Schneider
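
P.S. Both points can be checked with a short sketch. This is only an illustration, not a statement about what any JSON implementation must do. It assumes CPython 3.4 or later, whose str type can hold lone surrogate code points and whose UTF-8, UTF-16, and UTF-32 encoders reject them. The helper name `encodable` is made up for this example.

```python
import json


def encodable(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Point 1: the code point sequence <0022, DEAD, 0022>,
# i.e. a quoted JSON string containing one lone surrogate.
lone = '"\udead"'
results = {enc: encodable(lone, enc) for enc in ("utf-8", "utf-16", "utf-32")}
print(results)

# Point 2: two different code point sequences...
pair = "\ud834\udd1e"    # <D834, DD1E>: two surrogate code points
scalar = "\U0001d11e"    # <1D11E>: one supplementary code point
print(pair == scalar)

# ...whose escaped forms are identical in this implementation.
print(json.dumps(pair) == json.dumps(scalar))

# Unescaping "\uD834\uDD1E" has to pick one of the two readings;
# the length of the result shows which one this decoder picked.
print(len(json.loads('"\\uD834\\uDD1E"')))
```

In this implementation the lone surrogate is rejected by all three encoders, and the decoder resolves the ambiguous escape by combining the pair into <1D11E>. ECMA-404 itself does not say which reading is intended.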