> On 7 dec 2014, at 19:05, John Cowan <co...@mercury.ccil.org> wrote:
> 
> Patrik Fältström scripsit:
> 
>> But it also reference RFC7159, which doesn't require UTF-8 but instead
>> for some weird reason also allow other encodings of Unicode text. And
>> on top of that it says Byte Order Mark is not allowed.
> 
> 7159 was meant to tighten the wording of 4627, not to impose additional
> constraints on it.  For that, see the I-JSON draft.

The problem I have is that 7159 is not tight enough: it allows encodings other 
than UTF-8, which in turn makes the sequence encoding not work very well, as 
this draft takes for granted that each separator character is exactly one 
byte.

I.e. the way I read draft-ietf-json-text-sequence (and I might be wrong), you 
have specific octet values that act as separators. That only works if the 
encoding is UTF-8.

See Figure 1:

> possible-JSON = 1*(not-RS); attempt to parse as UTF-8-encoded
>                                ; JSON text (see RFC7159)

Now, if this is NOT UTF-8, then this might be a pretty bad situation.
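
To make the concern concrete, here is a minimal sketch in Python. It is my own illustration, not code from the draft: the helper name split_on_rs and the sample member "name" are made up, and the splitter is deliberately naive. It cuts a byte stream at every 0x1E octet, which is how I read the framing. With UTF-8 the JSON text survives intact; with UTF-16 the letter U+1E01 contains the octet 0x1E, so the text is cut in the middle of a character.

```python
import json

RS = b"\x1e"  # record separator octet, as I read the text-sequence draft


def split_on_rs(stream):
    """Naive octet-level splitter: cut at every 0x1E byte, drop empty chunks."""
    return [chunk for chunk in stream.split(RS) if chunk]


# U+1E01 is an ordinary letter; ensure_ascii=False keeps it unescaped.
text = json.dumps({"name": "\u1e01"}, ensure_ascii=False)

# UTF-8 encodes U+1E01 as E1 B8 81, so no 0x1E octet appears in the text.
utf8_chunks = split_on_rs(RS + text.encode("utf-8"))

# UTF-16 (big endian) encodes U+1E01 as 1E 01, so the splitter cuts there.
utf16_chunks = split_on_rs(RS + text.encode("utf-16-be"))

print(len(utf8_chunks), len(utf16_chunks))
```

In UTF-8 the octet 0x1E can only ever be U+001E itself, and RFC7159 requires control characters inside strings to be escaped, so it cannot occur in a valid JSON text. No such guarantee holds for UTF-16.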

What I am saying is that I would like this draft to explicitly say that the 
only profile of RFC7159 that can be used is the one where UTF-8 is in use, 
i.e. somewhere something like "The encoding MUST be UTF-8, although RFC7159 
also allows other encodings, like UTF-16." Then, in the security 
considerations section, something like "RFC7159 allows not only UTF-8 encoding 
but also, for example, UTF-16, which MIGHT create problems for a parser, 
depending on what data is serialized."

I.e. I want this draft to be even tighter than RFC7159.

Let me ask it this way: is there any reason to allow encodings other than 
UTF-8? If so, how do you handle the encoding of the separators?

>> This together implies that first of all this draft might not lead to
>> stable implementations, secondly one can not store in JSON strings
>> that include the Byte Order Mark, and there are other unspecified
>> situations.
> 
> If by that you mean that a JSON string may not contain U+FEFF, that is
> incorrect, for U+FEFF is recognized as a BOM only when placed at the
> beginning of an entity body, whereas an entity body in JSON format can
> begin only with [ or { classically, or by extension with [0-9"tfn].

Ok, so what you are saying is that a string in an attribute value in the JSON 
blob can still start with U+FEFF?

If so, good, and my apologies for not understanding this when I read the text.
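
For what it is worth, a quick check with Python's standard json module matches that reading. This is only a sketch of how one particular parser behaves, not a statement about what the draft or RFC7159 requires, and the member name "key" is made up for the example: U+FEFF is accepted inside a string value but rejected at the very start of the JSON text.

```python
import json

# U+FEFF as the first character of a string value: parsed like any other character.
inner = json.loads('{"key": "\ufeffvalue"}')
starts_with_feff = inner["key"].startswith("\ufeff")

# U+FEFF before the opening brace: this parser treats it as a BOM and refuses it.
try:
    json.loads('\ufeff{"key": "value"}')
    leading_bom_rejected = False
except json.JSONDecodeError:
    leading_bom_rejected = True

print(starts_with_feff, leading_bom_rejected)
```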

   Patrik

_______________________________________________
Gen-art mailing list
Gen-art@ietf.org
https://www.ietf.org/mailman/listinfo/gen-art
