You raise valid points - notably about how JSON isn't a great target in general.

On 6/17/20 8:42 AM, Serhiy Storchaka wrote:
> 1. Initial standard allowed only JSON objects and JSON arrays at the top
> level, but Python implementations allowed all. Now the standard has been
> changed.

This seems like a non-issue now, though - if we're explicitly making the decision that RFC 8259 is what we refer to as the standard Python should conform to.

> 2. Initial standard allowed binary input and suggested algorithm to
> determine the encoding (if it one of UTF-8, UTF-16, UTF-32 with
> variations). Current standard requires UTF-8 encoding. Python
> implementation uses the above algorithm (with variation). You can also
> use arbitrary explicit encoding.

Good point - but the non-conformity is on the deserialization side, and serialization/encoding is conformant (with an option to break conformity). I still feel strongly about focusing on the encoding path for now, while responsibly keeping in mind the state of the decoding/deserialization path and trying to fix it as well if some sort of RFC compatibility is indeed implemented.

> 3. Python implementation supports integers of arbitrary size. Other
> implementations can be limited to 32- or 64-bit integers.
>
> 4. Python implementation is limited to precision and range of IEEE-754
> for non-integer numbers. Other implementations can support larger
> precision and range.
>
> 5. Python implementation supports single surrogate characters in
> strings. Other implementations can be limited.

3, 4 & 5 are unfortunate effects of the JSON spec being Not Very Good. However, we can still strive to declare conformity to the spec while being potentially incompatible with some other implementations.

> 6. Python implementation can produce JSON objects with duplicated keys
> (and their order was unspecified before 3.6), for example when serialize
> {1: 1, "1": 2}.

RFC 8259 doesn't prohibit this [1] by weaseling out^W^Wsaying that they SHOULD be unique.
So this is technically conformant, but I personally wouldn't mind some way of optionally making serialization more strict in this regard.

> So there is more than one meaning in the term "strict", and it may be
> changed with changing the JSON standard.

So I think there are two separate things here:

1) Strictness wrt. the standard: with allow_nan set to False I can't see how the implementation is not strict/conformant wrt. RFC 8259, at least on the serialization side - so I don't see any issues with evolving this option into a strict/conformant mode. Naturally the question arises of whether replacements to RFC 8259 will break this compatibility - but this is possible with any standard. JSON does indeed not have the greatest history in this regard, but with RFC 8259 I feel that we can hope for a glimpse of stability and responsible future revisions.

2) Compatibility with other implementations: as you noted in your points 3, 4, 5 and 6, even strict conformity to the standard does not guarantee interoperability with other implementations for some data. However, I still think that this isn't a problem if strictness/conformity is explicitly defined as conforming to a given standard, and the documentation explains that even then there is no guarantee of interoperability.

Perhaps 'strict' isn't the best term because of its vagueness - but would you agree that some sort of named, conformant-with-RFC-8259 option (eg. 'conformant', or 'rfc_compat{,ible,ibility}') would be a better choice than 'allow_nan=False'?
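
To make the serialization-side behaviour concrete, here is a minimal sketch using only the stdlib json module. Note that dumps_rfc8259 is a hypothetical name I made up for illustration, not an existing or proposed API; it simply forces allow_nan=False, and it does not address the duplicate-name case from point 6.

```python
import json

# Default behaviour: NaN is written as a bare token, which is not valid
# JSON under RFC 8259.
lenient = json.dumps(float("nan"))

# With allow_nan=False the encoder refuses non-finite floats instead.
try:
    json.dumps(float("nan"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True

# Point 6: the int key 1 is coerced to the string "1" and collides with
# the existing str key, so the output repeats a name.
duplicated = json.dumps({1: 1, "1": 2})

# Python's own decoder keeps the last value for a repeated name; other
# implementations may behave differently.
roundtrip = json.loads(duplicated)


def dumps_rfc8259(obj, **kwargs):
    """Hypothetical named wrapper: json.dumps with allow_nan forced off."""
    kwargs["allow_nan"] = False
    return json.dumps(obj, **kwargs)
```

A named option would spell out the intent at the call site, which 'allow_nan=False' on its own does not.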

[1] - https://tools.ietf.org/html/rfc8259#section-4 “The names within an object SHOULD be unique.”