progger-dev commented on code in PR #41257:
URL: https://github.com/apache/arrow/pull/41257#discussion_r1575445365

##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -251,6 +251,27 @@ Variable shape tensor
   Values inside each **data** tensor element are stored in row-major/C-contiguous order according to the corresponding **shape**.
+.. _json_extension:
+
+JSON
+====
+
+* Extension name: `arrow.json`.
+
+* The storage type of this extension is ``StringArray`` or
+  or ``LargeStringArray`` or ``StringViewArray``.
+  Only UTF-8 encoded JSON is supported.
+
+* Extension type parameters:
+
+  This type does not have any parameters.
+
+* Description of the serialization:
+
+  Metadata is either an empty string or a JSON string with an empty object.
+  In the future, additional fields may be added, but they are not required
+  to interpret the array.

Review Comment:
   I totally agree. I don't think we should canonicalize these non-standard extensions, and I don't think producers should generate non-standard JSON. Consumers, however, should be allowed to parse non-standard JSON. This is consistent with [RFC 8259, section 9](https://datatracker.ietf.org/doc/html/rfc8259#section-9), which requires a parser to accept every text that conforms to the JSON grammar and permits it to accept non-JSON forms or extensions.

   This also raises the question of what we should do when JSON parsing fails.
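
   To make the producer/consumer asymmetry concrete, here is a minimal sketch using only Python's standard `json` module. It is not Arrow API: the names `produce`, `consume`, and the `strict` flag are hypothetical, and returning an `(ok, value)` pair on failure is just one possible policy for the open question above, not a proposal for the spec. It relies on the fact that Python's `json.loads` accepts the non-standard constants `NaN`, `Infinity`, and `-Infinity` by default.

```python
import json


def produce(value):
    """Producer side (hypothetical helper): emit only RFC 8259 JSON.

    allow_nan=False makes json.dumps raise ValueError for NaN/Infinity
    instead of writing the non-standard tokens.
    """
    return json.dumps(value, allow_nan=False)


def _reject_constant(name):
    # Called by json.loads for 'NaN', 'Infinity', and '-Infinity'.
    raise ValueError(f"non-standard JSON constant: {name}")


def consume(text, strict=False):
    """Consumer side (hypothetical helper): lenient unless strict=True.

    Returns (True, parsed_value) on success and (False, None) on failure,
    so a parsed JSON null can be told apart from a parse error.
    """
    try:
        if strict:
            return True, json.loads(text, parse_constant=_reject_constant)
        return True, json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False, None
```

   With this sketch, `consume('{"x": NaN}')` succeeds while `consume('{"x": NaN}', strict=True)` reports a failure, and `produce(float("nan"))` raises rather than writing non-standard output.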