progger-dev commented on code in PR #41257:
URL: https://github.com/apache/arrow/pull/41257#discussion_r1575445365


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -251,6 +251,27 @@ Variable shape tensor
    Values inside each **data** tensor element are stored in row-major/C-contiguous
    order according to the corresponding **shape**.
 
+.. _json_extension:
+
+JSON
+====
+
+* Extension name: `arrow.json`.
+
+* The storage type of this extension is ``StringArray``
+  or ``LargeStringArray`` or ``StringViewArray``.
+  Only UTF-8 encoded JSON is supported.
+
+* Extension type parameters:
+
+  This type does not have any parameters.
+
+* Description of the serialization:
+
+  Metadata is either an empty string or a JSON string with an empty object.
+  In the future, additional fields may be added, but they are not required
+  to interpret the array.

Review Comment:
   I totally agree. I don't think we should canonicalize these non-standard 
extensions, and I don't think producers should generate non-standard JSON. I 
do think consumers should be allowed to parse non-standard JSON. This is 
consistent with 
[RFC 8259, section 9](https://datatracker.ietf.org/doc/html/rfc8259#section-9), 
which requires a parser to accept all texts that conform to the JSON grammar 
and permits it to accept non-JSON forms or extensions.
   
   This also raises the question of what we do when JSON parsing fails.
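
To make the producer/consumer asymmetry and the failure question concrete, here is a minimal sketch using only Python's standard `json` module. The helper names `parse_json_value` and `emit_json_value`, the `strict` flag, and the `on_error` policy values are hypothetical and not part of any Arrow API. Mapping a failed parse to `None` is just one possible policy, not something the proposed spec text decides.

```python
import json
import math


def _reject_constant(name):
    # The json module calls this hook for the tokens NaN, Infinity and
    # -Infinity, which are not part of the RFC 8259 grammar.
    raise ValueError(f"non-standard JSON constant: {name}")


def parse_json_value(text, strict=False, on_error="null"):
    """Hypothetical consumer-side helper (an assumption for illustration).

    strict=False accepts the extensions that Python's parser tolerates by
    default (NaN, Infinity); strict=True rejects them.
    on_error="null" maps a failed parse to None; "raise" re-raises.
    """
    try:
        if strict:
            return json.loads(text, parse_constant=_reject_constant)
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if on_error == "raise":
            raise
        return None


def emit_json_value(value):
    """Hypothetical producer-side helper: refuse to write non-standard JSON."""
    # allow_nan=False makes json.dumps raise ValueError for NaN and
    # infinities instead of writing the non-standard tokens.
    return json.dumps(value, allow_nan=False)
```

One caveat with the `on_error="null"` policy in this sketch: a failed parse and the valid JSON text `null` both come back as `None`, so a real consumer would need a separate validity signal if it has to tell them apart.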


