rok commented on code in PR #41257: URL: https://github.com/apache/arrow/pull/41257#discussion_r1585212898
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -251,6 +251,27 @@ Variable shape tensor
   Values inside each **data** tensor element are stored in row-major/C-contiguous order according to the corresponding **shape**.
 
+.. _json_extension:
+
+JSON
+====
+
+* Extension name: `arrow.json`.
+
+* The storage type of this extension is ``StringArray`` or
+  or ``LargeStringArray`` or ``StringViewArray``.
+  Only UTF-8 encoded JSON is supported.

Review Comment:
   Does this make sense, @emkornfield?

   ```suggestion
     Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.
     Non-standard encodings using Binary types are not supported.
   ```

   > In the short term we should probably recommend that implementations should likely reject Binary annotated JSON fields

   By this, do you mean serializing binary data into a UTF-8 string and including it as a String? Is there a downside to allowing it?
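
   To make the "only UTF-8 encoded JSON" wording concrete, here is a minimal sketch of the kind of check an implementation might apply to a stored value. It uses only the Python standard library; `is_utf8_json` is a hypothetical helper name chosen for illustration, not an Arrow API, and nothing in the PR prescribes this approach.

   ```python
   import json


   def _reject_constant(name: str):
       # Python's json module accepts NaN/Infinity by default, which
       # RFC 8259 does not allow, so treat them as invalid here.
       raise ValueError(f"non-standard JSON constant: {name}")


   def is_utf8_json(payload: bytes) -> bool:
       """Return True if payload is valid UTF-8 that parses as JSON text.

       Hypothetical helper for illustration only; not part of Arrow.
       """
       try:
           # Strict decoding fails on bytes that are not valid UTF-8,
           # e.g. UTF-16 encoded text or arbitrary binary data.
           text = payload.decode("utf-8")
       except UnicodeDecodeError:
           return False
       try:
           json.loads(text, parse_constant=_reject_constant)
       except ValueError:
           # json.JSONDecodeError is a subclass of ValueError.
           return False
       return True
   ```

   With this sketch, `is_utf8_json(b'{"a": 1}')` is accepted, while the same document encoded as UTF-16 is rejected at the decoding step.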