westonpace commented on code in PR #41257: URL: https://github.com/apache/arrow/pull/41257#discussion_r1589739730
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -251,6 +251,27 @@ Variable shape tensor
 Values inside each **data** tensor element are stored in row-major/C-contiguous order according to the corresponding **shape**.
 
+.. _json_extension:
+
+JSON
+====
+
+* Extension name: `arrow.json`.
+
+* The storage type of this extension is ``StringArray`` or
+  or ``LargeStringArray`` or ``StringViewArray``.
+  Only UTF-8 encoded JSON is supported.

Review Comment:
   > By this you mean serializing binary data into UTF-8 string and including it as a String? What would be a downside to allowing that?

   I think the argument was that we might want to support `BinaryArray` as a storage type, since a JSON document could be encoded with a non-UTF-8 encoding and thus should not be stored in a `StringArray`. However, I agree we probably don't want to worry about this, since the RFC is pretty clear that JSON must be UTF-8, and users can always make up their own extension type if they need to.
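
   To make the storage-type argument concrete, here is a minimal sketch using only the Python standard library. The helper name `is_valid_utf8_json` is hypothetical and is not part of Arrow or of this PR. It only illustrates that a JSON document serialized as UTF-16 is not valid UTF-8, so it could not go into a string-typed array as-is and would need binary storage or a user-defined extension type.

   ```python
   import json


   def is_valid_utf8_json(raw: bytes) -> bool:
       """Hypothetical helper: True if `raw` is UTF-8 text that parses as JSON."""
       try:
           # Decode explicitly: json.loads() on bytes would auto-detect
           # UTF-16/UTF-32 and hide the distinction this sketch is about.
           json.loads(raw.decode("utf-8"))
       except (UnicodeDecodeError, json.JSONDecodeError):
           return False
       return True


   doc = {"name": "café"}
   utf8_bytes = json.dumps(doc, ensure_ascii=False).encode("utf-8")
   utf16_bytes = json.dumps(doc, ensure_ascii=False).encode("utf-16")

   print(is_valid_utf8_json(utf8_bytes))   # UTF-8 payload: fits string storage
   print(is_valid_utf8_json(utf16_bytes))  # UTF-16 payload: not valid UTF-8
   ```

   Under the wording proposed in the diff, only the first payload would qualify for `arrow.json`. The second is the case that a `BinaryArray` storage type would have been needed for.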