Joris Van den Bossche created ARROW-15552:
---------------------------------------------

             Summary: [Docs][Format] Unclear wording about base64 encoding 
requirement of metadata values
                 Key: ARROW-15552
                 URL: https://issues.apache.org/jira/browse/ARROW-15552
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Documentation, Format
            Reporter: Joris Van den Bossche


The C Data Interface docs indicate that the values in key-value metadata should 
be base64 encoded. This is mentioned in the section about which key-value 
metadata to use for extension types 
(https://arrow.apache.org/docs/format/CDataInterface.html#extension-arrays):

bq. The base64 encoding of metadata values ensures that any possible 
serialization is representable.

This might not be fully correct, though, or at least the current wording 
implies a requirement that may not exist. A binary blob (like a serialized 
schema) can be base64 encoded, as we do when putting the Arrow schema in the 
Parquet metadata, but is that actually required?
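
To make the question concrete, here is a minimal sketch. It assumes the 
length-prefixed layout that the C Data Interface spec describes for 
ArrowSchema.metadata, as I understand it: an int32 pair count, then an int32 
byte length before each key and each value, all in native endianness. The 
helper names encode_metadata and decode_metadata are hypothetical and are not 
part of any Arrow API. Under that assumption, a value holding arbitrary bytes 
(including NUL) round-trips without base64, and base64 is just one optional 
way to produce the value bytes.

```python
import base64
import struct


def encode_metadata(pairs):
    """Pack (key, value) byte pairs as: int32 count, then for each pair
    int32 key length, key bytes, int32 value length, value bytes.
    Assumed layout; "=i" is a 4-byte int in native byte order."""
    out = [struct.pack("=i", len(pairs))]
    for key, value in pairs:
        out.append(struct.pack("=i", len(key)))
        out.append(key)
        out.append(struct.pack("=i", len(value)))
        out.append(value)
    return b"".join(out)


def decode_metadata(buf):
    """Inverse of encode_metadata: returns a list of (key, value) byte pairs."""
    pos = 0

    def read_int():
        nonlocal pos
        (n,) = struct.unpack_from("=i", buf, pos)
        pos += 4
        return n

    def read_bytes(n):
        nonlocal pos
        chunk = buf[pos:pos + n]
        pos += n
        return chunk

    pairs = []
    for _ in range(read_int()):
        key = read_bytes(read_int())
        value = read_bytes(read_int())
        pairs.append((key, value))
    return pairs


# A value containing every possible byte, stored as-is (no base64).
raw_value = bytes(range(256))
pairs = [(b"ARROW:extension:metadata", raw_value)]
roundtrip = decode_metadata(encode_metadata(pairs))

# The same value stored base64 encoded; the reader must then decode it.
b64_pairs = [(b"ARROW:extension:metadata", base64.b64encode(raw_value))]
b64_roundtrip = decode_metadata(encode_metadata(b64_pairs))
```

Both variants recover the original bytes. The difference is only whether the 
consumer has to know that a base64 decode step is expected, which is why the 
docs wording matters.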

cc [~apitrou]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
