tustvold commented on PR #4967:
URL: https://github.com/apache/arrow-rs/pull/4967#issuecomment-1773451728

   Could you provide an example of how the contents of this binary file are 
encoded? Assuming the data is valid JSON, it must be valid UTF-8, and can 
therefore be read as a regular StringArray, i.e. DataType::Utf8.
   
   Now the question is what the downstream consumer expects. E.g. if the data 
in the JSON file is actually base64 encoded, do they expect a BinaryArray with 
this data decoded? Or is labelling the data as binary just a misnomer, and the 
data is actually UTF-8 whose charset has been lost somewhere down the line?
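
   To make the two readings concrete, here is a minimal std-only sketch (not 
code from this PR). The helper names `as_utf8_bytes` and `decode_base64` are 
hypothetical, and the hand-rolled decoder only illustrates the standard padded 
alphabet; a real implementation would more likely rely on an existing base64 
library. The first function corresponds to keeping the value as a string 
(Utf8), the second to what a decoded BinaryArray would hold.

```rust
/// Reading 1: the JSON string is already the payload.
/// Its bytes are valid UTF-8 and are kept unchanged.
fn as_utf8_bytes(json_value: &str) -> Vec<u8> {
    json_value.as_bytes().to_vec()
}

/// Reading 2: the JSON string is base64 text that must be decoded.
/// Assumes the standard alphabet with '=' padding; returns None on
/// malformed input. Hypothetical helper, for illustration only.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    // Map one base64 character to its 6-bit value.
    fn sextet(c: u8) -> Option<u32> {
        match c {
            b'A'..=b'Z' => Some((c - b'A') as u32),
            b'a'..=b'z' => Some((c - b'a') as u32 + 26),
            b'0'..=b'9' => Some((c - b'0') as u32 + 52),
            b'+' => Some(62),
            b'/' => Some(63),
            _ => None,
        }
    }

    let bytes = input.as_bytes();
    if bytes.len() % 4 != 0 {
        return None;
    }
    let chunk_count = bytes.len() / 4;
    let mut out = Vec::with_capacity(chunk_count * 3);

    for (i, chunk) in bytes.chunks(4).enumerate() {
        // Padding may only appear at the end of the final chunk.
        let pad = chunk.iter().rev().take_while(|&&c| c == b'=').count();
        if pad > 2 || (pad > 0 && i + 1 != chunk_count) {
            return None;
        }
        // Pack the 6-bit values into a 24-bit group.
        let mut acc: u32 = 0;
        for &c in &chunk[..4 - pad] {
            acc = (acc << 6) | sextet(c)?;
        }
        acc <<= 6 * pad as u32;
        // Emit 3 bytes, minus one byte per padding character.
        let triple = [(acc >> 16) as u8, (acc >> 8) as u8, acc as u8];
        out.extend_from_slice(&triple[..3 - pad]);
    }
    Some(out)
}
```

   For the JSON value "aGVsbG8=", the first reading keeps the eight ASCII 
bytes of the string itself, while the second yields the five bytes of "hello". 
Which of the two the downstream expects is exactly the open question.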


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.