clbarnes commented on issue #49058: URL: https://github.com/apache/arrow/issues/49058#issuecomment-3847858522
I don't know enough about the pyarrow internals to suggest how this should be implemented, and I don't have enough Cython/C++ background to make much sense of the existing code. It looks like the C++ side doesn't know or care about the encoding of a string and just treats it as bytes, so maybe that's the problem. It would be stupid to have to decode the bytes in Python on every read and encode them again on every write. Either way, this _is_ a bug, because Schema.fbs is the normative description and it requires UTF-8.
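
To make the bytes-versus-UTF-8 distinction concrete, here is a minimal sketch using only the Python standard library. The helper name `is_valid_utf8` is hypothetical and is not part of pyarrow; it only illustrates what a strict validity check on raw bytes could look like, not how Arrow does or should implement it.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` is well-formed UTF-8 (hypothetical helper)."""
    try:
        # Strict decoding raises on any malformed byte sequence.
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


utf8_bytes = "naïve".encode("utf-8")      # well-formed UTF-8
latin1_bytes = "naïve".encode("latin-1")  # same text, different encoding

print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin1_bytes))
```

A layer that treats strings purely as bytes would accept both inputs, while only the first satisfies a UTF-8 requirement.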
