[
https://issues.apache.org/jira/browse/THRIFT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688347#action_12688347
]
Jonathan Ellis commented on THRIFT-395:
---------------------------------------
In other words, my patch conforms to normal UTF-8-supporting behavior:
1. pass a unicode object to a `string` field -> works
2. pass a binary string containing only ASCII characters to a `string` field -> works
3. pass binary data to a `binary` field -> works
4. pass arbitrary binary data to a `string` field -> doesn't work
Pre-patch, the Python API allowed case 4, but this was a bug: any server
conforming to the Thrift wire protocol (i.e. anything but another buggy
Python server) would try to decode the bytes as UTF-8 and get garbage.
Switching the field from `string` to `binary` is the right fix for code in
this situation.
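
To make the four cases concrete, here is a minimal sketch of the behavior described above. It is not the code from python-utf8.patch: the helper names `encode_string_field` and `encode_binary_field` are hypothetical, and it is written for Python 3, where `str` plays the role of the unicode object and `bytes` the role of the binary string. It also assumes that a byte string is acceptable for a `string` field whenever it is valid UTF-8, of which ASCII is a subset.

```python
def encode_string_field(value):
    """Return the bytes to send for a `string` field (hypothetical helper)."""
    if isinstance(value, str):
        # Case 1: text is encoded to UTF-8 for the wire.
        return value.encode("utf-8")
    if isinstance(value, bytes):
        # Cases 2 and 4: bytes pass through only if they are valid UTF-8.
        # decode() raises UnicodeDecodeError for arbitrary binary data.
        value.decode("utf-8")
        return value
    raise TypeError("string field expects str or bytes")


def encode_binary_field(value):
    """Return the bytes to send for a `binary` field (hypothetical helper)."""
    if not isinstance(value, bytes):
        raise TypeError("binary field expects bytes")
    # Case 3: binary data is sent untouched, with no encoding step.
    return value


if __name__ == "__main__":
    encode_string_field("caf\u00e9")       # case 1: works
    encode_string_field(b"hello")          # case 2: works
    encode_binary_field(b"\xff\xfe\x00")   # case 3: works
    try:
        encode_string_field(b"\xff\xfe\x00")  # case 4: rejected
    except UnicodeDecodeError:
        print("case 4 rejected; use a `binary` field instead")
```

Failing early on the sending side, as in case 4 here, surfaces the problem where the bad data originates instead of leaving a conforming server to decode garbage.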
> Python library + compiler does not support unicode strings
> ----------------------------------------------------------
>
> Key: THRIFT-395
> URL: https://issues.apache.org/jira/browse/THRIFT-395
> Project: Thrift
> Issue Type: Bug
> Components: Compiler (Python)
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Priority: Blocker
> Attachments: python-utf8.patch
>
>
> Effectively, all strings in the Python bindings are treated as binary strings
> -- no encoding/decoding to UTF-8 is done. So if a unicode object is passed
> to a (regular, non-binary) string field, an exception is raised.