[
https://issues.apache.org/jira/browse/THRIFT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729145#action_12729145
]
Terry Jones commented on THRIFT-395:
------------------------------------
It seems that a decision/consensus was almost reached here, specifically
around David's suggestion (http://bit.ly/ofFr0).
Can we re-animate this issue and get it resolved? I somehow skipped this
discussion when it was going on as I knew (or thought I knew) that strings were
sent as UTF-8 and was mistakenly assuming that the Python support did the Right
Thing and that if an app passed a Python unicode object in a call you'd get a
Python unicode object out on the other end. Last night I found out to my great
surprise that that's not the case.
It would be *really* nice to have this resolved. Otherwise it's going to mean a
bunch of crufty manual encoding and decoding. And it's made worse in our case as we
have a dozen internal services that all speak to each other extensively using
Thrift. So not only do we need to deal with outside clients being able to
somehow pass unicode, we'd have to manually decode each arg in each method in
each service, and then manually encode them again to call another Thrift method
inside our own service. Either that or keep things as UTF-8 strings, which
isn't an option.
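To make the cost concrete, here is a rough sketch of the manual round trip
each method would need. It is only an illustration, not Thrift code: the
helper names (to_wire, from_wire, relay_name) are made up, and it is written
for Python 3, where str stands in for the Python 2 unicode object and bytes
for the plain byte string that the bindings currently pass through.

```python
# Hypothetical helpers, not part of Thrift: the manual UTF-8 round trip
# a service method would need if the bindings only carry byte strings.

def to_wire(value):
    """Encode text to UTF-8 bytes before handing it to a Thrift call."""
    if isinstance(value, str):
        return value.encode("utf-8")
    return value  # already bytes (or a non-string type): pass through


def from_wire(value):
    """Decode UTF-8 bytes received from a Thrift call back into text."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value  # already text (or a non-string type): pass through


def relay_name(raw_name, downstream_call):
    """Stand-in for a service method that forwards its argument onward.

    downstream_call is a placeholder for a generated Thrift client method.
    """
    name = from_wire(raw_name)                 # decode the incoming argument
    greeting = "Hello, " + name                # work with text internally
    return downstream_call(to_wire(greeting))  # encode again for the next hop
```

Multiply that by every string argument of every method in a dozen services
and the appeal of having the bindings do it once becomes obvious.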
The patches are in, and backwards compatibility is not an issue with David's
suggestion. Real users need it ASAP to avoid real pain :-) What's still
stopping this from being resolved/applied/committed?
Terry
> Python library + compiler does not support unicode strings
> ----------------------------------------------------------
>
> Key: THRIFT-395
> URL: https://issues.apache.org/jira/browse/THRIFT-395
> Project: Thrift
> Issue Type: Improvement
> Components: Compiler (Python), Library (Python)
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Fix For: 0.2
>
> Attachments:
> 0001-python-Minor-cleanup-of-protocols-don-t-use-str.patch,
> 0002-THRIFT-395.-python-Phase-One-of-support-for-unicode.patch,
> 0003-THRIFT-395.-python-Phase-Two-of-support-for-unicode.patch,
> 0004-python-Remove-ridiculous-semicolons-from-gen-code.patch,
> python-utf8-v2.patch, python-utf8.patch
>
>
> Effectively, all strings in the python bindings are treated as binary strings
> -- no encoding/decoding to UTF-8 is done. So if a unicode object is passed
> to a (regular, non-binary) string, an exception is raised.