[ 
https://issues.apache.org/jira/browse/THRIFT-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139529#comment-17139529
 ] 

Emmanuel Brard commented on THRIFT-2087:
----------------------------------------

Is there any update on this story? The Python library behaves differently from
the Java one, where `new String(buf, StandardCharsets.UTF_8)`, for example,
simply replaces malformed characters instead of raising an error.
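
For illustration, here is a minimal sketch of the two behaviours in plain
Python. It assumes Python 3 (the original report was on 2.7) and does not touch
Thrift's generated code; `decode_lenient` is a hypothetical helper name, not
part of the Thrift library. It only shows that the standard `errors="replace"`
handler gives the Java-like result for the byte from the traceback.

```python
# Bytes that are not valid UTF-8: 0xf2 opens a multi-byte sequence,
# but the following bytes ("abc") are not continuation bytes.
raw = b"\xf2abc"


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for malformed input.

    Hypothetical helper for illustration only; not a Thrift API.
    """
    return data.decode("utf-8", errors="replace")


# Strict decoding (the default) raises, as in the reported traceback.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient decoding keeps the valid part and marks the bad byte.
lenient = decode_lenient(raw)
print(strict_failed, repr(lenient))
```

Whether silently replacing bytes is acceptable depends on the caller, so a
change along these lines would probably need to be opt-in.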

> unicode decode errors
> ---------------------
>
>                 Key: THRIFT-2087
>                 URL: https://issues.apache.org/jira/browse/THRIFT-2087
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Python - Compiler
>    Affects Versions: 0.9
>         Environment: Ubuntu 12.10 (GNU/Linux 3.5.0-17-generic x86_64), python 
> 2.7
>            Reporter: Aleksey Maslennikov
>            Priority: Minor
>         Attachments: thrift-2087-unicode-decode-errors.patch
>
>
> If the supplied string is not valid utf-8 there is an exception:
> self.match = iprot.readString().decode('utf-8')
>   File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
>     return codecs.utf_8_decode(input, errors, True)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0xf2 in position 0: invalid continuation byte



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
