[ 
https://issues.apache.org/jira/browse/THRIFT-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478608#comment-13478608
 ] 

XB commented on THRIFT-1727:
----------------------------

{quote}Are there places where 'convert_to_utf8_buffer' is used for things other 
than Thrift 'string' fields?{quote}
Yes: everywhere 'convert_to_utf8_buffer' is used.

Just create a Thrift "binary" field, write 
"\x01\x23\x45\x67\x89\xAB\xCD\xEF" (8 bytes) into it, serialize it, and watch 
during serialization whether 'convert_to_utf8_buffer' is called with this 
pattern. You will observe that it is.

You will see that
{quote}The problem lies within the confusion between thrift "binary" fields and 
thrift "string" fields.{quote}
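
The effect on the 8-byte pattern above can be reproduced in plain Ruby, without Thrift. This is only a sketch: 'reencode_as_utf8' is a hypothetical helper, not the library's 'convert_to_utf8_buffer', and it assumes the re-encoding behaves as the issue description says (the bytes are treated as ISO-8859-1 characters and converted to UTF-8).

```ruby
# Hypothetical stand-in for the re-encoding described in this issue:
# interpret the raw bytes as ISO-8859-1 characters, then transcode to UTF-8.
# This is NOT the Thrift library's code, only an illustration of the effect.
def reencode_as_utf8(bytes)
  bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# The 8-byte test pattern; pack("C*") yields an ASCII-8BIT (binary) string.
original = [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF].pack("C*")
corrupted = reencode_as_utf8(original)

puts original.encoding            # ASCII-8BIT
puts original.bytesize            # 8
puts corrupted.bytesize           # 12
puts corrupted.unpack("H*").first # 01234567c289c2abc38dc3af
```

The four bytes below 0x80 pass through unchanged, while 0x89, 0xAB, 0xCD and 0xEF each become a two-byte UTF-8 sequence, so the 8-byte value grows to 12 bytes.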

                
> Ruby-1.9: data loss: "binary" fields are re-encoded
> ---------------------------------------------------
>
>                 Key: THRIFT-1727
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1727
>             Project: Thrift
>          Issue Type: Bug
>          Components: Ruby - Library
>    Affects Versions: 0.9
>         Environment: JRuby 1.6.8 using "--1.9" command line parameter.
>            Reporter: XB
>
> When setting a binary field of a Thrift object with some binary data (e.g. a 
> string whose encoding is "ASCII-8BIT") and then serializing this object, the 
> binary data is re-encoded. That is, it is encoded as if it were not a 
> sequence of bytes but a sequence of characters, encoded using the ISO-8859-1 
> encoding. This assumed ISO-8859-1 sequence of characters is then converted 
> into UTF-8 (by BinaryProtocol or CompactProtocol). This basically means that 
> all bytes whose values are between 0x80 (inclusive) and 0x100 (exclusive) are 
> converted into multi-byte sequences. This leads to data corruption.

