[ 
https://issues.apache.org/jira/browse/THRIFT-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151134#comment-13151134
 ] 

Alexis edited comment on THRIFT-1224 at 11/17/11 6:02 PM:
----------------------------------------------------------

That's a Ruby 1.9 issue.
As suggested, we convert non-ASCII strings to binary before writing them. 
Here is a suggested patch for 
~/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb:

{code}
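def write_string(str)
  # Ruby 1.9 strings carry an encoding. Reinterpret anything that is not
  # US-ASCII as raw bytes (ASCII-8BIT) so it can be appended to the
  # binary write buffer without an Encoding::CompatibilityError.
  if str.encoding.to_s != "US-ASCII"
    str = str.unpack("a*").first
  end
  # After the conversion, length counts bytes rather than characters.
  write_i32(str.length)
  trans.write(str)
end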
{code}
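
To illustrate what the patch relies on, here is a small standalone sketch. It is not part of the thrift gem, and `to_binary` is a hypothetical helper name used only for this example. It shows that `unpack("a*").first` keeps the bytes of the string but retags it as ASCII-8BIT, so that `length` counts bytes, which is what the length prefix written by `write_i32` needs:

```ruby
# Hypothetical helper for illustration only; mirrors the check in the patch.
def to_binary(str)
  return str if str.encoding == Encoding::US_ASCII
  # "a*" returns the whole string as raw bytes, tagged ASCII-8BIT.
  str.unpack("a*").first
end

text = "\u00E4"        # "ä" as a UTF-8 string: one character, two bytes
raw  = to_binary(text)

puts text.length       # 1 (characters)
puts text.bytesize     # 2 (bytes)
puts raw.length        # 2 (bytes, since the string is now binary)
puts raw.bytes == text.bytes   # true, the underlying bytes are unchanged
```

Assuming the goal is only to retag the string, `str.dup.force_encoding("BINARY")` together with `str.bytesize` would be another way to get the same bytes and length.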

                
was (Author: alexis779):
That's a Ruby 1.9 issue.
Trying to convert UTF-8 strings in write in 
/Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb 
with something like a force_encoding("UTF-8"):

{code}
str = sz ? buf[0...sz] : buf
if str.encoding.to_s == "UTF-8"
  str = str.unpack("a*").first
end
@wbuf << str
{code}

the socket read fails with an exception: "Socket: Timed out reading 4 bytes 
from 127.0.0.1:9160". Any clue?

{noformat}
/Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/socket.rb:109:in `read': CassandraThrift::Cassandra::Client::TransportException
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/base_transport.rb:87:in `read_all'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb:105:in `read_frame'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb:69:in `read_into_buffer'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb:192:in `read_i32'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb:118:in `read_message_begin'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/client.rb:45:in `receive_message'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:251:in `recv_batch_mutate'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:243:in `batch_mutate'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift_client-0.7.1/lib/thrift_client/abstract_thrift_client.rb:150:in `handled_proxy'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift_client-0.7.1/lib/thrift_client/abstract_thrift_client.rb:60:in `batch_mutate'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/lib/cassandra/protocol.rb:7:in `_mutate'
        from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/lib/cassandra/cassandra.rb:459:in `insert'
{noformat} 
                  
> Cannot insert UTF-8 text
> ------------------------
>
>                 Key: THRIFT-1224
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1224
>             Project: Thrift
>          Issue Type: Bug
>          Components: Ruby - Library
>    Affects Versions: 0.6
>         Environment: Ruby 1.9.2, Cassandra 0.8, thrift_client gem 0.6.2, 
> cassandra gem 0.11.1
>            Reporter: Alessandro Morandi
>              Labels: charset, encoding, ruby, utf, utf-8, utf8
>             Fix For: 0.8
>
>
> I can't seem to find a way to save UTF-8 data into Cassandra.
> I'm using the cassandra gem 0.11.1 (https://github.com/fauna/cassandra/) 
> which in turn uses thrift_client (0.6.2), which in turn uses the thrift 
> library (0.6.0).
> As an example, the following code[1]
> bq. cassandra.insert(:Cache, "123", {"unicode string" => "ä"})
> will raise an exception:
> bq.    {{Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8}}
> The stacktrace points to 
> `thrift-0.6.0/lib/thrift/transport/framed_transport.rb:58`. What seems to be 
> happening is that `@wbuf` is encoded as ASCII-8BIT, while `buf` is encoded as 
> UTF-8, which causes the concatenation operation (<<) to fail with the 
> exception above.
> This issue might be connected to 
> https://issues.apache.org/jira/browse/THRIFT-1023.
> [1] Of course, this assumes a "cassandra" object created using the cassandra 
> gem and a schema initialized with a column family called "Cache".
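
The concatenation failure described in the report can be reproduced outside Thrift. The following is a minimal sketch under stated assumptions: `wbuf` and `buf` are plain local variables standing in for the transport's `@wbuf` and the string being written, and the four header bytes are only an example of binary data with a high byte already sitting in the buffer. It is not the library's code.

```ruby
# Stand-in for the framed transport's write buffer: an ASCII-8BIT string
# that already holds some non-ASCII binary data (example header bytes).
wbuf = String.new
wbuf << [0x80010001].pack("N")

# Stand-in for a UTF-8 payload containing a non-ASCII character ("ä").
buf = "\u00E4"

error = nil
begin
  wbuf << buf            # ASCII-8BIT with high bytes << UTF-8 with non-ASCII
rescue Encoding::CompatibilityError => e
  error = e
end
puts error.class         # Encoding::CompatibilityError

# Retagging the payload as binary first makes the append succeed.
wbuf << buf.dup.force_encoding(Encoding::BINARY)
puts wbuf.bytesize       # 6 (4 header bytes + 2 payload bytes)
```

Note that the error only appears once both sides contain non-ASCII bytes; appending the same UTF-8 string to an empty or pure-ASCII binary buffer does not raise.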


