[ https://issues.apache.org/jira/browse/THRIFT-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151134#comment-13151134 ]
Alexis edited comment on THRIFT-1224 at 11/17/11 6:02 PM:
----------------------------------------------------------

That's a Ruby 1.9 issue. As suggested, we convert non-ASCII strings to binary before writing them.

In ~/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb, this is a patch suggestion:
{code}
def write_string(str)
  # Strings not tagged US-ASCII are unpacked into a binary (ASCII-8BIT) string,
  # so the length written below is a byte count.
  if str.encoding.to_s != "US-ASCII"
    str = str.unpack("a*").first
  end
  write_i32(str.length)
  trans.write(str)
end
{code}

was (Author: alexis779):

That's a Ruby 1.9 issue. I tried converting UTF-8 strings in write (/Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb) with something like a force_encoding("UTF-8"):
{code}
str = sz ? buf[0...sz] : buf
if str.encoding.to_s == "UTF-8"
  str = str.unpack("a*").first
end
@wbuf << str
{code}
With that change, the socket read raises an exception: "Socket: Timed out reading 4 bytes from 127.0.0.1:9160". Any clue?
{noformat}
/Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/socket.rb:109:in `read': CassandraThrift::Cassandra::Client::TransportException
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/base_transport.rb:87:in `read_all'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb:105:in `read_frame'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/transport/framed_transport.rb:69:in `read_into_buffer'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb:192:in `read_i32'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/protocol/binary_protocol.rb:118:in `read_message_begin'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift-0.7.0/lib/thrift/client.rb:45:in `receive_message'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:251:in `recv_batch_mutate'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:243:in `batch_mutate'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift_client-0.7.1/lib/thrift_client/abstract_thrift_client.rb:150:in `handled_proxy'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/thrift_client-0.7.1/lib/thrift_client/abstract_thrift_client.rb:60:in `batch_mutate'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/lib/cassandra/protocol.rb:7:in `_mutate'
    from /Users/alexis/.rvm/gems/ruby-1.9.2-p290/gems/cassandra-0.12.1/lib/cassandra/cassandra.rb:459:in `insert'
{noformat}

> Cannot insert UTF-8 text
> ------------------------
>
>                 Key: THRIFT-1224
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1224
>             Project: Thrift
>          Issue Type: Bug
>          Components: Ruby - Library
>    Affects Versions: 0.6
>         Environment: Ruby 1.9.2, Cassandra 0.8, thrift_client gem 0.6.2, cassandra gem 0.11.1
>            Reporter: Alessandro Morandi
>              Labels: charset, encoding, ruby, utf, utf-8, utf8
>             Fix For: 0.8
>
>
> I can't seem to find a way to save UTF-8 data into Cassandra.
> I'm using the cassandra gem 0.11.1 (https://github.com/fauna/cassandra/), which in turn uses thrift_client (0.6.2), which in turn uses the thrift library (0.6.0).
> As an example, the following code[1]
> bq. cassandra.insert(:Cache, "123", {"unicode string" => "ä"})
> will raise an exception:
> bq. {{Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8}}
> The stacktrace points to `thrift-0.6.0/lib/thrift/transport/framed_transport.rb:58`. What seems to be happening is that `@wbuf` is encoded as ASCII-8BIT, while `buf` is encoded as UTF-8, which causes the concatenation operation (<<) to fail with the exception above.
> This issue might be connected to https://issues.apache.org/jira/browse/THRIFT-1023.
> [1] Of course, this assumes a "cassandra" object created using the cassandra gem and a schema initialized with a column family called "Cache"

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
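
For illustration, here is a minimal standalone Ruby sketch of the encoding mismatch described in the issue and of the binary conversion used in the patch suggestion. It does not use the thrift library. The buffer contents and the helper name to_binary are assumptions made up for this example. It also shows that length and bytesize differ for a UTF-8 string, which matters for anything that writes a length prefix before the conversion happens.

```ruby
# Hypothetical stand-in for a transport write buffer: a binary
# (ASCII-8BIT) string that already holds non-ASCII bytes, as packed
# integers typically do.
wbuf = [0x80, 0x01, 0x00, 0x01].pack("C*")

# A UTF-8 string with one non-ASCII character ("ä"), written as an
# escape so the example does not depend on the source file encoding.
utf8 = "\u00E4"

# Appending non-ASCII UTF-8 text to a non-ASCII binary buffer raises.
error = nil
begin
  wbuf << utf8
rescue Encoding::CompatibilityError => e
  error = e
end
puts "append failed: #{error.class}" if error

# Hypothetical helper: a binary-tagged copy with the same bytes.
def to_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

bin = to_binary(utf8)

# Character count and byte count differ for the UTF-8 string,
# but agree once the string is tagged as binary.
puts "utf8: length=#{utf8.length} bytesize=#{utf8.bytesize}"
puts "bin:  length=#{bin.length} bytesize=#{bin.bytesize}"

# unpack("a*"), as used in the patch suggestion, yields the same
# bytes tagged as binary.
unpacked = utf8.unpack("a*").first

# The binary copy can be appended without an encoding error.
wbuf << bin
puts "buffer now holds #{wbuf.bytesize} bytes"
```

Using bytesize for the length prefix would make the byte count explicit regardless of how the string is tagged. Whether that fits the library's write_string is a design question for the maintainers.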