Re: avro + cassandra + ruby
Full list of client options and defaults:
https://github.com/fauna/thrift_client/blob/master/lib/thrift_client/abstract_thrift_client.rb#L28-43

On Wed, Nov 17, 2010 at 10:13 AM, Benjamin Black wrote:
> Cassandra.new(keyspace, server, {:protocol =>
> Thrift::BinaryProtocolAccelerated})
>
> On Tue, Nov 16, 2010 at 5:13 PM, Ryan King wrote:
>> On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis wrote:
>>> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King wrote:
>>>> One thing you should try is to make thrift use
>>>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>>>> (we should change the default).
>>>
>>> Dumb question time: how do you do this?
>>>
>>> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>>>
>>> in the fauna cassandra gem repo turns up no hits.
>>
>> I believe we're relying on the default from thrift_client (which
>> defaults to BinaryProtocol): https://github.com/fauna/thrift_client/
>>
>> -ryan
Re: avro + cassandra + ruby
Cassandra.new(keyspace, server, {:protocol => Thrift::BinaryProtocolAccelerated})

On Tue, Nov 16, 2010 at 5:13 PM, Ryan King wrote:
> On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis wrote:
>> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King wrote:
>>> One thing you should try is to make thrift use
>>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>>> (we should change the default).
>>
>> Dumb question time: how do you do this?
>>
>> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>>
>> in the fauna cassandra gem repo turns up no hits.
>
> I believe we're relying on the default from thrift_client (which
> defaults to BinaryProtocol): https://github.com/fauna/thrift_client/
>
> -ryan
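A minimal sketch of building that options hash with a fallback. It assumes the thrift gem may or may not be installed, and that Thrift::BinaryProtocolAccelerated is only defined when the gem's native extension loaded; the helper name cassandra_client_options is hypothetical. When the constant is missing it returns an empty hash, which leaves thrift_client on its default protocol.

```ruby
# Hypothetical helper: build the options hash for Cassandra.new, asking for
# the accelerated protocol only when this Ruby can actually load it.
def cassandra_client_options
  begin
    require "thrift"
  rescue LoadError
    return {} # thrift gem not installed; nothing to override
  end

  if defined?(Thrift::BinaryProtocolAccelerated)
    {:protocol => Thrift::BinaryProtocolAccelerated}
  else
    {} # assumption: the constant is absent without the native extension
  end
end

options = cassandra_client_options
puts "protocol option: #{options.fetch(:protocol, 'thrift_client default')}"

# With the cassandra gem loaded, the call from the message above becomes:
#   client = Cassandra.new(keyspace, server, options)
```

Passing an empty hash keeps the call shape identical in both cases, so the same script runs on machines with and without the compiled extension.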
Re: avro + cassandra + ruby
On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis wrote:
> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King wrote:
>> One thing you should try is to make thrift use
>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>> (we should change the default).
>
> Dumb question time: how do you do this?
>
> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>
> in the fauna cassandra gem repo turns up no hits.

I believe we're relying on the default from thrift_client (which
defaults to BinaryProtocol): https://github.com/fauna/thrift_client/

-ryan
Re: avro + cassandra + ruby
On Tue, Sep 28, 2010 at 6:35 PM, Ryan King wrote:
> One thing you should try is to make thrift use
> BinaryProtocolAccelerated, rather than the pure-ruby implementation
> (we should change the default).

Dumb question time: how do you do this?

$ find . -name "*.rb" |xargs grep -i binaryprotocol

in the fauna cassandra gem repo turns up no hits.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Re: avro + cassandra + ruby
On Thu, Sep 30, 2010 at 1:08 PM, Gabor Torok wrote:
> I added a comment to an existing issue:
> https://issues.apache.org/jira/browse/AVRO-537

Cool. I'll work with Jeff (who sits about 10 feet from me) to get this fixed. :)

-ryan
RE: avro + cassandra + ruby
I added a comment to an existing issue: https://issues.apache.org/jira/browse/AVRO-537 Thanks, --Gabor
RE: avro + cassandra + ruby
Coool. Would you mind opening an Avro issue for that, or should I?

-Original Message-
From: "Gabor Torok"
Sent: Thursday, September 30, 2010 2:36pm
To: "user@cassandra.apache.org"
Subject: RE: avro + cassandra + ruby

The ruby code creates a new http connection for each call to transceive.
Here is what I changed to make it work:

gabor$ diff /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb ~/avro-trunk/lang/ruby/lib/avro/ipc.rb
518d517
< require "net/http"
525d523
< @conn = Net::HTTP.start host, port
531c529,531
< resp = @conn.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
---
> resp = Net::HTTP.start(host, port) do |http|
> http.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
> end

Thanks,
--Gabor
Re: avro + cassandra + ruby
If you turn logging up to DEBUG, do you see any lines such as this when running your script?

DEBUG [pool-1-thread-22] 2010-10-01 08:22:12,723 ClientState.java (line 107) logged out: #

And out of interest, if you send two multiget_slice calls, do they both log from the same thread? e.g.

DEBUG [pool-1-thread-26] 2010-10-01 08:37:08,832 CassandraServer.java (line 264) multiget_slice
DEBUG [pool-1-thread-26] 2010-10-01 08:37:14,825 CassandraServer.java (line 264) multiget_slice

This shows they both ran on the same thread / connection.

Aaron

On 01 Oct, 2010, at 07:18 AM, Gabor Torok wrote:
> I ran the python code like Gary suggested and it succeeded. (As an aside, it
> would be nice if I could run the python code for avro without having thrift
> installed.)
> Then I re-created my ruby example in python (essentially calling set_keyspace
> and then making a multiget_slice call) and it also succeeded.
> When I try the same in ruby, it fails with the cassandra giving a
> KeyspaceNotDefinedException.
> I'll keep digging and let you know what I find.
> --Gabor
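The check Aaron describes can be scripted. A small sketch, assuming log lines shaped exactly like the DEBUG samples in this message (level, bracketed thread name, date, time, source file, line number, message); the names LOG_LINE, threads_for and SAMPLE_LOG are made up for the illustration.

```ruby
# Matches lines shaped like the DEBUG samples above:
#   LEVEL [thread] date time Source.java (line N) message
LOG_LINE = /\A(?<level>[A-Z]+)\s+\[(?<thread>[^\]]+)\]\s+\S+\s+\S+\s+(?<source>\S+)\s+\(line \d+\)\s+(?<message>.*)\z/

# Returns the distinct thread names that logged the given operation.
def threads_for(log_text, operation)
  log_text.each_line
          .map { |line| LOG_LINE.match(line.strip) }
          .compact
          .select { |m| m[:message].start_with?(operation) }
          .map { |m| m[:thread] }
          .uniq
end

SAMPLE_LOG = <<~LOG
  DEBUG [pool-1-thread-22] 2010-10-01 08:22:12,723 ClientState.java (line 107) logged out: #
  DEBUG [pool-1-thread-26] 2010-10-01 08:37:08,832 CassandraServer.java (line 264) multiget_slice
  DEBUG [pool-1-thread-26] 2010-10-01 08:37:14,825 CassandraServer.java (line 264) multiget_slice
LOG

threads = threads_for(SAMPLE_LOG, "multiget_slice")
if threads.size == 1
  puts "all multiget_slice calls ran on #{threads.first}"
else
  puts "multiget_slice calls ran on different threads: #{threads.join(', ')}"
end
```

More than one thread name for the same client session would suggest the calls arrived on separate connections.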
RE: avro + cassandra + ruby
The ruby code creates a new http connection for each call to transceive.
Here is what I changed to make it work:

gabor$ diff /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb ~/avro-trunk/lang/ruby/lib/avro/ipc.rb
518d517
< require "net/http"
525d523
< @conn = Net::HTTP.start host, port
531c529,531
< resp = @conn.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
---
> resp = Net::HTTP.start(host, port) do |http|
> http.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
> end

Thanks,
--Gabor
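The difference between the two sides of the diff can be seen without Avro or Cassandra. The sketch below is only an illustration: CountingServer is a made-up local stand-in that counts accepted TCP connections, and post_each_time / post_on_one_connection are hypothetical names for the block form and the kept-open form. The nil third argument to Net::HTTP.start disables any proxy taken from the environment.

```ruby
require "net/http"
require "socket"

# Tiny local HTTP/1.1 server that counts accepted TCP connections.
# It is a test stand-in, not an Avro responder.
class CountingServer
  attr_reader :port, :connections

  def initialize
    @listener = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick
    @port = @listener.addr[1]
    @connections = 0
    @acceptor = Thread.new { accept_loop }
  end

  def stop
    @acceptor.kill
    @acceptor.join
    @listener.close
  end

  private

  def accept_loop
    loop do
      socket = @listener.accept
      @connections += 1
      Thread.new(socket) { |s| serve(s) }
    end
  end

  # Answers every request on the socket until the client hangs up.
  def serve(socket)
    while socket.gets # request line
      length = 0
      while (header = socket.gets) && header != "\r\n"
        length = header.split(":", 2).last.to_i if header =~ /\Acontent-length:/i
      end
      socket.read(length) if length > 0
      socket.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    end
  rescue IOError, SystemCallError
    # client went away mid-request; the ensure below closes the socket
  ensure
    socket.close
  end
end

HEADERS = {"Content-Type" => "avro/binary"}

# One connection per call, like the block form in the diff.
def post_each_time(port, bodies)
  bodies.map do |body|
    Net::HTTP.start("127.0.0.1", port, nil) { |http| http.post("/", body, HEADERS).body }
  end
end

# One connection for all calls, like keeping @conn around.
def post_on_one_connection(port, bodies)
  Net::HTTP.start("127.0.0.1", port, nil) do |http|
    bodies.map { |body| http.post("/", body, HEADERS).body }
  end
end

server = CountingServer.new
post_each_time(server.port, %w[a b c])
per_call = server.connections
post_on_one_connection(server.port, %w[a b c])
reused = server.connections - per_call
puts "block form: #{per_call} connections; kept-open form: #{reused}"
server.stop
```

If the server ties session state to the connection (or to the thread serving it), only the kept-open form gives consecutive calls a chance of seeing the same state.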
Re: avro + cassandra + ruby
I ran the python code like Gary suggested and it succeeded. (As an aside, it would be nice if I could run the python code for avro without having thrift installed.) Then I re-created my ruby example in python (essentially calling set_keyspace and then making a multiget_slice call) and it also succeeded. When I try the same in ruby, it fails with the cassandra giving a KeyspaceNotDefinedException. I'll keep digging and let you know what I find. --Gabor
Re: avro + cassandra + ruby
The server exception is:

WARN 09:49:56,644 user error
org.apache.cassandra.avro.KeyspaceNotDefinedException
    at org.apache.cassandra.avro.AvroValidation.validateKeyspace(AvroValidation.java:73)
    at org.apache.cassandra.avro.AvroValidation.validateColumnParent(AvroValidation.java:121)
    at org.apache.cassandra.avro.CassandraServer.multigetSliceInternal(CassandraServer.java:311)
    at org.apache.cassandra.avro.CassandraServer.multiget_slice(CassandraServer.java:379)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.avro.specific.SpecificResponder.respond(SpecificResponder.java:93)
    at org.apache.avro.ipc.Responder.respond(Responder.java:136)
    at org.apache.avro.ipc.Responder.respond(Responder.java:88)
    at org.apache.avro.ipc.ResponderServlet.doPost(ResponderServlet.java:48)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:536)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:930)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:405)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:637)

The ruby client says:

/usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:247:in `match_schemas': undefined local variable or method `writers_scheam' for Avro::IO::DatumReader:Class (NameError)
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:295:in `read_data'
    from /usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `find'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `each'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `find'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `read_data'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:383:in `read_union'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:316:in `read_data'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:282:in `read'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:227:in `read_error'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:216:in `read_call_response'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:113:in `request'
    from ./avro_test.rb:19

"writers_scheam" is a typo bug :-)

My client code is:

avro_protocol = File.open("config/cassandra.avpr", "r").read
json_avro = JSON.parse(avro_protocol)
protocol = Avro::Protocol.parse(avro_protocol)
transport = Avro::IPC::HTTPTransceiver.new("localhost", 9160)
requestor = Avro::IPC::Requestor.new(protocol, transport)
requestor.request("set_keyspace", "keyspace" => "TMAC")
requestor.request("multiget_slice", "keys"=>...

Looking in the server code, it seems java is expecting all avro communications to be stateful (or some other way of matching threads to callers.)

Thanks,
--Gabor
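A note on why a typo like "writers_scheam" can sit unnoticed: Ruby only resolves the name when the line runs, so a misspelling on a rarely taken branch raises NameError the first time that branch is hit. The sketch below uses a hypothetical SketchReader class, not the Avro source, to show the effect.

```ruby
# Hypothetical stand-in, not the Avro source: the misspelled name sits on a
# branch that only runs when the two schemas differ.
class SketchReader
  def self.match_schemas(writers_schema, readers_schema)
    return true if writers_schema == readers_schema
    writers_scheam == readers_schema # deliberate typo, mirrors the traceback
  end
end

# Returns :ok, or the misspelled name if the rarely taken branch blew up.
def try_match(writers_schema, readers_schema)
  SketchReader.match_schemas(writers_schema, readers_schema)
  :ok
rescue NameError => e
  e.name
end

puts try_match("string", "string").inspect
puts try_match("string", "int").inspect
```

In the traceback above the failing call comes from read_error, so the bug only shows once the server has already returned an error, which hides the real server-side exception from the client.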
Re: avro + cassandra + ruby
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift
> client's read performance is pretty bad (I measured 4x slower than java).
>
> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the
> next avro call to multiget_slice runs in a different thread so it can't
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is
> a stateless transport. I also tried SocketTransport, but that died with a
> malloc exception.

Was this exception on the server or in the client? The ruby avro code
is pretty new, so the probability of bugs is pretty high.

-ryan

> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
Re: avro + cassandra + ruby
We have a system test that tests this (in avro python). See
test/system/test_avro_standard.py:TestStandardOperations.test_multiget_slice_simple.

On Wed, Sep 29, 2010 at 01:06, Gabor Torok wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift
> client's read performance is pretty bad (I measured 4x slower than java).
>
> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the
> next avro call to multiget_slice runs in a different thread so it can't
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is
> a stateless transport. I also tried SocketTransport, but that died with a
> malloc exception.
>
> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
Re: avro + cassandra + ruby
Thanks, that made things better by about 30%. Unfortunately for me that's still unacceptable... :-( I feel like I'm doing something wrong with avro (see my original post). Was anyone able to make it work?
Re: avro + cassandra + ruby
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift
> client's read performance is pretty bad (I measured 4x slower than java).

Only 4x feels like a win. :)

One thing you should try is to make thrift use
BinaryProtocolAccelerated, rather than the pure-ruby implementation
(we should change the default).

-ryan

> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the
> next avro call to multiget_slice runs in a different thread so it can't
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is
> a stateless transport. I also tried SocketTransport, but that died with a
> malloc exception.
>
> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
avro + cassandra + ruby
Hi,
I'm attempting to use avro to talk to cassandra because the ruby thrift client's read performance is pretty bad (I measured 4x slower than java).

However, I run into a problem when calling multiget_slice.
The server gives a KeyspaceNotDefinedException because clientState.getKeyspace() returns null.
It seems this is because ClientState stores the keyspace in a ThreadLocal.

I call set_keyspace and clientState stores the keyspace value. I guess the next avro call to multiget_slice runs in a different thread so it can't retrieve the value.

In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is a stateless transport. I also tried SocketTransport, but that died with a malloc exception.

Is this a problem with the ruby avro library (I use avro 1.4.0), or how the server handles avro threads?
Any help would be appreciated!

Thanks,
--Gabor
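The failure mode described in this post can be reproduced in a few lines of plain Ruby. This is a simulation under stated assumptions, not Cassandra's code: SketchServer is a hypothetical stand-in that keeps the keyspace in thread-local storage, and each "request" is run on a thread chosen by the caller.

```ruby
class KeyspaceNotDefined < StandardError; end

# Hypothetical stand-in for the server side: the keyspace lives in
# thread-local storage, as the post describes for ClientState.
class SketchServer
  def set_keyspace(name)
    Thread.current.thread_variable_set(:keyspace, name)
  end

  def multiget_slice(keys)
    keyspace = Thread.current.thread_variable_get(:keyspace)
    raise KeyspaceNotDefined if keyspace.nil?
    keys.map { |key| "#{keyspace}:#{key}" }
  end
end

# Runs the block on a fresh thread and returns its result, re-raising errors.
def on_new_thread
  Thread.new do
    Thread.current.report_on_exception = false
    yield
  end.value
end

# Each request handled by a different thread: the keyspace is lost.
def separate_threads(server)
  on_new_thread { server.set_keyspace("TMAC") }
  on_new_thread { server.multiget_slice(%w[k1 k2]) }
rescue KeyspaceNotDefined
  :keyspace_not_defined
end

# Both requests handled by the same thread: the keyspace is still there.
def same_thread(server)
  on_new_thread do
    server.set_keyspace("TMAC")
    server.multiget_slice(%w[k1 k2])
  end
end

server = SketchServer.new
p separate_threads(server)
p same_thread(server)
```

Whether two real requests share a server thread depends on the transport and the server's threading model, which is exactly the open question in the post.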