Re: avro + cassandra + ruby

2010-11-17 Thread Benjamin Black
Full list of client options and defaults:
https://github.com/fauna/thrift_client/blob/master/lib/thrift_client/abstract_thrift_client.rb#L28-43

On Wed, Nov 17, 2010 at 10:13 AM, Benjamin Black  wrote:
> Cassandra.new(keyspace, server, {:protocol =>
> Thrift::BinaryProtocolAccelerated})
>
> On Tue, Nov 16, 2010 at 5:13 PM, Ryan King  wrote:
>> On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis  wrote:
>>> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King  wrote:
>>>> One thing you should try is to make thrift use
>>>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>>>> (we should change the default).
>>>
>>> Dumb question time: how do you do this?
>>>
>>> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>>>
>>> in the fauna cassandra gem repo turns up no hits.
>>
>> I believe we're relying on the default from thrift_client (which
>> defaults to BinaryProtocol): https://github.com/fauna/thrift_client/
>>
>> -ryan
>>
>


Re: avro + cassandra + ruby

2010-11-17 Thread Benjamin Black
Cassandra.new(keyspace, server, {:protocol =>
Thrift::BinaryProtocolAccelerated})
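
For readers wondering why the grep in the quoted message finds nothing while the option still works, here is a minimal sketch of the options-over-defaults pattern, assuming the client simply merges the caller's hash over a defaults hash. FakeThrift, FakeClient and the :retries key are hypothetical stand-ins, not the real thrift or thrift_client classes; only the :protocol key and the two protocol names come from this thread.

```ruby
# Hypothetical stand-ins for the two protocol classes named in the thread.
module FakeThrift
  class BinaryProtocol; end
  class BinaryProtocolAccelerated; end
end

# Hypothetical client that merges caller options over its defaults.
class FakeClient
  DEFAULTS = { :protocol => FakeThrift::BinaryProtocol, :retries => 0 }.freeze

  attr_reader :options

  def initialize(keyspace, server, options = {})
    @keyspace = keyspace
    @server = server
    # Caller-supplied keys win; everything else keeps its default.
    @options = DEFAULTS.merge(options)
  end
end

default_client = FakeClient.new("Keyspace1", "127.0.0.1:9160")
fast_client = FakeClient.new("Keyspace1", "127.0.0.1:9160",
                             { :protocol => FakeThrift::BinaryProtocolAccelerated })

puts default_client.options[:protocol]
puts fast_client.options[:protocol]
```

Under that assumption the protocol name never has to appear in the wrapper gem's own source, which would be consistent with the empty grep result.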

On Tue, Nov 16, 2010 at 5:13 PM, Ryan King  wrote:
> On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis  wrote:
>> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King  wrote:
>>> One thing you should try is to make thrift use
>>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>>> (we should change the default).
>>
>> Dumb question time: how do you do this?
>>
>> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>>
>> in the fauna cassandra gem repo turns up no hits.
>
> I believe we're relying on the default from thrift_client (which
> defaults to BinaryProtocol): https://github.com/fauna/thrift_client/
>
> -ryan
>


Re: avro + cassandra + ruby

2010-11-16 Thread Ryan King
On Tue, Nov 16, 2010 at 10:25 AM, Jonathan Ellis  wrote:
> On Tue, Sep 28, 2010 at 6:35 PM, Ryan King  wrote:
>> One thing you should try is to make thrift use
>> BinaryProtocolAccelerated, rather than the pure-ruby implementation
>> (we should change the default).
>
> Dumb question time: how do you do this?
>
> $ find . -name "*.rb" |xargs grep -i binaryprotocol
>
> in the fauna cassandra gem repo turns up no hits.

I believe we're relying on the default from thrift_client (which
defaults to BinaryProtocol): https://github.com/fauna/thrift_client/

-ryan


Re: avro + cassandra + ruby

2010-11-16 Thread Jonathan Ellis
On Tue, Sep 28, 2010 at 6:35 PM, Ryan King  wrote:
> One thing you should try is to make thrift use
> BinaryProtocolAccelerated, rather than the pure-ruby implementation
> (we should change the default).

Dumb question time: how do you do this?

$ find . -name "*.rb" |xargs grep -i binaryprotocol

in the fauna cassandra gem repo turns up no hits.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: avro + cassandra + ruby

2010-09-30 Thread Ryan King
On Thu, Sep 30, 2010 at 1:08 PM, Gabor Torok
 wrote:
> I added a comment to an existing issue:
> https://issues.apache.org/jira/browse/AVRO-537

Cool. I'll work with Jeff (who sits about 10 feet from me) to get this fixed. :)

-ryan


RE: avro + cassandra + ruby

2010-09-30 Thread Gabor Torok
I added a comment to an existing issue:
https://issues.apache.org/jira/browse/AVRO-537

Thanks,
--Gabor


RE: avro + cassandra + ruby

2010-09-30 Thread Stu Hood
Cool. Would you mind opening an Avro issue for that, or should I?

-----Original Message-----
From: "Gabor Torok" 
Sent: Thursday, September 30, 2010 2:36pm
To: "user@cassandra.apache.org" 
Subject: RE: avro + cassandra + ruby

The ruby code creates a new http connection for each call to transceive. Here 
is what I changed to make it work:

gabor$ diff /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb ~/avro-trunk/lang/ruby/lib/avro/ipc.rb
518d517
<   require "net/http"
525d523
<   @conn = Net::HTTP.start host, port
531c529,531
<   resp = @conn.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
---
>   resp = Net::HTTP.start(host, port) do |http|
>     http.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
>   end

Thanks,
--Gabor




Re: avro + cassandra + ruby

2010-09-30 Thread Aaron Morton
If you turn logging up to DEBUG do you see any lines such as this when running 
your script.

DEBUG [pool-1-thread-22] 2010-10-01 08:22:12,723 ClientState.java (line 107) logged out: #

And out of interest, if you send two multiget_slice calls, do they both log 
from the same thread? e.g.

DEBUG [pool-1-thread-26] 2010-10-01 08:37:08,832 CassandraServer.java (line 264) multiget_slice
DEBUG [pool-1-thread-26] 2010-10-01 08:37:14,825 CassandraServer.java (line 264) multiget_slice

Shows they both ran on the same thread / connection.

Aaron

On 01 Oct, 2010, at 07:18 AM, Gabor Torok wrote:
> I ran the python code like Gary suggested and it succeeded. (As an aside, it 
> would be nice if I could run the python code for avro without having thrift 
> installed.)
>
> Then I re-created my ruby example in python (essentially calling set_keyspace 
> and then making a multiget_slice call) and it also succeeded.
>
> When I try the same in ruby, it fails with the cassandra giving a 
> KeyspaceNotDefinedException.
>
> I'll keep digging and let you know what I find.
> --Gabor
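
The same-thread check suggested in this message can be done mechanically. The sketch below assumes the DEBUG line layout is exactly the one in the two multiget_slice sample lines quoted here; the LOG_LINE pattern and the threads_for helper are hypothetical names introduced for illustration.

```ruby
# Matches lines shaped like the samples in this message:
#   LEVEL [thread] date time Source.java (line N) message
LOG_LINE = /\A(?<level>[A-Z]+)\s+\[(?<thread>[^\]]+)\]\s+(?<time>\S+ \S+)\s+(?<source>\S+) \(line (?<line>\d+)\)\s+(?<message>.*)\z/

# Returns the thread name of every log line whose message starts with `message`.
def threads_for(log_text, message)
  matches = log_text.each_line.map { |line| LOG_LINE.match(line.strip) }.compact
  matches.select { |m| m[:message].start_with?(message) }.map { |m| m[:thread] }
end

sample = <<~LOG
  DEBUG [pool-1-thread-26] 2010-10-01 08:37:08,832 CassandraServer.java (line 264) multiget_slice
  DEBUG [pool-1-thread-26] 2010-10-01 08:37:14,825 CassandraServer.java (line 264) multiget_slice
LOG

threads = threads_for(sample, "multiget_slice")
same_thread = threads.uniq.size == 1

puts "threads seen: #{threads.uniq.join(', ')}"
puts "same thread for every call: #{same_thread}"
```

If the calls show up under different thread names, that would point toward the per-thread keyspace state being lost between requests.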


RE: avro + cassandra + ruby

2010-09-30 Thread Gabor Torok
The ruby code creates a new http connection for each call to transceive. Here 
is what I changed to make it work:

gabor$ diff /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb ~/avro-trunk/lang/ruby/lib/avro/ipc.rb
518d517
<   require "net/http"
525d523
<   @conn = Net::HTTP.start host, port
531c529,531
<   resp = @conn.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
---
>   resp = Net::HTTP.start(host, port) do |http|
>     http.post('/', writer.to_s, {'Content-Type' => 'avro/binary'})
>   end
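
To make the difference between the two sides of this diff concrete, here is a rough, stdlib-only sketch that counts TCP connections for each style: a new Net::HTTP.start block per call (the ">" side) versus one connection held open and reused (the "<" side). CountingServer is a hypothetical toy server written for this illustration and has nothing to do with Cassandra or Avro; only the Net::HTTP calls, the '/' path and the 'avro/binary' content type mirror the diff.

```ruby
require "net/http"
require "socket"

# Toy HTTP server (hypothetical): counts accepted TCP connections and
# answers every request with "ok", keeping the connection open.
class CountingServer
  attr_reader :port, :connections

  def initialize
    @listener = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick one
    @port = @listener.addr[1]
    @connections = 0
    @acceptor = Thread.new { accept_loop }
  end

  def stop
    @acceptor.kill
    @listener.close
  end

  private

  def accept_loop
    loop do
      socket = @listener.accept
      @connections += 1
      Thread.new(socket) { |s| serve(s) }
    end
  end

  # Handles any number of requests on one connection until the client closes it.
  def serve(socket)
    while socket.gets # request line
      length = 0
      while (header = socket.gets) && header != "\r\n"
        length = header.split(":", 2)[1].to_i if header =~ /\Acontent-length:/i
      end
      socket.read(length) if length > 0
      socket.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    end
  ensure
    socket.close
  end
end

server = CountingServer.new
headers = { "Content-Type" => "avro/binary" }

# Style 1: a fresh connection for every call.
2.times do
  Net::HTTP.start("127.0.0.1", server.port) { |http| http.post("/", "payload", headers) }
end
per_call = server.connections

# Style 2: one connection opened once and reused for every call.
conn = Net::HTTP.start("127.0.0.1", server.port)
2.times { conn.post("/", "payload", headers) }
conn.finish
persistent = server.connections - per_call

server.stop
puts "per-call style opened #{per_call} connections"
puts "persistent style opened #{persistent} connection(s)"
```

A server that ties session state to the connection (or to the thread serving it) would only see consecutive calls as one session in the second style.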

Thanks,
--Gabor


Re: avro + cassandra + ruby

2010-09-30 Thread Gabor Torok
I ran the python code like Gary suggested and it succeeded. (As an aside, it 
would be nice if I could run the python code for avro without having thrift 
installed.)

Then I re-created my ruby example in python (essentially calling set_keyspace 
and then making a multiget_slice call) and it also succeeded.

When I try the same in ruby, it fails with the cassandra giving a 
KeyspaceNotDefinedException.

I'll keep digging and let you know what I find.
--Gabor


Re: avro + cassandra + ruby

2010-09-30 Thread Gabor Torok
The server exception is:

WARN 09:49:56,644 user error
org.apache.cassandra.avro.KeyspaceNotDefinedException
    at org.apache.cassandra.avro.AvroValidation.validateKeyspace(AvroValidation.java:73)
    at org.apache.cassandra.avro.AvroValidation.validateColumnParent(AvroValidation.java:121)
    at org.apache.cassandra.avro.CassandraServer.multigetSliceInternal(CassandraServer.java:311)
    at org.apache.cassandra.avro.CassandraServer.multiget_slice(CassandraServer.java:379)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.avro.specific.SpecificResponder.respond(SpecificResponder.java:93)
    at org.apache.avro.ipc.Responder.respond(Responder.java:136)
    at org.apache.avro.ipc.Responder.respond(Responder.java:88)
    at org.apache.avro.ipc.ResponderServlet.doPost(ResponderServlet.java:48)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:536)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:930)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:405)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:637)

The ruby client says:  

/usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:247:in `match_schemas': undefined local variable or method `writers_scheam' for Avro::IO::DatumReader:Class (NameError)
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:295:in `read_data'
    from /usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `find'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `each'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `find'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:294:in `read_data'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:383:in `read_union'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:316:in `read_data'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/io.rb:282:in `read'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:227:in `read_error'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:216:in `read_call_response'
    from /usr/local/lib/ruby/gems/1.8/gems/avro-1.4.0/lib/avro/ipc.rb:113:in `request'
    from ./avro_test.rb:19
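
As an aside on the NameError at the top of this trace: in Ruby a misspelled identifier is only detected when the line containing it actually runs, so a typo on a rarely used branch can stay hidden until that branch is hit. The sketch below is a hypothetical method written for illustration, not the Avro gem's code; only the misspelling itself is taken from the trace.

```ruby
# Hypothetical method: the misspelling sits on the branch that only runs
# when the two schemas differ.
def schemas_compatible?(writers_schema, readers_schema)
  if writers_schema == readers_schema
    true
  else
    # Typo on purpose: Ruby raises NameError here, but only when reached.
    writers_scheam.nil?
  end
end

happy_path = schemas_compatible?("string", "string")

error_path = begin
  schemas_compatible?("string", "long")
rescue NameError => e
  e.class
end

puts "matching schemas: #{happy_path}"
puts "differing schemas: #{error_path}"
```

In the trace the failing frame is reached through read_error, so the client-side NameError surfaced while the client was decoding the server's error response.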

"writers_scheam" is a typo bug :-)

My client code is:

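require "rubygems"
require "json"
require "avro"

# Read the Cassandra Avro protocol definition (JSON text).
avro_protocol = File.read("config/cassandra.avpr")
json_avro = JSON.parse(avro_protocol)
protocol = Avro::Protocol.parse(avro_protocol)
# HTTP transport to the local node.
transport = Avro::IPC::HTTPTransceiver.new("localhost", 9160)
requestor = Avro::IPC::Requestor.new(protocol, transport)

# The keyspace is set first, then the read is issued as a second request.
requestor.request("set_keyspace", "keyspace" => "TMAC")
requestor.request("multiget_slice", "keys"=>...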

Looking in the server code, it seems java is expecting all avro communications 
to be stateful (or some other way of matching threads to callers.)

Thanks,
--Gabor


Re: avro + cassandra + ruby

2010-09-29 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok
 wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift 
> client's read performance is pretty bad (I measured 4x slower than java).
>
> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because 
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the 
> next avro call to multiget_slice runs in a different thread so it can't 
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is 
> a stateless transport. I also tried SocketTransport, but that died with a 
> malloc exception.

Was this exception on the server or in the client? The ruby avro code
is pretty new, so the probability of bugs is pretty high.

-ryan

> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the 
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
>


Re: avro + cassandra + ruby

2010-09-29 Thread Gary Dusbabek
We have a system test that tests this (in avro python).  see
test/system/test_avro_standard.py:TestStandardOperations.test_multiget_slice_simple.

On Wed, Sep 29, 2010 at 01:06, Gabor Torok  wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift 
> client's read performance is pretty bad (I measured 4x slower than java).
>
> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because 
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the 
> next avro call to multiget_slice runs in a different thread so it can't 
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is 
> a stateless transport. I also tried SocketTransport, but that died with a 
> malloc exception.
>
> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the 
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
>


Re: avro + cassandra + ruby

2010-09-28 Thread Gabor Torok
Thanks, that made things better by about 30%. Unfortunately for me that's still 
unacceptable... :-(

I feel like I'm doing something wrong with avro (see my original post). Was 
anyone able to make it work?


Re: avro + cassandra + ruby

2010-09-28 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok
 wrote:
> Hi,
> I'm attempting to use avro to talk to cassandra because the ruby thrift 
> client's read performance is pretty bad (I measured 4x slower than java).

Only 4x feels like a win. :)

One thing you should try is to make thrift use
BinaryProtocolAccelerated, rather than the pure-ruby implementation
(we should change the default).

-ryan

> However, I run into a problem when calling multiget_slice.
> The server gives a KeyspaceNotDefinedException because 
> clientState.getKeyspace() returns null.
> It seems this is because ClientState stores the keyspace in a ThreadLocal.
>
> I call set_keyspace and clientState stores the keyspace value. I guess the 
> next avro call to multiget_slice runs in a different thread so it can't 
> retrieve the value.
>
> In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is 
> a stateless transport. I also tried SocketTransport, but that died with a 
> malloc exception.
>
> Is this a problem with the ruby avro library (I use avro 1.4.0), or how the 
> server handles avro threads?
> Any help would be appreciated!
>
> Thanks,
> --Gabor
>


avro + cassandra + ruby

2010-09-28 Thread Gabor Torok
Hi,
I'm attempting to use avro to talk to cassandra because the ruby thrift 
client's read performance is pretty bad (I measured 4x slower than java).

However, I run into a problem when calling multiget_slice. 
The server gives a KeyspaceNotDefinedException because 
clientState.getKeyspace() returns null.
It seems this is because ClientState stores the keyspace in a ThreadLocal.

I call set_keyspace and clientState stores the keyspace value. I guess the next 
avro call to multiget_slice runs in a different thread so it can't retrieve the 
value.
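
The failure mode guessed at above can be reproduced in miniature without any server. This is a sketch under the assumption that state is kept in a per-thread slot; FakeServer and its KeyspaceNotDefined error are hypothetical stand-ins for illustration, not Cassandra code, and Ruby's Thread.current[] is used here in place of a Java ThreadLocal.

```ruby
# Hypothetical server object that keeps the keyspace in a per-thread slot.
class FakeServer
  class KeyspaceNotDefined < StandardError; end

  def set_keyspace(name)
    Thread.current[:keyspace] = name
  end

  def multiget_slice(keys)
    keyspace = Thread.current[:keyspace]
    raise KeyspaceNotDefined, "no keyspace set on this thread" if keyspace.nil?
    keys.map { |key| "#{keyspace}/#{key}" }
  end
end

server = FakeServer.new

# Both calls handled by the same worker thread: the keyspace is found.
same_thread = Thread.new do
  server.set_keyspace("TMAC")
  server.multiget_slice(%w[a b])
end.value

# Each call handled by a different worker thread: the second thread never
# saw set_keyspace, so the lookup fails.
Thread.new { server.set_keyspace("TMAC") }.join
split_threads = Thread.new do
  begin
    server.multiget_slice(%w[a b])
  rescue FakeServer::KeyspaceNotDefined => e
    e.class.name
  end
end.value

puts "same thread:   #{same_thread.inspect}"
puts "split threads: #{split_threads}"
```
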

In ruby, I use Avro::IPC::HTTPTransceiver as the transport which I believe is a 
stateless transport. I also tried SocketTransport, but that died with a malloc 
exception.

Is this a problem with the ruby avro library (I use avro 1.4.0), or how the 
server handles avro threads?
Any help would be appreciated!

Thanks,
--Gabor