Just to follow this up: I repeated the test with a multi-threaded Java (Hector) 
client and got much better performance - 10,000 rows in just over a second. So 
it looks like client latency was the killer, and I have since read that the 
Ruby Thrift implementation is not the fastest.
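For what it's worth, there is nothing clever in the multi-threaded client - the 
whole point is just to keep many inserts in flight at once (Vitalii's suggestion 
below of issuing async calls over a single connection gets the same effect). A 
rough Java sketch of the pattern follows; insertLogEntry is only a hypothetical 
placeholder for the real client call (a Hector mutator insert or similar), not 
the actual benchmark code:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelWriteSketch {

    // Hypothetical placeholder for a single insert via the client library
    // (e.g. one Hector mutator insert of a log row). Not a real API call.
    static void insertLogEntry(int rowId) {
        // client insert goes here
    }

    public static void main(String[] args) throws InterruptedException {
        final int totalRows = 10000;     // same row count as the benchmark
        final int threads = 16;          // worker count is a guess; tune it
        final AtomicInteger next = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();

        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    int id;
                    // Each worker takes the next row id until all rows are
                    // written, so many inserts are in flight at once rather
                    // than one at a time.
                    while ((id = next.getAndIncrement()) < totalRows) {
                        insertLogEntry(id);
                    }
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
        System.out.printf("Wrote %d rows in %.2f s%n",
                totalRows, (System.nanoTime() - start) / 1e9);
    }
}

With a handful of workers sharing one counter, the cluster sees a steady stream 
of concurrent requests instead of each insert waiting on a full network round 
trip - which is exactly where the serial Ruby loop was losing time.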

On Apr 4, 2012, at 9:11 AM, Jeff Williams wrote:

> On three machines on the same subnet as the two Cassandra nodes.
> 
> On Apr 3, 2012, at 6:40 PM, Collard, David L (Dave) wrote:
> 
>> Where is your client running?
>> 
>> -----Original Message-----
>> From: Jeff Williams [mailto:je...@wherethebitsroam.com] 
>> Sent: Tuesday, April 03, 2012 11:09 AM
>> To: user@cassandra.apache.org
>> Subject: Re: Write performance compared to Postgresql
>> 
>> Vitalii,
>> 
>> Yep, that sounds like a good idea. Do you have any more information about 
>> how you're doing that? Which client?
>> 
>> Because even with 3 concurrent client nodes, my single PostgreSQL server is 
>> still outperforming my 2-node Cassandra cluster, although the gap is 
>> narrowing.
>> 
>> Jeff
>> 
>> On Apr 3, 2012, at 4:08 PM, Vitalii Tymchyshyn wrote:
>> 
>>> Note that having tons of TCP connections is not good. We are using an 
>>> async client to issue multiple calls over a single connection at the same 
>>> time. You can do the same.
>>> 
>>> Best regards, Vitalii Tymchyshyn.
>>> 
>>> 03.04.12 16:18, Jeff Williams wrote:
>>>> OK, so you think the write speed is limited by the client and protocol, 
>>>> rather than the Cassandra backend? This sounds reasonable, and fits with 
>>>> our use case, as we will have several servers writing. However, it is a 
>>>> bit harder to test!
>>>> 
>>>> Jeff
>>>> 
>>>> On Apr 3, 2012, at 1:27 PM, Jake Luciani wrote:
>>>> 
>>>>> Hi Jeff,
>>>>> 
>>>>> Writing serially over one connection will be slower. If you run many 
>>>>> threads hitting the server at once, you will see throughput improve.
>>>>> 
>>>>> Jake
>>>>> 
>>>>> 
>>>>> 
>>>>> On Apr 3, 2012, at 7:08 AM, Jeff Williams <je...@wherethebitsroam.com> 
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am looking at Cassandra for a logging application. We currently log to 
>>>>>> a PostgreSQL database.
>>>>>> 
>>>>>> I set up 2 Cassandra servers for testing. I did a benchmark where I had 
>>>>>> 100 hashes representing log entries, read from a JSON file. I then 
>>>>>> looped over these to do 10,000 log inserts. I repeated the same test, 
>>>>>> writing to a PostgreSQL instance on one of the Cassandra servers. The 
>>>>>> script is attached. The Cassandra writes appear to perform a lot worse. 
>>>>>> Is this expected?
>>>>>> 
>>>>>> jeff@transcoder01:~$ ruby cassandra-bm.rb
>>>>>> cassandra
>>>>>> 3.170000   0.480000   3.650000 ( 12.032212)
>>>>>> jeff@transcoder01:~$ ruby cassandra-bm.rb
>>>>>> postgres
>>>>>> 2.140000   0.330000   2.470000 (  7.002601)
>>>>>> 
>>>>>> Regards,
>>>>>> Jeff
>>>>>> 
>>>>>> <cassandra-bm.rb>
>>> 
>> 
> 
