RE: RE: Very new user needs some troubleshooting pointers

Mark Jones Fri, 09 Apr 2010 09:27:42 -0700

I'm seeing an average write time of 20-30ms/insert between the 60 and 67 
million row points.
(I think at this point I was actually running 80 threads simultaneously: two 
40-thread clients.)

From: Heath Oderman [mailto:[email protected]]
Sent: Friday, April 09, 2010 11:23 AM
To: [email protected]
Subject: Re: RE: Very new user needs some troubleshooting pointers

What's interesting in my case is that I put a timer around the thrift method 
to insert_batch.

Every iteration of that call against debian (any hardware, same network or in 
amazon cloud with windows machine in ec2 as well) takes 400,000 ticks.  Super 
consistent.  One thread.

My friend's setup with cassandra on osx takes 400,000 ticks for the first 
insert, then drops to 20,000 ticks for every subsequent call.
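
For readers comparing these tick counts with the millisecond figures elsewhere in the thread, here is a minimal sketch of the conversion. The message does not say what a tick is; the sketch ASSUMES one tick = 100 ns (the .NET TimeSpan convention), and `ticks_to_ms` and `time_call` are hypothetical helper names, not part of any Cassandra or Thrift client.

```python
import time

# ASSUMPTION: one tick = 100 ns, as in .NET's TimeSpan. The original
# message does not state the tick length.
TICK_NS = 100


def ticks_to_ms(ticks, tick_ns=TICK_NS):
    """Convert a tick count to milliseconds for a given tick length."""
    return ticks * tick_ns / 1_000_000


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed time in ticks).

    Hypothetical stand-in for "a timer around the thrift method";
    fn would be the client's insert call.
    """
    start = time.perf_counter_ns()
    result = fn(*args, **kwargs)
    elapsed_ns = time.perf_counter_ns() - start
    return result, elapsed_ns // TICK_NS


if __name__ == "__main__":
    for ticks in (400_000, 20_000):
        print(ticks, "ticks =", ticks_to_ms(ticks), "ms")
```

Under that assumption the two figures would correspond to 40 ms and 2 ms per call. If the ticks came from a timer with a different resolution, the absolute values change but the 20x ratio between the two setups does not.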

That's what is so strange.
On Apr 9, 2010 12:15 PM, "Mark Jones" 
<[email protected]<mailto:[email protected]>> wrote:
Sounds like we are experiencing some of the same problems. (I'm using 0.6RC1) I 
have a 3 node cluster with 8GB/machine (dual core CPU).  I'm peaking on inserts 
at about 6000-7000/second running 40 threads.  Separate spindles for commitlog 
and data.....
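
As a rough cross-check, the throughput and thread count above can be turned into an implied per-insert latency with Little's law. This is only a sketch: it ASSUMES each client thread issues one synchronous insert at a time with no pause between calls, which the message does not confirm, and the function names are made up for illustration.

```python
def implied_latency_ms(threads, ops_per_sec):
    """Little's law for a closed loop: latency = concurrency / throughput.

    ASSUMPTION: every thread always has exactly one insert in flight.
    """
    return threads / ops_per_sec * 1000.0


def implied_throughput(threads, latency_ms):
    """Inverse relation: throughput = concurrency / latency."""
    return threads / (latency_ms / 1000.0)


if __name__ == "__main__":
    # Figures quoted above: 40 threads peaking at 6000-7000 inserts/second.
    for rate in (6000, 7000):
        print(rate, "inserts/s ->", round(implied_latency_ms(40, rate), 2), "ms/insert")
```

With those assumptions, 40 threads at 6000-7000 inserts/second works out to roughly 6-7 ms per insert as seen by each thread.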

My read speed is atrocious, 800/sec sustained (starts off at 1800+/second and 
falls back to 800/sec).  Of course that is only if I read from the "correct" 
node.  Depending on the moment, 2 of the nodes will return 1-2/second instead 
of 800, and only one node will return 800/second.  And if I spread the reads 
across many nodes, overall performance drops.  nodetool loadbalance can change 
which node is the "golden" node, but I don't know why.  I have doubled the # of 
concurrent read threads and seen some performance improvement (that was the 
last thing I tried, and it eked out another 150/second).

So much about Cassandra makes me WANT it to work, I mean look at the fact that 
all nodes are essentially equal, that it replicates from rack to rack, from DC 
to DC, now, if I could just make it perform.

My machines are basically idle (a large amount of IOWait, but the time is spent 
in the pending queue, vs the device svctime).  So far I've got little insight 
into what could be wrong, I've increased the key cache 10X using JConsole but 
the hit rate is still at times abysmal.

I'm writing 400-800 byte blobs with an 8 byte key (supercolumn) and a 12 byte 
"subkey", then a 5 byte column name, something that would seem to be right up 
Cassandra's alley.

Right now I'm reworking my test to dump it into MySQL on the same machines, so 
I can compare the two for speed, because either I've got crap for hardware, or 
there is something rotten in Denmark.

From: Heath Oderman [mailto:[email protected]<mailto:[email protected]>]
Sent: Friday, April 09, 2010 10:40 AM
To: [email protected]<mailto:[email protected]>
Subject: Re: Very new user needs some troubleshooting pointers

Thanks for the reply Jonathan!

I started with multi threaded tests, but when my performance...
