
We sorted out the performance problems and tuned the cluster. In
particular, we identified the following weak spot in our setup:
ConcurrentReads and ConcurrentWrites were set to their default values,
which were much too low for our setup. Now, we get some serious



On Tue, Feb 15, 2011 at 9:09 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
> Initial thoughts are that you are overloading the cluster. Are there any log
> lines about dropping messages?
> What is the schema, what settings do you have in Cassandra yaml and what are 
> CF stats telling you? E.g. Are you switching Memtables too quickly? What are 
> the write latency numbers?
> Also 0.7 is much faster.
> Aaron
> On 16/02/2011, at 8:59 AM, Thibaut Britz <thibaut.br...@trendiction.com> 
> wrote:
>> Cassandra is very CPU hungry so you might be hitting a CPU bottleneck.
>> What's your CPU usage during these tests?
>> On Tue, Feb 15, 2011 at 8:45 PM, Markus Klems <mar...@klems.eu> wrote:
>>> Hi there,
>>> we are currently benchmarking a Cassandra 0.6.5 cluster with 3
>>> High-Mem Quadruple Extra Large EC2 nodes
>>> (http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
>>> (replication factor is 3, random partitioner). We assigned 32 GB RAM
>>> to the JVM and left 32 GB RAM for the Ubuntu Linux filesystem buffer.
>>> We also set the user count to a very large number via ulimit -u
>>> 999999.
>>> Our goal is to achieve max throughput by increasing YCSB's threadcount
>>> parameter (i.e. the number of parallel benchmarking client threads).
>>> However, this only improves Cassandra throughput for low numbers
>>> of threads. If we move to higher threadcounts, throughput does not
>>> increase and even decreases. Do you have any idea why this is
>>> happening, and any suggestions on how to scale throughput to much
>>> higher numbers? Why is throughput hitting a wall, anyway? And where
>>> does the latency/throughput tradeoff come from?
>>> Here is our YCSB configuration:
>>> recordcount=300000
>>> operationcount=1000000
>>> workload=com.yahoo.ycsb.workloads.CoreWorkload
>>> readallfields=true
>>> readproportion=0.5
>>> updateproportion=0.5
>>> scanproportion=0
>>> insertproportion=0
>>> threadcount= 500
>>> target = 10000
>>> hosts=EC2-1,EC2-2,EC2-3
>>> requestdistribution=uniform
>>> These are typical results for threadcount=1:
>>> Loading workload...
>>> Starting test.
>>>  0 sec: 0 operations;
>>>  10 sec: 11733 operations; 1168.28 current ops/sec; [UPDATE
>>> AverageLatency(ms)=0.64] [READ AverageLatency(ms)=1.03]
>>>  20 sec: 24246 operations; 1251.68 current ops/sec; [UPDATE
>>> AverageLatency(ms)=0.48] [READ AverageLatency(ms)=1.11]
>>> These are typical results for threadcount=10:
>>> 10 sec: 30428 operations; 3029.77 current ops/sec; [UPDATE
>>> AverageLatency(ms)=2.11] [READ AverageLatency(ms)=4.32]
>>>  20 sec: 60838 operations; 3041.91 current ops/sec; [UPDATE
>>> AverageLatency(ms)=2.15] [READ AverageLatency(ms)=4.37]
>>> These are typical results for threadcount=100:
>>> 10 sec: 29070 operations; 2895.42 current ops/sec; [UPDATE
>>> AverageLatency(ms)=20.53] [READ AverageLatency(ms)=44.91]
>>>  20 sec: 53621 operations; 2455.84 current ops/sec; [UPDATE
>>> AverageLatency(ms)=23.11] [READ AverageLatency(ms)=55.39]
>>> These are typical results for threadcount=500:
>>> 10 sec: 30655 operations; 3053.59 current ops/sec; [UPDATE
>>> AverageLatency(ms)=72.71] [READ AverageLatency(ms)=187.19]
>>>  20 sec: 68846 operations; 3814.14 current ops/sec; [UPDATE
>>> AverageLatency(ms)=65.36] [READ AverageLatency(ms)=191.75]
>>> We never measured more than ~6000 ops/sec. Are there ways to tune
>>> Cassandra that we are not aware of? We made some modifications to the
>>> Cassandra 0.6.5 core for experimental reasons, so it's not easy to
>>> switch to 0.7x or 0.8x. However, if this might solve the scaling
>>> issues, we might consider porting our modifications to a newer
>>> Cassandra version...
>>> Thanks,
>>> Markus Klems
>>> Karlsruhe Institute of Technology, Germany
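
On the latency/throughput question in the original message: the quoted
YCSB samples can be checked against Little's law. This is only a rough
sketch, and it assumes a closed-loop benchmark in which every client
thread has exactly one request in flight at a time, so that throughput
is approximately threads divided by mean latency. It also assumes the
mean latency is the 50/50 weighted average of the UPDATE and READ
latencies, following readproportion=0.5 and updateproportion=0.5. The
function name and the samples table are illustrative, not part of YCSB.

```python
# Little's law for a closed-loop benchmark (assumption: one outstanding
# request per client thread): throughput ~= threads / mean latency.

# threadcount -> (UPDATE latency ms, READ latency ms, measured ops/sec),
# taken from the 10-second samples quoted in the original message.
samples = {
    1: (0.64, 1.03, 1168.28),
    10: (2.11, 4.32, 3029.77),
    100: (20.53, 44.91, 2895.42),
    500: (72.71, 187.19, 3053.59),
}


def predicted_ops_per_sec(threads, update_ms, read_ms, read_fraction=0.5):
    """Estimate throughput from concurrency and mean per-operation latency."""
    # Weighted mean latency for the read/update mix, in milliseconds.
    mean_ms = read_fraction * read_ms + (1.0 - read_fraction) * update_ms
    # Convert milliseconds to seconds before dividing.
    return threads / (mean_ms / 1000.0)


for threads, (update_ms, read_ms, measured) in samples.items():
    predicted = predicted_ops_per_sec(threads, update_ms, read_ms)
    print(f"threads={threads:>3}  predicted={predicted:8.1f}  measured={measured:8.1f}")
```

If the assumption holds, then once the cluster is saturated, adding
client threads mostly adds queueing time: latency grows roughly in
proportion to the thread count while throughput stays near its ceiling.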
