Are you using virtual machines to run Cassandra? I've found that performance
in VMs is very poor.

Nicolas Santini


On Thu, Feb 3, 2011 at 11:17 PM, aaron morton <aa...@thelastpickle.com> wrote:

> This page has a guide to setting the initial tokens for the nodes
> http://wiki.apache.org/cassandra/Operations#Ring_management
>
> You can also use the bin/nodetool cfstats command or JConsole to check the
> maximum row size on each node, to see if you have a monster row.
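
The initial tokens mentioned above can be computed with a short script. This is a minimal sketch, assuming the cluster uses RandomPartitioner (token space 0 to 2**127) and that every node should own an equal share of the ring. The function name initial_tokens is hypothetical and is not part of Cassandra.

```python
# Assumption: RandomPartitioner, whose token space runs from 0 to 2**127.
RING_SIZE = 2 ** 127


def initial_tokens(node_count):
    """Return one evenly spaced initial token per node (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Node i gets the token at i/node_count of the way around the ring.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Two nodes, as in the cluster described in this thread.
    for index, token in enumerate(initial_tokens(2)):
        print(f"node {index}: initial_token = {token}")
```

Each printed value would go into the initial token setting of the corresponding node before that node first joins the ring.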
>
> Aaron
>
> On 3/02/2011, at 10:22 PM, abhinav prakash rai wrote:
>
> Hi Peter,
>
> Thanks for your reply.
>
> Our application is multi-threaded, and we are using an 8-core machine. The
> application uses 4 column families, one of which contains rows that are
> huge relative to the rows in the other column families.
>
> The balance in the ring is highly skewed. Can you suggest how we can ensure
> even balancing of the load across the cluster?
>
> The row IDs in one column family are combinations of cell numbers (e.g.
> 9883240354_9885430354), and the other row IDs look like thread_name_12234.
>
> How can we ensure the data is spread evenly across rows?
>
> Thanks & Regards,
> abhinav
>
>
>
>
>
> On Thu, Feb 3, 2011 at 1:46 PM, Peter Schuller <
> peter.schul...@infidyne.com> wrote:
>
>> > The first time, I ran a single instance of Cassandra and my application on
>> > one system (16 GB RAM and 8 cores); the time taken was 480 sec.
>> > When I added one more system (so this time I was running 2 instances of
>> > Cassandra in a cluster) and ran the application from a single client, the
>> > time taken increased to 1000 sec. I also found that the data distribution
>> > was very uneven across the two systems (one system held about 2.5 GB and
>> > the other 140 MB).
>> > Is any configuration required when running Cassandra in a cluster, other
>> > than adding seeds?
>>
>> For starters:
>>
>> (1) Are you spreading your data evenly across rows? Row keys
>> determine where data is placed in the cluster.
>> (2) Is your ring actually balanced? (Check with nodetool ring; with two
>> nodes, each should own about 50%.)
>> (3) Is your test concurrent/multi-threaded? Increasing total time
>> would be expected if you're moving from local traffic only to running
>> against remote machines, if your test is a sequential workload.
>> Adding machines increases aggregate throughput across multiple
>> clients; it won't make individual requests faster (except indirectly
>> of course by avoiding overloaded conditions).
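
Point (1) above can be illustrated with a simplified model of how a row key maps to a node. This is a rough sketch, not Cassandra's actual code. It assumes RandomPartitioner, which derives a token from the MD5 hash of the row key, and it assumes each node owns the range ending at its own token. The names key_token, owner_index, and count_keys_per_node are hypothetical.

```python
import hashlib
from bisect import bisect_left


def key_token(row_key):
    """Approximate a RandomPartitioner token: absolute value of the MD5 digest."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def owner_index(token, sorted_tokens):
    """Index of the first node whose token is >= the key token, wrapping around."""
    position = bisect_left(sorted_tokens, token)
    return position % len(sorted_tokens)


def count_keys_per_node(row_keys, sorted_tokens):
    """Count how many row keys land on each node in this simplified model."""
    counts = [0] * len(sorted_tokens)
    for row_key in row_keys:
        counts[owner_index(key_token(row_key), sorted_tokens)] += 1
    return counts


if __name__ == "__main__":
    # Assumption: a balanced two-node ring with tokens 0 and 2**126.
    node_tokens = [0, 2 ** 126]
    # Hypothetical keys shaped like the ones described in this thread.
    keys = [f"thread_name_{n}" for n in range(10000)]
    print(count_keys_per_node(keys, node_tokens))
```

Note that this counts rows, not bytes. Even with balanced tokens, a single very large row is stored whole on the node that owns its key, so a few huge rows can still skew the load.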
>>
>>
>> --
>> / Peter Schuller
>>
>
>
>
> --
> Regards,
> Abhinav P. Rai
>
>
>
