Are you using virtual machines to run Cassandra? I've found that performance in VMs is crap.
Nicolas Santini

On Thu, Feb 3, 2011 at 11:17 PM, aaron morton <aa...@thelastpickle.com> wrote:

> This page has a guide to setting the initial tokens for the nodes:
> http://wiki.apache.org/cassandra/Operations#Ring_management
>
> You can also use the bin/nodetool cfstats command or JConsole to check
> the maximum row size in each node, to see if you have a monster row.
>
> Aaron
>
> On 3/02/2011, at 10:22 PM, abhinav prakash rai wrote:
>
> Hi Peter,
>
> Thanks for your reply.
>
> Our application is multi-threaded. We are using an 8-core machine. In our
> application we are using 4 column families, one of which contains rows
> that are huge relative to the rows in the other column families.
>
> In the ring the balance is highly skewed. Can you suggest how we can ensure
> even balancing of the load across the cluster?
>
> The row IDs in one column family are a combination of cell numbers (i.e.
> 9883240354_9885430354) and the other row IDs are like thread_name_12234 etc.
>
> How can we ensure the data is spread across rows?
>
> Thanks & Regards,
> abhinav
>
> On Thu, Feb 3, 2011 at 1:46 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:
>
>> > The first time, I ran a single instance of Cassandra and my application
>> > on one system (16GB RAM and 8 cores); the time taken was 480 sec.
>> > When I added one more system (meaning this time I was running 2 instances
>> > of Cassandra in a cluster) and ran the application from a single client,
>> > I found the time taken increased to 1000 sec. I also found that the data
>> > distribution was very odd across the two systems (one system held about
>> > 2.5GB and the other 140MB).
>> > Is any configuration required while running Cassandra in a cluster other
>> > than adding seeds?
>>
>> For starters:
>>
>> (1) Are you spreading your data around evenly across rows? Rows
>> determine where data is placed in the cluster.
>> (2) Is your ring actually balanced? (nodetool ring, they should have
>> 50/50)
>> (3) Is your test concurrent/multi-threaded? Increasing total time
>> would be expected if you're moving from local traffic only to running
>> against remote machines, if your test is a sequential workload.
>> Adding machines increases aggregate throughput across multiple
>> clients; it won't make individual requests faster (except indirectly
>> of course by avoiding overloaded conditions).
>>
>> --
>> / Peter Schuller
>
> --
> Regards,
> Abhinav P. Rai
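
For reference, the "setting the initial tokens" advice quoted above can be sketched as follows. This is only a rough illustration and assumes the cluster uses RandomPartitioner, whose token space runs from 0 to 2**127; the thread does not say which partitioner is in use. The function name balanced_tokens is made up for this example and is not part of Cassandra.

```python
# Sketch: evenly spaced initial tokens for an N-node ring.
# Assumption: RandomPartitioner, with a token space of 0 .. 2**127.
# balanced_tokens is a hypothetical helper, not a Cassandra API.

TOKEN_SPACE = 2 ** 127


def balanced_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [(i * TOKEN_SPACE) // node_count for i in range(node_count)]


if __name__ == "__main__":
    # For the two-node cluster discussed in this thread:
    for index, token in enumerate(balanced_tokens(2)):
        print("node %d: %d" % (index, token))
```

With two nodes this gives tokens 0 and 2**126, so each node owns half of the token space, which is the 50/50 split Peter mentions for nodetool ring. Note that balanced tokens only even out the number of rows per node; a single very large row still lives on one node.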