It should be simpler than that if the wiki is correct.  Search the list 
archives or wiki for the small script that calculates what the token should be 
for each node.  After that it should be as simple as using nodetool move on one 
node at a time.
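
For reference, a minimal sketch of what such a token script might look like. It assumes RandomPartitioner, whose tokens are integers from 0 to 2**127. The cluster in the original question uses OrderPreservingPartitioner, where tokens are key strings, so these numbers would not apply there as-is. The function name balanced_tokens is made up for illustration.

```python
# Sketch only: evenly spaced tokens for a RandomPartitioner ring.
# Assumption: the token space runs from 0 to 2**127 (RandomPartitioner).
# This does NOT apply to OrderPreservingPartitioner, whose tokens are strings.

def balanced_tokens(node_count):
    """Return one evenly spaced integer token per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Print a suggested token for each of 4 nodes.
    for index, token in enumerate(balanced_tokens(4)):
        print(f"node {index}: {token}")
```

Each printed value would then be given to nodetool move, one node at a time.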

On Sep 11, 2010, at 2:00 AM, aaron morton wrote:

See the section on moving or removing nodes at 
http://wiki.apache.org/cassandra/Operations. You should also read the 
bootstrapping section, as that is essentially what you are doing.

AFAIK you should use nodetool move to assign a new token to the nodes. As you 
say, loadbalance is not recommended.

The token for the node represents the end of its token range, i.e. it's 
responsible for the data from the previous node's token to its own. My guess 
of the best approach would be...

I think most of your keys are in the range managed by the 15 node, as the 128 
node has the same load and a much smaller key range. Use get_range_slices to 
have a look at the keys in your db, or use your knowledge of the keys you are 
generating. You need to understand whether you are making lots of keys that 
start with "aaaa". It may help if you provide some more info on the keys.
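
To make the range ownership concrete, here is a rough sketch that maps a sample of keys onto the tokens from the ring output in this thread and counts how many land on each node. It assumes keys and tokens compare as plain strings by code point, which may not match every order-preserving partitioner. The sample keys and the names owner and load_by_node are invented for illustration.

```python
from bisect import bisect_left
from collections import Counter

# Tokens and addresses copied from the ring output in this thread.
RING = {
    "jnGTn7PwLTh6dxmC": "10.202.87.15",
    "kopMmFKwbk1yZFNX": "10.223.71.128",
    "v8w434UBnDIJyrIe": "10.251.190.239",
    "wpt0w4Aomuhb8MQh": "10.201.217.155",
}


def owner(key, ring=RING):
    """Return the node whose range (previous token, own token] holds key."""
    tokens = sorted(ring)
    # The first token >= key owns it; keys past the last token wrap
    # around to the node with the lowest token.
    index = bisect_left(tokens, key)
    return ring[tokens[index % len(tokens)]]


def load_by_node(keys, ring=RING):
    """Count how many of the sampled keys fall on each node."""
    return Counter(owner(key, ring) for key in keys)


if __name__ == "__main__":
    # Hypothetical sample; in practice this would come from get_range_slices.
    sample = ["aaaa1", "aaaa2", "aaaa3", "k000", "m123", "zzz"]
    for node, count in load_by_node(sample).items():
        print(node, count)
```

If nearly every sampled key maps to 10.202.87.15, that would support the guess that the keys cluster in the wrapped range between wpt0w4Aomuhb8MQh and jnGTn7PwLTh6dxmC.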

If it is true that all the keys fall between wpt0w4Aomuhb8MQh and 
jnGTn7PwLTh6dxmC, I would move the 155 node to a token that is about two 
thirds of the way through the keys between those tokens. Then move the 239 
node to a token that is about one third of the way through.
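
One possible way to pick those two tokens from a key sample is sketched below. It assumes the sample is already listed in ring order (keys after wpt0w4Aomuhb8MQh first, then the ones that wrap around up to jnGTn7PwLTh6dxmC) and that the sample is representative of the real data. The name split_points and the sample keys are hypothetical.

```python
def split_points(keys_in_ring_order, parts):
    """Return parts - 1 keys that cut the sample into roughly equal runs.

    A token marks the END of a node's range, so each split point is the
    last key of its run.
    """
    count = len(keys_in_ring_order)
    if parts < 2 or count < parts:
        raise ValueError("need parts >= 2 and at least as many keys as parts")
    return [keys_in_ring_order[i * count // parts - 1] for i in range(1, parts)]


if __name__ == "__main__":
    # Hypothetical sample of nine keys, already in ring order.
    sample = [f"key{i}" for i in range(9)]
    one_third, two_thirds = split_points(sample, 3)
    print("candidate token for the 239 node:", one_third)
    print("candidate token for the 155 node:", two_thirds)
```

The larger and more representative the sample, the closer the resulting ranges should come to an even split.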

I'd let each node's move complete before starting the next; watch the streams 
to see when it's done. Then, when everything is finished and working, run 
nodetool cleanup on each node.

I've not actually done this before, I just wanted to think about the problem :) 
So I'd also wait for one of the adults around here to weigh in.

Aaron



On 11 Sep 2010, at 19:53, maneela a wrote:

We have a Cassandra setup running 4 nodes with ReplicationFactor: 2 and 
OrderPreservingPartitioner as the partitioner, but we have not provided 
InitialToken values.

Could someone suggest the best way to balance my cluster? Some user threads 
have suggested "do not ever run nodetool loadbalance". If that option does 
suit my scenario, on which node should I run the loadbalance command first, 
before running it on the second node?


r...@ip-10-251-190-239:/etc/cassandra# nodetool -h localhost ring
Address         Status  Load       Range             Ring
                                   wpt0w4Aomuhb8MQh
10.202.87.15    Up      119.4 GB   jnGTn7PwLTh6dxmC  |<--|
10.223.71.128   Up      119.82 GB  kopMmFKwbk1yZFNX  |   |
10.251.190.239  Up      2.56 KB    v8w434UBnDIJyrIe  |   |
10.201.217.155  Up      2.56 KB    wpt0w4Aomuhb8MQh  |-->|
r...@ip-10-251-190-239:/etc/cassandra#




