Hi,

Managing `initial_token` yourself gives you more control over scale-in and scale-out. Let's say you have a three-node cluster with `num_tokens: 1`,
and your initial ring looks like:

```
Datacenter: datacenter1
==========
Address    Rack   Status  State   Load       Owns    Token
                                                     3074457345618258602
127.0.0.1  rack1  Up      Normal  98.96 KiB  66.67%  -9223372036854775808
127.0.0.2  rack1  Up      Normal  98.96 KiB  66.67%  -3074457345618258603
127.0.0.3  rack1  Up      Normal  98.96 KiB  66.67%  3074457345618258602
```

Now let's say you want to scale the cluster out to twice the current throughput, meaning you are adding 3 more nodes. If you are using AWS EBS volumes, you can reuse the same volumes and spin up three more nodes whose tokens are the midpoints of the existing ranges, which means your new nodes already have the data. Once you have mounted the volumes on your new nodes:

* Delete every system table except the schema-related tables.
* Generate the system/local table yourself, with `Bootstrap state` set to completed and the schema version the same as on the other existing nodes.
* Remove the extra data on all the machines using cleanup commands.

This is how you can scale out a Cassandra cluster in minutes.

In case you want to add nodes one by one, you need to write a small tool that always finds the biggest range in the existing cluster and splits it in half. I have never tested this thoroughly, but it should work conceptually.

So here we are taking advantage of the fact that we have the volumes (data) for the new node beforehand, so we do not need to bootstrap it.
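The midpoint selection described above can be sketched in Python. This is an illustrative helper, not part of the original procedure; it assumes Murmur3Partitioner's signed 64-bit token range:

```python
# Sketch: compute initial_token values for new nodes placed at the
# midpoint of each existing range (Murmur3Partitioner, signed 64-bit ring).
RING = 2**64          # total ring size
MIN_TOKEN = -2**63    # smallest Murmur3 token

def midpoint(a, b):
    """Token halfway along the clockwise range from a to b."""
    span = (b - a) % RING                       # clockwise distance, handles wraparound
    m = a + span // 2
    return (m - MIN_TOKEN) % RING + MIN_TOKEN   # fold back into the signed range

# The three tokens from the example ring above
existing = [-9223372036854775808, -3074457345618258603, 3074457345618258602]
new_tokens = [midpoint(existing[i], existing[(i + 1) % len(existing)])
              for i in range(len(existing))]
print(new_tokens)
# [-6148914691236517206, -1, 6148914691236517205]
```

Each new node would get one of these values as its `initial_token`, giving a six-node ring where every node owns an equal slice.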
Thanks & Regards,
Varun Barala

On Tue, Oct 2, 2018 at 2:31 PM onmstester onmstester <onmstes...@zoho.com> wrote:

> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
> ---- On Mon, 01 Oct 2018 18:36:03 +0330 *Alain RODRIGUEZ <arodr...@gmail.com>* wrote ----
>
> Hello again :),
>
> I thought a little bit more about this question, and I was actually
> wondering if something like this would work:
>
> Imagine a 3-node cluster, and create the nodes using:
> For the 3 nodes: `num_tokens: 4`
> Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2, 4611686018427387901`
> Node 2: `initial_token: -7686143364045646507, -3074457345618258604, 1537228672809129299, 6148914691236517202`
> Node 3: `initial_token: -6148914691236517206, -1537228672809129303, 3074457345618258600, 7686143364045646503`
>
> If you know the initial size of your cluster, you can calculate the total
> number of tokens: number of nodes * vnodes, and use the formula/Python
> code above to get the tokens. Then use the first token for the first node,
> move to the second node, use the second token, and repeat. In my case there
> is a total of 12 tokens (3 nodes, 4 tokens each).
>
> ```
> >>> number_of_tokens = 12
> >>> [str(((2**64 / number_of_tokens) * i) - 2**63) for i in range(number_of_tokens)]
> ['-9223372036854775808', '-7686143364045646507', '-6148914691236517206',
> '-4611686018427387905', '-3074457345618258604', '-1537228672809129303',
> '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901',
> '6148914691236517202', '7686143364045646503']
> ```
>
> Using manual initial_token (your idea), how could I add a new node to a
> long-running cluster (the procedure)?
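As a side note on the quoted snippet: it relies on Python 2's integer division. Under Python 3, `/` returns a float and would produce wrong tokens, so floor division `//` is needed to reproduce the same list. A quick sketch:

```python
# Evenly spaced Murmur3 tokens for a ring of 12 tokens total
# (3 nodes * num_tokens: 4). Python 3 requires floor division (//)
# to keep the arithmetic in integers.
number_of_tokens = 12
tokens = [(2**64 // number_of_tokens) * i - 2**63
          for i in range(number_of_tokens)]
print(tokens[:3])
# [-9223372036854775808, -7686143364045646507, -6148914691236517206]
```

The resulting list matches the twelve values in the quoted message; tokens are then assigned round-robin across the three nodes as described.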