You can have a look at org.apache.cassandra.service.StorageService
    public void initServer() throws IOException

1. If AutoBootstrap=false, the node is treated as already bootstrapped (not a
new node).
Usually, the first node of a new cluster is set to false.
(1) Check the system table for a saved token. If found, use it;
otherwise,
(2) check the config for InitialToken. If configured, use it; otherwise,
(3) getRandomToken
Please refer to
   org.apache.cassandra.service.StorageService
         public void initServer() throws IOException
and
    org.apache.cassandra.db.SystemTable
         public static synchronized StorageMetadata initMetadata() throws IOException
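
The fallback order in case 1 can be sketched roughly like this. It is only an illustration of the order described above, not the actual code in StorageService or SystemTable; the class name TokenChoice, the method chooseToken and its parameters are made up for the example.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the AutoBootstrap=false token choice.
// None of these names come from the Cassandra source.
class TokenChoice {

    /**
     * Returns the saved token if there is one, else the configured
     * InitialToken, else whatever the random-token supplier produces.
     */
    static String chooseToken(Optional<String> savedToken,
                              Optional<String> initialToken,
                              Supplier<String> randomToken) {
        if (savedToken.isPresent()) {
            return savedToken.get();      // (1) token found in the system table
        }
        if (initialToken.isPresent()) {
            return initialToken.get();    // (2) InitialToken from the config
        }
        return randomToken.get();         // (3) stands in for getRandomToken
    }

    public static void main(String[] args) {
        // A saved token wins even when InitialToken is also configured.
        System.out.println(chooseToken(Optional.of("33"), Optional.of("66"), () -> "random"));
        // With nothing saved, the configured InitialToken is used.
        System.out.println(chooseToken(Optional.empty(), Optional.of("66"), () -> "random"));
    }
}
```

Note that under this order a token already saved in the system table takes precedence over InitialToken in the config.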

2. If AutoBootstrap=true, the node is treated as a new node.
    Usually, the other new nodes are set to AutoBootstrap=true.
(1) If the seed list includes this node itself, go to case 1 above; otherwise,
(2) if the node is already bootstrapped (check system table....), go to case 1
above; otherwise,
(3) get the load information of the other nodes via Gossip; this wait is long.
(4) If InitialToken is configured, use it; otherwise,
(5) find the token of the node with the heaviest load.....
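
As a rough sketch of the decision order in case 2 (again only an illustration, with made-up names such as BootstrapDecision, decide and heaviestNode, and without the Gossip wait of step 3). What the server does after it has found the heaviest node is not covered here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the AutoBootstrap=true decision order.
// The names are invented for this example and are not Cassandra's.
class BootstrapDecision {

    enum Action { FOLLOW_CASE_1, USE_INITIAL_TOKEN, PICK_FROM_HEAVIEST_NODE }

    static Action decide(boolean inSeedList,
                         boolean alreadyBootstrapped,
                         Optional<String> initialToken) {
        if (inSeedList || alreadyBootstrapped) {
            return Action.FOLLOW_CASE_1;          // steps (1) and (2)
        }
        // Step (3), waiting for load information via Gossip, would happen here.
        if (initialToken.isPresent()) {
            return Action.USE_INITIAL_TOKEN;      // step (4)
        }
        return Action.PICK_FROM_HEAVIEST_NODE;    // step (5)
    }

    /** Returns the name of the node reporting the largest load. */
    static String heaviestNode(Map<String, Double> loadByNode) {
        if (loadByNode.isEmpty()) {
            throw new IllegalArgumentException("no load information available");
        }
        String heaviest = null;
        double max = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : loadByNode.entrySet()) {
            if (heaviest == null || entry.getValue() > max) {
                heaviest = entry.getKey();
                max = entry.getValue();
            }
        }
        return heaviest;
    }

    public static void main(String[] args) {
        // Loads in GB, taken from the three-node example quoted below.
        Map<String, Double> load = new HashMap<>();
        load.put("A", 5.0);
        load.put("B", 6.0);
        load.put("C", 2.0);
        System.out.println(decide(false, false, Optional.empty()));
        System.out.println(heaviestNode(load));
    }
}
```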

In my use case, I always configure InitialToken for each node of a new
cluster, so I get good load balance. But when adding a new node to a
running cluster (with a lot of data), I let Cassandra find the token via
load-checking.
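
One common way to pick InitialToken values for a new cluster is to space them evenly around the ring. The sketch below assumes RandomPartitioner with a token space of 0 to 2^127; the class name EvenTokens and the method evenlySpaced are made up for the example, and this is not necessarily how the values above were chosen.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: evenly spaced tokens for a new cluster.
// Assumes RandomPartitioner, whose tokens range from 0 to 2^127.
class EvenTokens {

    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127); // 2^127

    /** Returns token i = i * 2^127 / nodeCount for i = 0 .. nodeCount - 1. */
    static List<BigInteger> evenlySpaced(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            tokens.add(RING_SIZE.multiply(BigInteger.valueOf(i))
                                .divide(BigInteger.valueOf(nodeCount)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One InitialToken per node of a three-node cluster.
        for (BigInteger token : evenlySpaced(3)) {
            System.out.println(token);
        }
    }
}
```

Each printed value would go into the InitialToken setting of one node before its first start.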

Schubert


On Tue, Apr 20, 2010 at 7:48 AM, Anthony Molinaro <
antho...@alumni.caltech.edu> wrote:

>
> On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
> > > Can I then 'nodeprobe move <token for range I want to take over>', and
> > > achieve the same as step 2 above?
> >
> > You can't have two nodes with the same token in the ring at once.  So,
> > you can removetoken the old node first, then bootstrap the new one
> > (just specify InitialToken in the config to avoid having it guess
> > one), or you can make it a 3 step process (bootstrap, remove, move) to
> > avoid transferring so much data around.
>
> So I'm still a little fuzzy for your 3 step case on why less data moves,
> but let me run through the two scenarios and see where we get.  Please
> correct me if I'm wrong on some point.
>
> Let's say I have 3 nodes with random partitioner and rack unaware strategy.
> Which means I have something like
>
> Node  Size   Token  KeyRange (self + next in ring)
> ----  ----   -----  ------------------------------
> A     5 G      33    1 -> 66
> B     6 G      66       34 -> 0
> C     2 G       0          67 -> 33
>
> Now let's say Node B is giving us some problems, so we want to replace it
> with another node D.
>
> We've outlined 2 processes.
>
> In the first process you recommend
>
> 1. removetoken on node B
> 2. wait for data to move
> 3. add InitialToken of 66 and AutoBootstrap = true to node D
> storage-conf.xml
>   then start it
> 4. wait for data to move
>
> So when you do the removetoken, this will cause the following transfers
> at stage 2
>  Node A sends 34->66 to Node C
>  Node C sends 67->0  to Node A
> at stage 4
>  Node A sends 34->66 to Node D
>  Node C sends 67->0  to Node D
>
> In the second process I assume you pick a token really close to another
> token?
>
> 1. add InitialToken of 34 and AutoBootstrap to true to node D
> storage-conf.xml
>   then start it
> 2. wait for data to move
> 3. removetoken on node B
> 4. wait for data to move
> 5. movetoken on node D to 66
> 6. wait for data to move
>
> This results in the following moves
> at stage 2
>  Node A/B sends 33->34 to Node D (primary token range)
>  Node B sends 34->66 to Node D   (replica range)
> at stage 4
>  Node C sends 66->0 to Node D (replica range)
> at stage 6
>  No data movement as D already had 33->0
>
> So seems like you move all the data twice for process 1 and only a small
> portion twice for process 2 (which is what you said, so hopefully I've
> outlined correctly what is happening).  Does all that sound right?
>
> Once I've run bootstrap with the InitialToken value set in the config is
> it then ignored in subsequent restarts, and if so can I just remove it
> after that first time?
>
> Thanks,
>
> -Anthony
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro                           <antho...@alumni.caltech.edu>
>
