[ 
https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201025#comment-17201025
 ] 

Paulo Motta commented on CASSANDRA-8494:
----------------------------------------

Dynamic virtual nodes (CASSANDRA-16141) will make it trivial to support 
incremental bootstrap. The idea is similar to [~rustyrazorblade]'s suggestion in 
[this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-8494?focusedCommentId=14264970&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14264970]:
 a node bootstraps one token at a time and announces to the cluster that the 
token is ready to receive requests before bootstrapping the next token. The 
pseudo-code is available 
[here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-incremental_bootstrap-py].
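
As a rough illustration of the loop described above (not the code from the linked gist), here is a minimal Python sketch. Everything in it is an assumption made for illustration: `BootstrappingNode`, `incremental_bootstrap`, and the `stream_range` / `announce` callbacks are hypothetical names, not Cassandra APIs, and real streaming and gossip are replaced by simple callbacks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BootstrappingNode:
    """Hypothetical stand-in for a joining node; not a Cassandra class."""
    tokens: List[int]                                    # tokens the node will eventually own
    announced: List[int] = field(default_factory=list)   # tokens already serving requests

    def fraction_serving(self) -> float:
        """Share of the node's tokens that are already announced."""
        return len(self.announced) / len(self.tokens)


def incremental_bootstrap(
    node: BootstrappingNode,
    stream_range: Callable[[int], None],  # assumed: streams the data for one token's range
    announce: Callable[[int], None],      # assumed: tells the cluster the token is ready
) -> None:
    """Bootstrap one token at a time, announcing each before starting the next."""
    for token in node.tokens:
        stream_range(token)           # fetch only the data for this token
        node.announced.append(token)  # the node can now serve this token's range
        announce(token)               # announce before moving on to the next token


# Usage: record the order of operations for a node with 256 tokens.
events = []
node = BootstrappingNode(tokens=list(range(256)))
incremental_bootstrap(
    node,
    stream_range=lambda t: events.append(("stream", t)),
    announce=lambda t: events.append(("announce", t)),
)
```

In this sketch the node starts serving after streaming the data for 1 of its 256 tokens, which is under 1% of its final token set, in line with the estimate in the issue description.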

> incremental bootstrap
> ---------------------
>
>                 Key: CASSANDRA-8494
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8494
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/Streaming and Messaging
>            Reporter: Jon Haddad
>            Assignee: Yuki Morishita
>            Priority: Low
>              Labels: dense-storage
>             Fix For: 4.x
>
>
> Current bootstrapping involves (to my knowledge) picking tokens and streaming 
> data before the node is available for requests.  This can be problematic with 
> "fat nodes", since it may require 20TB of data to be streamed over before the 
> machine can be useful.  This can result in a massive window of time before 
> the machine can do anything useful.
> As a potential approach to mitigate the huge window of time before a node is 
> available, I suggest modifying the bootstrap process to only acquire a single 
> initial token before being marked UP.  This would likely be a configuration 
> parameter "incremental_bootstrap" or something similar.
> After the node is bootstrapped with this one token, it could go into UP 
> state, and could then acquire additional tokens (one or a handful at a time), 
> which would be streamed over while the node is active and serving requests.  
> The benefit here is that with the default 256 tokens a node could become an 
> active part of the cluster with less than 1% of its final data streamed over.



