Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by BrandonWilliams:
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=105&rev2=106

  To bootstrap a node, turn !AutoBootstrap on in the configuration file, and start it.

  If you explicitly specify an !InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the node with the most disk space used, that does not already have another node bootstrapping into its Range.
+
+ If you wish to enable vnodes, do not set the !InitialToken, but set the num_tokens parameter. 256 is the recommended setting.

  Important things to note:

   1. You should wait long enough for all the nodes in your cluster to become aware of the bootstrapping node via gossip before starting another bootstrap. The new node will log "Bootstrapping" when this is safe, 2 minutes after starting. (90s to make sure it has accurate load information, and 30s waiting for other nodes to start sending it inserts happening in its to-be-assumed part of the token ring.)
-  1. Relating to point 1, one can only bootstrap N nodes at a time with automatic token picking, where N is the size of the existing cluster. If you need to more than double the size of your cluster, you have to wait for the first N nodes to finish until your cluster is size 2N before bootstrapping more nodes. So if your current cluster is 5 nodes and you want add 7 nodes, bootstrap 5 and let those finish before bootstrapping the last two.
+  1. Relating to point 1, one can only bootstrap N nodes at a time with automatic non-vnode token picking, where N is the size of the existing cluster. If you need to more than double the size of your cluster, you have to wait for the first N nodes to finish until your cluster is size 2N before bootstrapping more nodes. So if your current cluster is 5 nodes and you want add 7 nodes, bootstrap 5 and let those finish before bootstrapping the last two.
   1. As a safety measure, Cassandra does not automatically remove data from nodes that "lose" part of their Token Range to a newly added node. Run `nodetool cleanup` on the source node(s) (neighboring nodes that shared the same subrange) when you are satisfied the new node is up and working. If you do not do this the old data will still be counted against the load on that node and future bootstrap attempts at choosing a location will be thrown off.
+  1. During bootstrap, a node will not bind the Thrift port until finished.
-  1. When bootstrapping a new node, existing nodes have to divide the key space before beginning replication. This can take awhile, so be patient.
-  1. During bootstrap, a node will drop the Thrift port and will not be accessible from `nodetool`.
   1. Bootstrap can take many hours when a lot of data is involved. See [[Streaming]] for how to monitor progress.

  Cassandra is smart enough to transfer data from the nearest source node(s), if your !EndpointSnitch is configured correctly. So, the new node doesn't need to be in the same datacenter as the primary replica for the Range it is bootstrapping into, as long as another replica is in the datacenter with the new one.

- Bootstrap progress can be monitored using `nodetool` with the `netstats` argument (0.7 and later) or `streams` (Cassandra 0.6).
+ Bootstrap progress can be monitored using `nodetool` with the `netstats` argument.

- During bootstrap `nodetool` may report that the new node is not receiving nor sending any streams, this is because the sending node will copy out locally the data they will send to the receiving one, which can be seen in the sending node through the the "AntiCompacting... AntiCompacted" log messages.
+ During bootstrap `nodetool` may report that the new node is not receiving nor sending any streams, in which case is may be building secondary indexes, visible in `compactionstats`

  == Moving or Removing nodes ==
  === Removing nodes entirely ===