[ 
https://issues.apache.org/jira/browse/CASSANDRA-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2435:
--------------------------------------

          Component/s: Core
             Priority: Major  (was: Minor)
    Affects Version/s:     (was: 0.7.2)
                       0.7.0
        Fix Version/s: 0.7.5
             Assignee: Jonathan Ellis

Recall that move (until 0.8) consists of

- unbootstrap
- bootstrap to new location

unbootstrap calls StorageService.leaveRing (same as decommission), which marks 
the node as not-bootstrapped with setBootstrapped(false).

In one of the refactorings during 0.7 development we removed the call to 
setBootstrapped(true) from finishBootstrapping, so the flag is never set back 
once the move completes.  On the next restart the node will indeed 
autobootstrap if that is enabled in the config file.
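
To make the sequence concrete, here is a minimal toy model of the flag 
handling described above. It is a sketch, not Cassandra's actual code: the 
Node class, the restoreFlagOnFinish switch, and wouldAutoBootstrapOnRestart 
are hypothetical names made up for illustration, and the method names only 
mirror the ones mentioned in this comment. Seed handling and streaming are 
not modeled.

```java
// Toy model of the bootstrapped flag during a move; NOT Cassandra's real code.
public class MoveBootstrapSketch {

    /** Hypothetical stand-in for one node's config and persisted state. */
    static class Node {
        final boolean autoBootstrap;       // auto_bootstrap from the config file
        final boolean restoreFlagOnFinish; // hypothetical switch: does finishBootstrapping set the flag?
        boolean bootstrapped = true;       // persisted marker; node has been running for a while

        Node(boolean autoBootstrap, boolean restoreFlagOnFinish) {
            this.autoBootstrap = autoBootstrap;
            this.restoreFlagOnFinish = restoreFlagOnFinish;
        }

        void setBootstrapped(boolean value) {
            bootstrapped = value;
        }

        // Same path as decommission: marks the node as not-bootstrapped.
        void leaveRing() {
            setBootstrapped(false);
        }

        void unbootstrap() {
            leaveRing();
        }

        // Without the setBootstrapped(true) call the flag stays false.
        void finishBootstrapping() {
            if (restoreFlagOnFinish) {
                setBootstrapped(true);
            }
        }

        void bootstrapTo(String newToken) {
            // Data streaming to the new location is omitted in this sketch.
            finishBootstrapping();
        }

        // move (until 0.8) = unbootstrap + bootstrap to new location
        void move(String newToken) {
            unbootstrap();
            bootstrapTo(newToken);
        }

        // Assumed restart check: bootstrap again only if enabled and not yet marked done.
        boolean wouldAutoBootstrapOnRestart() {
            return autoBootstrap && !bootstrapped;
        }
    }

    public static void main(String[] args) {
        Node withoutCall = new Node(true, false);
        withoutCall.move("new-token");
        System.out.println("call removed:  re-bootstrap on restart = "
                + withoutCall.wouldAutoBootstrapOnRestart());

        Node withCall = new Node(true, true);
        withCall.move("new-token");
        System.out.println("call restored: re-bootstrap on restart = "
                + withCall.wouldAutoBootstrapOnRestart());
    }
}
```

In this model, a node with auto_bootstrap disabled never re-bootstraps 
regardless of the flag, which is why only clusters running with it enabled 
would notice the problem.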

> auto bootstrap happened on already bootstrapped nodes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2435
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2435
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Peter Schuller
>            Assignee: Jonathan Ellis
>             Fix For: 0.7.5
>
>
> I believe the following was observed on 0.7.2. I meant to dig deeper, but 
> never had the time, and now I want to at least file this even if I don't have 
> extremely helpful information.
>
> A piece of background is that we consciously made the decision to have the 
> default configuration on nodes have auto_bootstrap set to true. The logic was 
> that if one accidentally were to start a new node, we'd rather have it join 
> with data than join *without* data and cause bogus read results in the 
> cluster.
>
> We executed this policy (by way of having the puppet managed config have 
> auto_bootstrap set to true).
>
> On one of our clusters with 5 nodes, we did some moves. All looked well; the 
> moves completed. For unrelated reasons, we wanted to restart nodes after they 
> had been moved. When we did, three of the 5, specifically those 3 that were 
> *NOT* seed nodes, initiated a bootstrap procedure! Before the moves the 
> cluster had been running for several days at least.
>
> The logs indicated the automatic token selection, and they joined the ring 
> under a new automatically selected token.
>
> Presumably, this violated consistency but at the time there was no live 
> traffic to the cluster and we didn't confirm (put traffic on it after 
> repair+cleanup).
>
> I did look a little bit at the code in light of this but didn't see anything 
> obvious, so I don't really know what the likely culprit is.
>
> A potential complication was that seed nodes were moved without using the 
> correct procedure of de-seeding them first. This was clearly wrong, but it is 
> not obvious to me that it would cause other nodes to incorrectly bootstrap 
> since a node should *never* bootstrap more than once if the local system 
> tables say it's been bootstrapped.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
