[ https://issues.apache.org/jira/browse/CASSANDRA-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018072#comment-13018072 ]
Peter Schuller commented on CASSANDRA-2435: ------------------------------------------- FWIW, looks good to me (but I only did visual inspection and some code jumping in the 0.7 branch; haven't tested it). > auto bootstrap happened on already bootstrapped nodes > ----------------------------------------------------- > > Key: CASSANDRA-2435 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2435 > Project: Cassandra > Issue Type: Bug > Components: Core > Affects Versions: 0.7.0 > Reporter: Peter Schuller > Assignee: Jonathan Ellis > Priority: Critical > Fix For: 0.7.5 > > Attachments: 2435.txt > > > I believe the following was observed on 0.7.2. I meant to dig deeper, but > never had the time, and now I want to at least file this even if I don't have > extremely helpful information. > A piece of background is that we consciously made the decision to have the > default configuration on nodes have auto_bootstrap set to true. The logic was > that if one accidentally were to start a new node, we'd rather have it join > with data than join *without* data and cause bogus read results in the > cluster. > We executed this policy (by way of having the puppet managed config have > auto_bootstrap set to true). > On one of our clusters with 5 nodes, we did some moves. All looked well; the > moves completed. For unrelated reasons, we wanted to restart nodes after they > had been moved. When we did, three of the 5, specifically those 3 that were > *NOT* seed nodes, initiated a bootstrap procedure! Before the moves the > cluster had been running for several days at least. > The logs indicated the automatic token selection, and they joined the ring > under a new automatically selected token. > Presumably, this violated consistency but at the time there was no live > traffic to the cluster and we didn't confirm (put traffic on it after > repair+cleanup). > I did look a little bit at the code in light of this but didn't see anything > obvious, so I don't really know what the likely culprit is. > A potential complication was that seed nodes were moved without using the > correct procedure of de-seeding them first. This was clearly wrong, but it is > not obvious to me that it would cause other nodes to incorrectly bootstrap > since a node should *never* bootstrap more than once if the local system > tables say it's been bootstrapped. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira