[ https://issues.apache.org/jira/browse/CASSANDRA-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Coli updated CASSANDRA-6648:
-----------------------------------
    Since Version: 1.2.14

> Race condition during node bootstrapping
> ----------------------------------------
>
>                 Key: CASSANDRA-6648
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6648
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Sergio Bossa
>            Assignee: Sergio Bossa
>            Priority: Critical
>             Fix For: 1.2.15, 2.0.5
>
>         Attachments: 6648-v2.txt, 6648-v3-1.2.txt, 6648-v3.txt, CASSANDRA-6648.patch
>
>
> When bootstrapping a new node, data is "missing" as if the new node didn't actually bootstrap, which I tracked down to the following scenario:
> 1) The new node joins the token ring and waits for the schema to settle before actually bootstrapping.
> 2) The schema check somehow passes and the node starts bootstrapping.
> 3) Bootstrapping doesn't find the ks/cf that it should have received from the other node.
> 4) Queries at this point cause NPEs until they later "recover", but data is missing.
> The problem seems to be caused by a race condition between the migration manager and the bootstrapper, with the former running after the latter.
> I think this is supposed to protect against such scenarios:
> {noformat}
>             while (!MigrationManager.isReadyForBootstrap())
>             {
>                 setMode(Mode.JOINING, "waiting for schema information to complete", true);
>                 Uninterruptibles.sleepUninterruptibly(1, TimeUnit.SECONDS);
>             }
> {noformat}
> But the MigrationManager.isReadyForBootstrap() implementation is quite fragile and doesn't take into account "slow" schema propagation.
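
For illustration only, here is a minimal sketch of a wait loop that tolerates slow schema propagation. It is not the attached patch and does not use Cassandra's API: the class SchemaAgreementWait, its methods, and the idea of comparing a local schema version UUID against the versions reported by peers are all assumptions made for this example. The point it shows is that "nothing heard from peers yet" is treated as not ready, and that the wait is bounded by a timeout.

```java
import java.util.Collection;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Cassandra.
public class SchemaAgreementWait
{
    /**
     * True only if at least one peer version is known and every known
     * peer version equals the local one. An empty peer set means schema
     * information has not arrived yet, so it counts as "not ready".
     */
    public static boolean schemaAgrees(UUID localVersion, Collection<UUID> peerVersions)
    {
        if (localVersion == null || peerVersions.isEmpty())
            return false;
        for (UUID peer : peerVersions)
        {
            if (!localVersion.equals(peer))
                return false;
        }
        return true;
    }

    /**
     * Polls both suppliers until the versions agree or the timeout
     * elapses. Returns true on agreement, false on timeout or interrupt.
     */
    public static boolean awaitSchemaAgreement(Supplier<UUID> localVersion,
                                               Supplier<Collection<UUID>> peerVersions,
                                               long timeout,
                                               long pollInterval,
                                               TimeUnit unit)
    {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true)
        {
            if (schemaAgrees(localVersion.get(), peerVersions.get()))
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false;
            try
            {
                unit.sleep(pollInterval);
            }
            catch (InterruptedException e)
            {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A caller would pass suppliers that read the current local and peer schema versions from wherever the node tracks them, and would start bootstrapping only when awaitSchemaAgreement returns true. What to do on a timeout (abort, retry, or proceed with a warning) is a separate decision that this sketch leaves open.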