[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319421#comment-17319421 ]

Andres de la Peña commented on CASSANDRA-16577:
-----------------------------------------------

It seems there is a compile error in the patch for 3.0: the new {{MigrationCoordinator#removeVersionInfoForEndpoint}} method should probably take an {{InetAddress}} argument instead of the {{InetAddressAndPort}} used in {{trunk}}.

> Node waits for schema agreement on removed nodes
> ------------------------------------------------
>
>                 Key: CASSANDRA-16577
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16577
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Consistency/Bootstrap and Decommission
>            Reporter: Jan Karlsson
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0, 3.0.x, 3.11.x, 4.0-beta
>
> CASSANDRA-15158 might have introduced a bug where bootstrapping nodes wait
> for schema agreement from nodes that have been removed if token allocation
> for keyspace is enabled.
>
> It is fairly easy to reproduce with the following steps:
> {noformat}
> // Create 3 node cluster
> ccm create test --vnodes -n 3 -s -v 3.11.10
> // Remove two nodes
> ccm node2 decommission
> ccm node3 decommission
> ccm node2 remove
> ccm node3 remove
> // Create keyspace to change the schema. It works if the schema never changes.
> ccm node1 cqlsh -x "CREATE KEYSPACE k WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
> // Add allocate parameter
> ccm updateconf 'allocate_tokens_for_keyspace: k'
> // Add node2 again to cluster
> ccm add node2 -i 127.0.0.2 -j 7200 -r 2200
> ccm node2 start{noformat}
>
> This will cause node2 to throw an exception on startup:
> {noformat}
> WARN [main] 2021-04-08 14:10:53,272 StorageService.java:941 - There are nodes in the cluster with a different schema version than us we did not merged schemas from, our version : (a5da47ec-ffe3-3111-b2f3-325f771f1539), outstanding versions -> endpoints : {8e9ec79e-5ed2-3949-8ac8-794abfee3837=[/127.0.0.3]}
> ERROR [main] 2021-04-08 14:10:53,274 CassandraDaemon.java:803 - Exception encountered during startup
> java.lang.RuntimeException: Didn't receive schemas for all known versions within the timeout
>     at org.apache.cassandra.service.StorageService.waitForSchema(StorageService.java:947) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.dht.BootStrapper.allocateTokens(BootStrapper.java:206) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:177) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1073) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:753) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:687) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:395) [apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:633) [apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:786) [apache-cassandra-3.11.10.jar:3.11.10]
> INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,279 HintsService.java:209 - Paused hints dispatch
> WARN [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 Gossiper.java:1670 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
> INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 MessagingService.java:985 - Waiting for messaging service to quiesce
> INFO [ACCEPT-/127.0.0.2] 2021-04-08 14:10:53,281 MessagingService.java:1346 - MessagingService has terminated the accept() thread
> INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,416 HintsService.java:209 - Paused hints dispatch{noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318333#comment-17318333 ]

Brandon Williams commented on CASSANDRA-16577:
----------------------------------------------

Same [patch|https://github.com/driftx/cassandra/tree/CASSANDRA-16577-3.0] but slightly different location for 3.0/3.11.

[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/603/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/603/pipeline]
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318139#comment-17318139 ]

Brandon Williams commented on CASSANDRA-16577:
----------------------------------------------

[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/601/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/601/pipeline]

Nits addressed, Blake's test added. Will commit if this run is good.
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318136#comment-17318136 ]

Blake Eggleston commented on CASSANDRA-16577:
---------------------------------------------

Fix looks good, although it would be good to add a test to {{MigrationCoordinatorTest}} exercising {{removeVersionInfoForEndpoint}}. I think you could just add a variant of {{versionsAreSignaledWhenDeleted}}:

{code}
/**
 * If an endpoint is removed and no other endpoints are reporting its
 * schema version, the version should be removed, and we should signal
 * anyone waiting on that version.
 */
@Test
public void versionsAreSignaledWhenEndpointsRemoved()
{
    InstrumentedCoordinator coordinator = new InstrumentedCoordinator();

    coordinator.reportEndpointVersion(EP1, V1);
    WaitQueue.Signal signal = coordinator.getVersionInfoUnsafe(V1).register();
    Assert.assertFalse(signal.isSignalled());

    coordinator.removeVersionInfoForEndpoint(EP1);
    Assert.assertNull(coordinator.getVersionInfoUnsafe(V1));

    Assert.assertTrue(signal.isSignalled());
}
{code}
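The behaviour that the suggested test checks can be sketched in isolation as follows. This is a simplified model and not the real {{MigrationCoordinator}}: the {{VersionTracker}} class, its method names, the use of plain strings for endpoints, and the use of a {{CompletableFuture}} in place of {{WaitQueue.Signal}} are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the coordinator's per-version bookkeeping. */
final class VersionTracker {
    /** Endpoints reporting one schema version, plus a future that waiters can block on. */
    static final class VersionInfo {
        final Set<String> endpoints = new HashSet<>();
        final CompletableFuture<Void> resolved = new CompletableFuture<>();
    }

    private final Map<UUID, VersionInfo> versions = new HashMap<>();

    /** Record that an endpoint is reporting the given schema version. */
    synchronized void reportEndpointVersion(String endpoint, UUID version) {
        versions.computeIfAbsent(version, v -> new VersionInfo()).endpoints.add(endpoint);
    }

    /** Returns the tracked info for a version, or null if nobody reports it. */
    synchronized VersionInfo getVersionInfo(UUID version) {
        return versions.get(version);
    }

    /**
     * Forget an endpoint. Any version left with no reporting endpoints is
     * dropped and its waiters are released, so nobody keeps waiting on a
     * version that only a removed node had.
     */
    synchronized void removeEndpoint(String endpoint) {
        Iterator<Map.Entry<UUID, VersionInfo>> it = versions.entrySet().iterator();
        while (it.hasNext()) {
            VersionInfo info = it.next().getValue();
            if (info.endpoints.remove(endpoint) && info.endpoints.isEmpty()) {
                it.remove();
                info.resolved.complete(null);
            }
        }
    }
}
```

In this model, a version that is still reported by at least one other endpoint stays tracked and its waiters are not released, which mirrors the "no other endpoints are reporting its schema version" condition in the test's Javadoc.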
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318121#comment-17318121 ]

Andres de la Peña commented on CASSANDRA-16577:
-----------------------------------------------

Looks good to me assuming CI looks good, +1.
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317992#comment-17317992 ]

Brandon Williams commented on CASSANDRA-16577:
----------------------------------------------

Updated [dtest|https://github.com/driftx/cassandra-dtest/tree/CASSANDRA-16577] and [patch|https://github.com/driftx/cassandra/tree/CASSANDRA-16577].
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317980#comment-17317980 ]

Andres de la Peña commented on CASSANDRA-16577:
-----------------------------------------------

Confirmed that the dtest nicely reproduces the failure for trunk, but I think that for 3.0 and 3.11 it fails earlier than expected because of the {{--force}} option in the call to {{nodetool decommission}}. This option doesn't exist in those branches, since it was added only for 4.0 by CASSANDRA-12510.

Also, we could add a {{@jira_ticket}} reference to this ticket in the docstring for the new dtest.

I'm still looking at the fix itself and the affected dtests.
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317959#comment-17317959 ]

Brandon Williams commented on CASSANDRA-16577:
----------------------------------------------

This broke some dtests, will post a new revision soon, and fix your nit.
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317944#comment-17317944 ]
Jan Karlsson commented on CASSANDRA-16577:
Tried to reproduce with your patch. It worked on both 4.0-rc1-SNAPSHOT and 3.11.11-SNAPSHOT. As for the code, LGTM. One nit would be to include the ignore log message in the exception instead of the warning.
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317513#comment-17317513 ]
Brandon Williams commented on CASSANDRA-16577:
[Patch|https://github.com/driftx/cassandra/tree/CASSANDRA-16577] takes the approach of removing the node from versions when it's removed from the cluster. Also removes what I believe is an errant call in onJoin that should already be handled in the onChange call just above it which filters dead states, where this one does not.
[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/594/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/594/pipeline]
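For readers following along, the approach in the patch comment above (drop a node's entry from the version bookkeeping when it leaves the cluster) can be illustrated with a toy model. This is a minimal Python sketch, not the actual Cassandra code: the class `SchemaVersionTracker` and its methods are hypothetical names invented for illustration, and the version strings are shortened from the log in the issue description.

```python
from collections import defaultdict


class SchemaVersionTracker:
    """Toy stand-in (hypothetical) for per-version endpoint bookkeeping."""

    def __init__(self, local_version):
        self.local_version = local_version
        # schema version -> endpoints last seen advertising that version
        self.endpoints_by_version = defaultdict(set)

    def report_endpoint_version(self, endpoint, version):
        # An endpoint advertises one version at a time, so forget its old one first.
        self.remove_version_info_for_endpoint(endpoint)
        self.endpoints_by_version[version].add(endpoint)

    def remove_version_info_for_endpoint(self, endpoint):
        # Intended to be called when a node is removed from the cluster.
        for version in list(self.endpoints_by_version):
            endpoints = self.endpoints_by_version[version]
            endpoints.discard(endpoint)
            if not endpoints:
                # No endpoint left for this version: stop waiting on it.
                del self.endpoints_by_version[version]

    def outstanding_versions(self):
        # Versions other than ours that a bootstrapping node would wait on.
        return {
            version: sorted(endpoints)
            for version, endpoints in self.endpoints_by_version.items()
            if version != self.local_version
        }


tracker = SchemaVersionTracker(local_version="a5da47ec")
tracker.report_endpoint_version("127.0.0.1", "a5da47ec")
tracker.report_endpoint_version("127.0.0.3", "8e9ec79e")  # node later removed
print(tracker.outstanding_versions())  # {'8e9ec79e': ['127.0.0.3']}

tracker.remove_version_info_for_endpoint("127.0.0.3")
print(tracker.outstanding_versions())  # {}
```

Without the removal step, the stale entry for 127.0.0.3 stays in the map, which is the situation the reported warning ("outstanding versions -> endpoints") describes.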
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317350#comment-17317350 ]
Brandon Williams commented on CASSANDRA-16577:
bq. It's possible the down node has a table with data the new node should replicate
It's possible that node then gets removed, too. Is the correct solution for this ticket to also skip the schema check?
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317316#comment-17317316 ]
Blake Eggleston commented on CASSANDRA-16577:
That's by design. It's possible the down node has a table with data the new node should replicate, and coming up without its schema would lose data. Since C* can't determine if the unreachable schema is newer than the other schemas we have, the operator needs to skip the schema check on bootstrap by setting {{-Dcassandra.skip_schema_check=true}}, which we should include in the log message. It might also be good to add config options for skipping specific schema versions / endpoints.
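The behaviour described in the comment above (fail the bootstrap when schema versions are still outstanding, unless the operator explicitly opts out, and say so in the message) might look roughly like the following. This is a hedged Python sketch only: the function `check_schema_before_bootstrap` and the exact wording of the hint are assumptions made for illustration. Only the flag `-Dcassandra.skip_schema_check=true` and the first sentence of the error text come from this ticket.

```python
# Flag named in the ticket; everything else in this block is hypothetical.
SKIP_FLAG = "-Dcassandra.skip_schema_check=true"


def check_schema_before_bootstrap(outstanding, skip_schema_check=False):
    """Raise if unmerged schema versions remain and the check was not skipped.

    `outstanding` maps a schema version to the endpoints still advertising it.
    """
    if not outstanding or skip_schema_check:
        return
    raise RuntimeError(
        "Didn't receive schemas for all known versions within the timeout; "
        f"outstanding versions -> endpoints: {outstanding}. "
        f"Use {SKIP_FLAG} to skip this check."
    )


outstanding = {"8e9ec79e": ["/127.0.0.3"]}

try:
    check_schema_before_bootstrap(outstanding)
except RuntimeError as error:
    print(error)

# The operator has decided the unreachable schema can be ignored.
check_schema_before_bootstrap(outstanding, skip_schema_check=True)
```

Putting the hint in the raised error rather than only in a preceding warning means the operator sees it in the same place as the failure.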
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317296#comment-17317296 ]
Brandon Williams commented on CASSANDRA-16577:
I started making a patch to remove the endpoints from MigrationCoordinator on node removal, but I think there are larger problems here. At least one other is that MC isn't taking into account whether nodes are up or down: if a schema change is performed with a node down, you can't bootstrap until it comes back up. /cc [~bdeggleston]
[jira] [Commented] (CASSANDRA-16577) Node waits for schema agreement on removed nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317241#comment-17317241 ]
Brandon Williams commented on CASSANDRA-16577:
dtest to repro [here|https://github.com/driftx/cassandra-dtest/tree/CASSANDRA-16577].