[jira] [Commented] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114919#comment-17114919 ]

YCozy commented on CASSANDRA-15795:
---

[~jasonstack] Sorry, I checked my cqlsh history and found that my replication_factor was 2. This time I used your commands, changing only replication_factor to 2. In my cluster, node 1 and node 3 store the data replicas. After stopping node 2 and node 3, I get NoHostAvailable when I run select * from ks.cf on node 1.

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
> Issue Type: Bug
> Reporter: YCozy
> Priority: Normal
>
> I start up a 3-node cluster and write a row with 'replication_factor' : '2'. The consistency level is ONE.
> Then I kill two nodes and try to read the row that I just inserted using cqlsh.
> But cqlsh returns NoHostAvailable.
> I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
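One way to reason about the comment above is to model replica placement on a simplified ring. The sketch below is an illustration under stated assumptions, not a confirmed diagnosis of this ticket: it assumes one token per node and a placement where each range is replicated on its owner plus the next RF-1 nodes clockwise. `replicas_for_range` and `unavailable_ranges` are hypothetical helpers, not Cassandra APIs. Under these assumptions, the row stored on node 1 and node 3 still has a live replica, but a full scan such as select * from ks.cf also has to cover the range whose replicas are node 2 and node 3.

```python
from typing import Dict, List, Set


def replicas_for_range(ring: List[str], owner_index: int, rf: int) -> List[str]:
    """Replicas of the range owned by ring[owner_index]:
    the owner plus the next rf-1 nodes clockwise (assumed placement)."""
    return [ring[(owner_index + i) % len(ring)] for i in range(rf)]


def unavailable_ranges(
    ring: List[str], rf: int, live: Set[str], required: int = 1
) -> Dict[str, List[str]]:
    """Map each range owner to its replicas when fewer than `required`
    replicas of that range are alive (required=1 models CL=ONE)."""
    blocked = {}
    for index, owner in enumerate(ring):
        replicas = replicas_for_range(ring, index, rf)
        alive = sum(1 for node in replicas if node in live)
        if alive < required:
            blocked[owner] = replicas
    return blocked


# Hypothetical 3-node ring; node 2 and node 3 are stopped.
ring = ["node1", "node2", "node3"]
live = {"node1"}

# The range owned by node3 is stored on node3 and node1, as in the comment.
row_replicas = replicas_for_range(ring, ring.index("node3"), rf=2)

# Ranges that a full scan at CL=ONE cannot cover with RF=2.
blocked = unavailable_ranges(ring, rf=2, live=live)
print(row_replicas, blocked)
```

In this model a scan with RF=3 would have no blocked range, because every range includes node 1 as a replica.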
[jira] [Updated] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YCozy updated CASSANDRA-15795:
---
Description:
I start up a 3-node cluster and write a row with 'replication_factor' : '2'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.

was:
I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
> Issue Type: Bug
> Reporter: YCozy
> Priority: Normal
>
> I start up a 3-node cluster and write a row with 'replication_factor' : '2'. The consistency level is ONE.
> Then I kill two nodes and try to read the row that I just inserted using cqlsh.
> But cqlsh returns NoHostAvailable.
> I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.
[jira] [Updated] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YCozy updated CASSANDRA-15795:
---
Description:
I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.

was:
I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
My Cassandra version is 3.11.6.

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip, Consistency/Coordination
> Reporter: YCozy
> Priority: Normal
>
> I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
> Then I kill two nodes and try to read the row that I just inserted using cqlsh.
> But cqlsh returns NoHostAvailable.
> I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.
[jira] [Commented] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101560#comment-17101560 ]

YCozy commented on CASSANDRA-15795:
---

Kishan reported a similar issue in CASSANDRA-11804. The test there is flaky, whereas I can always reproduce this issue. Also, Kishan said 'I was able to repro on C* 3.5 but not in C* 3.6', but I see this bug in 3.11.6. So maybe it's a new issue?

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip, Consistency/Coordination
> Reporter: YCozy
> Priority: Normal
>
> I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
> Then I kill two nodes and try to read the row that I just inserted using cqlsh.
> But cqlsh returns NoHostAvailable.
> My Cassandra version is 3.11.6.
[jira] [Created] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
YCozy created CASSANDRA-15795:
---

Summary: Cannot read data from a 3-node cluster which has two nodes down
Key: CASSANDRA-15795
URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
Project: Cassandra
Issue Type: Bug
Components: Cluster/Gossip, Consistency/Coordination
Reporter: YCozy

I start up a 3-node cluster and write a row with 'replication_factor' : '3'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
My Cassandra version is 3.11.6.
[jira] [Created] (CASSANDRA-15758) ERROR when a disconnected Cassandra node comes back and receives a drop/add column request
YCozy created CASSANDRA-15758:
---

Summary: ERROR when a disconnected Cassandra node comes back and receives a drop/add column request
Key: CASSANDRA-15758
URL: https://issues.apache.org/jira/browse/CASSANDRA-15758
Project: Cassandra
Issue Type: Bug
Reporter: YCozy

We got the following error when we were dropping a column in the table:
{code:java}
ERROR [MigrationStage:1] 2020-04-24 00:07:54,995 SchemaKeyspace.java:1021 - No partition columns found for table ks_name.tbl_name in system_schema.columns. This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, restart cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the following query to cleanup: "DELETE FROM system_schema.tables WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name'; DELETE FROM system_schema.columns WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name';" If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
ERROR [MigrationStage:1] 2020-04-25 15:21:55,716 CassandraDaemon.java:228 - Exception in thread Thread[MigrationStage:1,5,main]
org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: Columns not found in schema table for ks_name.tbl_name
    at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1100) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesOnly(SchemaKeyspace.java:951) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1401) ~[main/:na]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[main/:na]
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51) ~[main/:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_242]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [main/:na]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
{code}
We analyzed the logs and came up with the following theory of what happened:
# We have a cluster of three nodes (C1, C2, C3).
# Right after we start all the nodes, C3 is partitioned away from the others. As a result, neither C1 nor C2 knows that C3 exists.
# A user contacts C1 to create a keyspace "ks_name" and a table "tbl_name". C1 and C2 serve the requests. Since they don't know about C3, they think the schema is consistent across the cluster. Both the keyspace and the table are created successfully without warning.
# The user tries to drop a column in the table. By now C3 has reconnected, and it receives the drop column request from C1 (the coordinator node). However, it knows about neither "ks_name" nor "tbl_name", so it throws the above error.
# If the user tries to add a column instead of dropping one, the same error occurs.

Since network partitions are inevitable in deployed clusters, we think Cassandra should handle such a scenario better.
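The theory above can be replayed with a toy model. This is only a sketch of the reported sequence of events, assuming schema changes are pushed as incremental operations to whichever nodes the coordinator can see. The `Node` class, the `push` helper, and the column names `id`, `v1`, `v2` are hypothetical and are not Cassandra's migration code.

```python
from typing import Dict, List, Set, Tuple


class Node:
    """Toy node that keeps its own copy of the schema."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tables: Dict[Tuple[str, str], Set[str]] = {}

    def create_table(self, keyspace: str, table: str, columns: List[str]) -> None:
        self.tables[(keyspace, table)] = set(columns)

    def drop_column(self, keyspace: str, table: str, column: str) -> None:
        # A node that never saw the CREATE has nothing to alter.
        if (keyspace, table) not in self.tables:
            raise LookupError(
                f"Columns not found in schema table for {keyspace}.{table}"
            )
        self.tables[(keyspace, table)].discard(column)


def push(nodes: List[Node], method: str, *args) -> List[str]:
    """Apply one schema operation to every listed node and
    return the names of the nodes that failed to apply it."""
    failed = []
    for node in nodes:
        try:
            getattr(node, method)(*args)
        except LookupError:
            failed.append(node.name)
    return failed


c1, c2, c3 = Node("C1"), Node("C2"), Node("C3")

# Steps 2-3: C3 is partitioned away, so only C1 and C2 see the CREATE.
create_failures = push([c1, c2], "create_table", "ks_name", "tbl_name", ["id", "v1", "v2"])

# Step 4: C3 has reconnected and receives the DROP without the earlier CREATE.
drop_failures = push([c1, c2, c3], "drop_column", "ks_name", "tbl_name", "v2")
print(create_failures, drop_failures)
```

In this model the failure disappears if the returning node first receives the full schema rather than only the latest change, which is one way to read the request that Cassandra handle the scenario better.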
[jira] [Commented] (CASSANDRA-11804) Read request at proper CL returns ReadTimeout occasionally
[ https://issues.apache.org/jira/browse/CASSANDRA-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066764#comment-17066764 ]

YCozy commented on CASSANDRA-11804:
---

We saw a similar issue in 3.11.5 when querying data using cqlsh. We had a three-node cluster and got the following error when two nodes were partitioned away:
{code:java}
:1:ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}{code}

> Read request at proper CL returns ReadTimeout occasionally
> ---
>
> Key: CASSANDRA-11804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11804
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 3.4 | Ruby-driver 3.0-rc2
> Reporter: Kishan Karunaratne
> Priority: Normal
> Fix For: 3.11.x
>
> I have a 3-node cluster with a keyspace with RF=3, with some data inserted. I'm using a DowngradingConsistency retry policy on the client. Performing a query at CL=ALL with one node blocked (ccm pause/SIGSTOP), the query fails and returns a ReadTimeout to the client as expected. The following is in the debug log:
> {noformat}
> ReadCallback.java:126 - Timed out; received 2 of 3 responses (including data)
> {noformat}
> Now, the driver automatically retries once, and in this case at QUORUM. The client receives a ReadTimeout once more: "Cassandra::Errors::ReadTimeoutError: Operation timed out - received only 1 responses." This ReadTimeout should not have happened because we still have 2 good nodes/hosts to retrieve data from. This is in the debug log:
> {noformat}
> ReadCallback.java:126 - Timed out; received 1 of 2 responses (including data)
> {noformat}
> The weird part is that the query does occasionally succeed (at QUORUM), I'd say < 50% of the time. I did the same experiment with two nodes blocked and I get "Cassandra::Errors::ReadTimeoutError: Operation timed out - received only 0 responses.":
> {noformat}
> ReadCallback.java:126 - Timed out; received 1 of 3 responses (including data)
> ReadCallback.java:126 - Timed out; received 0 of 1 responses
> {noformat}
> I expect this query to have succeeded at ONE, because there is one good node left (and it's the one used as the coordinator node). In both cases, I feel like the coordinator node doesn't count itself as a replica for the CL.
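The "received X of Y responses" figures in the quoted logs follow from how many replica responses each consistency level requires. The sketch below spells out that arithmetic for RF=3 and the three levels mentioned in the report. `required_responses` and `can_satisfy` are hypothetical helper names, and the model ignores multi-datacenter levels.

```python
def required_responses(consistency: str, rf: int) -> int:
    """Number of replica responses needed for a read at the given level."""
    levels = {
        "ONE": 1,
        "QUORUM": rf // 2 + 1,  # majority of the replicas
        "ALL": rf,
    }
    return levels[consistency]


def can_satisfy(consistency: str, rf: int, live_replicas: int) -> bool:
    """True when enough replicas are alive to answer at this level."""
    return live_replicas >= required_responses(consistency, rf)


rf = 3
for level in ("ALL", "QUORUM", "ONE"):
    print(level, required_responses(level, rf))

# One node blocked: two live replicas remain.
print(can_satisfy("ALL", rf, live_replicas=2), can_satisfy("QUORUM", rf, live_replicas=2))

# Two nodes blocked: one live replica remains.
print(can_satisfy("ONE", rf, live_replicas=1))
```

By this count, the downgraded retries at QUORUM with two live replicas and at ONE with one live replica have enough replicas available, which is why both reports treat the timeouts as unexpected.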
[jira] [Commented] (CASSANDRA-15548) Keyspace creation succeeds even though not enough nodes are up
[ https://issues.apache.org/jira/browse/CASSANDRA-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031087#comment-17031087 ]

YCozy commented on CASSANDRA-15548:
---

[~brandon.williams] Thanks for checking in. It would be great if we could at least have some warnings. Since we ask users to specify the RF during keyspace creation, a success without any warning implies that users can store data at the configured RF, which is not the case here.

> Keyspace creation succeeds even though not enough nodes are up
> ---
>
> Key: CASSANDRA-15548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema
> Reporter: YCozy
> Priority: Normal
>
> When testing Cassandra with network partitions, we find that keyspace creation can succeed without any warning even if there are not enough nodes to support the replication factor. Here are the steps to reproduce:
> # Start a cluster with two nodes.
> # Create a keyspace with a replication factor of three.
> # Notice that the creation succeeds without any warning.
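The warning requested in the comment above amounts to comparing the requested replication factor with the number of nodes currently known to be up. The following is a minimal sketch of such a check; `replication_warning` is a hypothetical function and does not describe what Cassandra actually does on keyspace creation.

```python
from typing import Optional


def replication_warning(replication_factor: int, live_nodes: int) -> Optional[str]:
    """Return a warning message when the cluster cannot currently hold
    `replication_factor` replicas, or None when it can."""
    if replication_factor > live_nodes:
        return (
            f"replication_factor {replication_factor} exceeds the "
            f"{live_nodes} node(s) currently up; data cannot be fully replicated"
        )
    return None


# Reproduction steps from the report: two nodes, replication factor of three.
print(replication_warning(replication_factor=3, live_nodes=2))
print(replication_warning(replication_factor=2, live_nodes=2))
```

Returning a message rather than raising keeps the creation itself successful, which matches the comment's request for a warning rather than a rejection.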
[jira] [Updated] (CASSANDRA-15548) Keyspace creation succeeds even though not enough nodes are up
[ https://issues.apache.org/jira/browse/CASSANDRA-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YCozy updated CASSANDRA-15548:
---
Summary: Keyspace creation succeeds even though not enough nodes are up (was: Keyspace creation suceeds even though not enough nodes are up)

> Keyspace creation succeeds even though not enough nodes are up
> ---
>
> Key: CASSANDRA-15548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema
> Reporter: YCozy
> Priority: Normal
>
> When testing Cassandra with network partitions, we find that keyspace creation can succeed without any warning even if there are not enough nodes to support the replication factor. Here are the steps to reproduce:
> # Start a cluster w/ two nodes.
> # Create a keyspace with replication factor of three.
> # Notice that the creation succeeds without any warning.
[jira] [Created] (CASSANDRA-15548) Keyspace creation suceeds even though not enough nodes are up
YCozy created CASSANDRA-15548:
---

Summary: Keyspace creation suceeds even though not enough nodes are up
Key: CASSANDRA-15548
URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
Project: Cassandra
Issue Type: Bug
Components: Cluster/Schema
Reporter: YCozy

When testing Cassandra with network partitions, we find that keyspace creation can succeed without any warning even if there are not enough nodes to support the replication factor. Here are the steps to reproduce:
# Start a cluster w/ two nodes.
# Create a keyspace with replication factor of three.
# Notice that the creation succeeds without any warning.
[jira] [Created] (CASSANDRA-15546) Operation timeout when creating a keyspace/table.
YCozy created CASSANDRA-15546:
---

Summary: Operation timeout when creating a keyspace/table.
Key: CASSANDRA-15546
URL: https://issues.apache.org/jira/browse/CASSANDRA-15546
Project: Cassandra
Issue Type: Bug
Reporter: YCozy

When testing Cassandra with network partitions, we have observed the following failure from time to time:
# Start a three-node cluster, say node1, node2, and node3.
# Partition node3 from node1 and node2.
# Use cqlsh to contact node1 to create a keyspace/table. cqlsh and node1 run on the same host.
# Notice that cqlsh fails with the following error:
{code:java}
:1:OperationTimedOut: errors={'127.0.0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1{code}
[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent
[ https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YCozy updated CASSANDRA-15437: -- Description: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. # Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: {code:java} ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node java.lang.RuntimeException: Unable to stream hints since no live endpoints seen at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748){code} Since I didn't write any data, there is no hint to be sent. In this case, shouldn't the decommission continue? was: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. 
# Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: {code:java} ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node java.lang.RuntimeException: Unable to stream hints since no live endpoints seen at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748){code} Since I didn't write any data, there is no hint to be sent. In this case, should the decommission continue? > Decommission fails with "Unable to stream hints since no live endpoints seen" > even if no hints need to be sent > -- > > Key: CASSANDRA-15437 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15437 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Membership >Reporter: YCozy >Priority: Normal > > Dear Cassandra developers, I was applying fault-injection to test Cassandra > and noticed the following behavior. I think this may be a bug. Please let me > know if I'm missing something. > > Step to reproduce: > # Start a two node cluster (node1 & node2) using {{ccm}}. > # Add another node to the cluster (node3). > # Partition node3 from the other two nodes. > # Try to decommission node3 using {{nodetool decommission}}. 
> # Notice that the decommission failed with the following error log: > > {code:java} > ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 > StorageService.java:4198 - Error while decommissioning node > java.lang.RuntimeException: Unable to stream hints since no live endpoints > seen > at > org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) > at > org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624
[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent
[ https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YCozy updated CASSANDRA-15437: -- Description: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. # Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: {code:java} ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node java.lang.RuntimeException: Unable to stream hints since no live endpoints seen at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748){code} Since I didn't write any data, there is no hint to be sent. In this case, should the decommission continue? was: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. 
# Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: ``` ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node {{java.lang.RuntimeException: Unable to stream hints since no live endpoints seen}} at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) ``` Since I didn't write any data, there is no hint to be sent. In this case, should the decommission continue? > Decommission fails with "Unable to stream hints since no live endpoints seen" > even if no hints need to be sent > -- > > Key: CASSANDRA-15437 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15437 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Membership >Reporter: YCozy >Priority: Normal > > Dear Cassandra developers, I was applying fault-injection to test Cassandra > and noticed the following behavior. I think this may be a bug. Please let me > know if I'm missing something. > > Step to reproduce: > # Start a two node cluster (node1 & node2) using {{ccm}}. > # Add another node to the cluster (node3). > # Partition node3 from the other two nodes. > # Try to decommission node3 using {{nodetool decommission}}. 
> # Notice that the decommission failed with the following error log: > > {code:java} > ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 > StorageService.java:4198 - Error while decommissioning node > java.lang.RuntimeException: Unable to stream hints since no live endpoints > seen > at > org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) > at > org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > io.net
[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent
[ https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YCozy updated CASSANDRA-15437: -- Description: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. # Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: ``` ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node {{java.lang.RuntimeException: Unable to stream hints since no live endpoints seen}} at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) ``` Since I didn't write any data, there is no hint to be sent. In this case, should the decommission continue? was: Dear Cassandra developers, I was applying fault-injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something. Step to reproduce: # Start a two node cluster (node1 & node2) using {{ccm}}. # Add another node to the cluster (node3). # Partition node3 from the other two nodes. 
# Try to decommission node3 using {{nodetool decommission}}. # Notice that the decommission failed with the following error log: {{ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node }} {{java.lang.RuntimeException: Unable to stream hints since no live endpoints seen}} {{ at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)}} {{ at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)}} {{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}} {{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}} {{ at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}} {{ at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}} {{ at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}} {{ at java.lang.Thread.run(Thread.java:748)}} Since I didn't write any data, there is no hint to be sent. In this case, should the decommission continue? > Decommission fails with "Unable to stream hints since no live endpoints seen" > even if no hints need to be sent > -- > > Key: CASSANDRA-15437 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15437 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Membership >Reporter: YCozy >Priority: Normal > > Dear Cassandra developers, I was applying fault-injection to test Cassandra > and noticed the following behavior. I think this may be a bug. Please let me > know if I'm missing something. > > Step to reproduce: > # Start a two node cluster (node1 & node2) using {{ccm}}. > # Add another node to the cluster (node3). > # Partition node3 from the other two nodes. > # Try to decommission node3 using {{nodetool decommission}}. 
> # Notice that the decommission failed with the following error log: > ``` > ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 > StorageService.java:4198 - Error while decommissioning node > {{java.lang.RuntimeException: Unable to stream hints since no live endpoints > seen}} > at > org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281) > at > org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > io.netty
[jira] [Created] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent
YCozy created CASSANDRA-15437:
---

Summary: Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent
Key: CASSANDRA-15437
URL: https://issues.apache.org/jira/browse/CASSANDRA-15437
Project: Cassandra
Issue Type: Bug
Components: Cluster/Membership
Reporter: YCozy

Dear Cassandra developers, I was applying fault injection to test Cassandra and noticed the following behavior. I think this may be a bug. Please let me know if I'm missing something.

Steps to reproduce:
# Start a two-node cluster (node1 & node2) using {{ccm}}.
# Add another node to the cluster (node3).
# Partition node3 from the other two nodes.
# Try to decommission node3 using {{nodetool decommission}}.
# Notice that the decommission fails with the following error log:

{{ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node }}
{{java.lang.RuntimeException: Unable to stream hints since no live endpoints seen}}
{{ at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)}}
{{ at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)}}
{{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
{{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
{{ at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{ at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{ at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}}
{{ at java.lang.Thread.run(Thread.java:748)}}

Since I didn't write any data, there are no hints to be sent. In this case, should the decommission continue?
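The question at the end of the report can be phrased as a guard: only look for a hint stream target when there is something to stream. The sketch below illustrates that ordering in Python. The function names loosely echo the stack trace but are hypothetical, and this is not the Java implementation in StorageService or HintsDispatchExecutor.

```python
from typing import List, Optional


def preferred_hints_stream_target(live_endpoints: List[str]) -> str:
    """Pick a target for hint streaming; fails when no endpoint is alive."""
    if not live_endpoints:
        raise RuntimeError("Unable to stream hints since no live endpoints seen")
    return live_endpoints[0]


def transfer_hints(pending_hints: List[str], live_endpoints: List[str]) -> Optional[str]:
    """Return the chosen target, or None when there is nothing to transfer."""
    if not pending_hints:
        # No hints were ever written, so no target is needed
        # and the decommission could proceed.
        return None
    return preferred_hints_stream_target(live_endpoints)


# Scenario from the report: node3 is partitioned and no data was written.
print(transfer_hints(pending_hints=[], live_endpoints=[]))
```

With pending hints and no live endpoint the sketch still raises, so the guard only changes the empty case the reporter describes.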