[jira] [Commented] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down

2020-05-23 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114919#comment-17114919
 ] 

YCozy commented on CASSANDRA-15795:
---

[~jasonstack] Sorry, I checked my cqlsh history and found that my 
replication_factor was 2.
This time I used your commands, changing only replication_factor to 2.
In my cluster, node 1 and node 3 store the data replicas.
After stopping node 2 and node 3, I get NoHostAvailable when I run select * from 
ks.cf on node 1.
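
To make the expectation in this comment concrete, here is a minimal Python sketch. It is a simplified model and not Cassandra code: the names required_responses and read_is_satisfiable and the node labels are made up for illustration. It only counts the live replicas of a single row against the consistency level, and it does not model how Cassandra routes a full-table select across token ranges.

```python
# Simplified model (an assumption, not Cassandra's implementation):
# a read is satisfiable when enough replicas of the row are alive.

def required_responses(consistency: str, rf: int) -> int:
    """Number of replica responses needed for a consistency level."""
    if consistency == "ONE":
        return 1
    if consistency == "QUORUM":
        return rf // 2 + 1
    if consistency == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {consistency}")


def read_is_satisfiable(replicas: set, live_nodes: set, consistency: str) -> bool:
    """True if the live replicas of a row can meet the consistency level."""
    live_replicas = replicas & live_nodes
    return len(live_replicas) >= required_responses(consistency, len(replicas))


if __name__ == "__main__":
    replicas = {"node1", "node3"}   # RF = 2, as described in the comment
    live = {"node1"}                # node 2 and node 3 are stopped
    print(read_is_satisfiable(replicas, live, "ONE"))
    print(read_is_satisfiable(replicas, live, "QUORUM"))
```

Under this model the row still has one live replica on node 1, which is why a NoHostAvailable result at consistency level ONE looks surprising.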

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
>  Issue Type: Bug
>Reporter: YCozy
>Priority: Normal
>
> I start up a 3 nodes cluster, and write a row with 'replication_factor' : 
> '2'. The consistency level is ONE.
> Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
> But cqlsh returns NoHostAvailable.
> I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down

2020-05-23 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15795:
--
Description: 
I start up a 3 nodes cluster, and write a row with 'replication_factor' : '2'. 
The consistency level is ONE.
Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
But cqlsh returns NoHostAvailable.
I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.

  was:
I start up a 3 nodes cluster, and write a row with 'replication_factor' : '3'. 
The consistency level is ONE.
Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
But cqlsh returns NoHostAvailable.
I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.


> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
>  Issue Type: Bug
>Reporter: YCozy
>Priority: Normal
>
> I start up a 3 nodes cluster, and write a row with 'replication_factor' : 
> '2'. The consistency level is ONE.
> Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
> But cqlsh returns NoHostAvailable.
> I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.






[jira] [Updated] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down

2020-05-08 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15795:
--
Description: 
I start up a 3 nodes cluster, and write a row with 'replication_factor' : '3'. 
The consistency level is ONE.
Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
But cqlsh returns NoHostAvailable.
I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.

  was:
I start up a 3 nodes cluster, and write a row with 'replication_factor' : '3'. 
The consistency level is ONE.
Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
But cqlsh returns NoHostAvailable.
My Cassandra version is 3.11.6.


> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip, Consistency/Coordination
>Reporter: YCozy
>Priority: Normal
>
> I start up a 3 nodes cluster, and write a row with 'replication_factor' : 
> '3'. The consistency level is ONE.
> Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
> But cqlsh returns NoHostAvailable.
> I find this issue in CA 3.11.5, and it can also be exposed in newest 3.11.6.






[jira] [Commented] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down

2020-05-07 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101560#comment-17101560
 ] 

YCozy commented on CASSANDRA-15795:
---

Kishan reported a similar issue in CASSANDRA-11804. The test there is flaky, 
whereas I can reproduce this issue every time.
Kishan also said 'I was able to repro on C* 3.5 but not in C* 3.6', but I see 
this bug in 3.11.6. 
So maybe it's a new issue? 

> Cannot read data from a 3-node cluster which has two nodes down
> ---
>
> Key: CASSANDRA-15795
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip, Consistency/Coordination
>Reporter: YCozy
>Priority: Normal
>
> I start up a 3 nodes cluster, and write a row with 'replication_factor' : 
> '3'. The consistency level is ONE.
> Then I kill two nodes, and try to get the row that I just inserted by cqlsh.
> But cqlsh returns NoHostAvailable.
> My Cassandra version is 3.11.6.






[jira] [Created] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down

2020-05-06 Thread YCozy (Jira)
YCozy created CASSANDRA-15795:
-

 Summary: Cannot read data from a 3-node cluster which has two 
nodes down
 Key: CASSANDRA-15795
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15795
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Gossip, Consistency/Coordination
Reporter: YCozy


I start a 3-node cluster and write a row to a keyspace with 
'replication_factor' : '3'. The consistency level is ONE.
Then I kill two nodes and try to read the row that I just inserted using cqlsh.
But cqlsh returns NoHostAvailable.
My Cassandra version is 3.11.6.






[jira] [Created] (CASSANDRA-15758) ERROR when a disconnected Cassandra node comes back and receives a drop/add column request

2020-04-25 Thread YCozy (Jira)
YCozy created CASSANDRA-15758:
-

 Summary: ERROR when a disconnected Cassandra node comes back and 
receives a drop/add column request
 Key: CASSANDRA-15758
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15758
 Project: Cassandra
  Issue Type: Bug
Reporter: YCozy


We got the following error when we were dropping a column in the table:
{code:java}
ERROR [MigrationStage:1] 2020-04-24 00:07:54,995 SchemaKeyspace.java:1021 - No partition columns found for table ks_name.tbl_name in system_schema.columns.  This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, restart cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the following query to cleanup: "DELETE FROM system_schema.tables WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name'; DELETE FROM system_schema.columns WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name';" If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
ERROR [MigrationStage:1] 2020-04-25 15:21:55,716 CassandraDaemon.java:228 - Exception in thread Thread[MigrationStage:1,5,main]
org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: Columns not found in schema table for ks_name.tbl_name
        at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1100) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesOnly(SchemaKeyspace.java:951) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1401) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[main/:na]
        at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51) ~[main/:na]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_242]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [main/:na]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
{code}
We analyzed the logs and came up with the following theory of what happened:
 # We have a cluster of three nodes (C1, C2, C3).
 # Right after we start all the nodes, C3 is partitioned away from the other 
two. As a result, neither C1 nor C2 knows that C3 exists.
 # The user contacts C1 to create a keyspace "ks_name" and a table "tbl_name". 
C1 and C2 serve the requests. Since they don't know about C3, they think the 
schema is consistent across the cluster. Both the keyspace and the table are 
created successfully without any warning.
 # The user tries to drop a column in the table. C3 has now reconnected and 
receives the drop column request from C1 (the coordinator node). However, it 
knows about neither "ks_name" nor "tbl_name", so it throws the above error.
 # If the user tries to add a column instead of dropping one, the same error 
occurs.

Since network partitions are inevitable in deployed clusters, we think 
Cassandra should handle such a scenario better.
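
The sequence in the theory above can be replayed with a small Python sketch. Everything in it (Node, MissingTable, broadcast, replay_scenario) is hypothetical and only mimics the steps as described; it is not how Cassandra propagates schema changes, and the column names are invented.

```python
class MissingTable(Exception):
    """Raised when a node is asked to alter a table it has never seen."""


class Node:
    def __init__(self, name):
        self.name = name
        self.schema = {}        # (keyspace, table) -> set of column names
        self.reachable = True   # False while the node is partitioned away

    def create_table(self, ks, tbl, columns):
        self.schema[(ks, tbl)] = set(columns)

    def drop_column(self, ks, tbl, column):
        if (ks, tbl) not in self.schema:
            raise MissingTable(f"{self.name}: columns not found for {ks}.{tbl}")
        self.schema[(ks, tbl)].discard(column)


def broadcast(nodes, operation, *args):
    """Apply a schema operation on every reachable node and collect errors."""
    errors = []
    for node in nodes:
        if not node.reachable:
            continue  # the sender does not even know this node exists
        try:
            getattr(node, operation)(*args)
        except MissingTable as exc:
            errors.append(str(exc))
    return errors


def replay_scenario():
    c1, c2, c3 = Node("C1"), Node("C2"), Node("C3")
    cluster = [c1, c2, c3]
    c3.reachable = False    # step 2: C3 is partitioned away
    create_errors = broadcast(
        cluster, "create_table", "ks_name", "tbl_name", ["id", "a", "b"]
    )                       # step 3: succeeds silently on C1 and C2
    c3.reachable = True     # step 4: C3 reconnects
    drop_errors = broadcast(cluster, "drop_column", "ks_name", "tbl_name", "b")
    return create_errors, drop_errors


if __name__ == "__main__":
    print(replay_scenario())
```

In this toy model the create step reports no problem, and only the later drop surfaces the divergence on C3, which matches the order of events in the theory.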






[jira] [Commented] (CASSANDRA-11804) Read request at proper CL returns ReadTimeout occasionally

2020-03-25 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066764#comment-17066764
 ] 

YCozy commented on CASSANDRA-11804:
---

We saw a similar issue in 3.11.5 when querying data using cqlsh. We had a 
three-node cluster and got the following error when two nodes were partitioned away:
{code:java}
:1:ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}{code}

> Read request at proper CL returns ReadTimeout occasionally
> --
>
> Key: CASSANDRA-11804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11804
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 3.4 | Ruby-driver 3.0-rc2
>Reporter: Kishan Karunaratne
>Priority: Normal
> Fix For: 3.11.x
>
>
> I have a 3-node cluster with a keyspace with RF=3, with some data inserted. 
> I'm using a DowngradingConsistency retry policy on the client. Performing a 
> query at CL=ALL with one node blocked (ccm pause/SIGSTOP), the query fails 
> and returns a ReadTimeout to the client as expected. The following is in the 
> debug log:
> {noformat}
> ReadCallback.java:126 - Timed out; received 2 of 3 responses (including data)
> {noformat}
> Now, the driver automatically retries once, and in this case at QUORUM.  The 
> client receives a ReadTimeout once more: 
> "Cassandra::Errors::ReadTimeoutError: Operation timed out - received only 1 
> responses." This ReadTimeout should not have happened because we still have 2 
> good nodes/hosts to retrieve data from. This is in the debug log:
> {noformat}
> ReadCallback.java:126 - Timed out; received 1 of 2 responses (including data)
> {noformat}
> The weird part is that the query does occasionally succeed (at QUORUM), I'd 
> say < 50% of the time. I did the same experiment with two nodes blocked and I 
> get "Cassandra::Errors::ReadTimeoutError: Operation timed out - received only 
> 0 responses.":
> {noformat}
> ReadCallback.java:126 - Timed out; received 1 of 3 responses (including data)
> ReadCallback.java:126 - Timed out; received 0 of 1 responses
> {noformat}
> I expect this query to have succeeded at ONE, because there is one good node 
> left (and it's the one used as the coordinator node). In both cases, I feel 
> like the coordinator node doesn't count itself as a replica for the CL.
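
The expectation in the quoted report can be written down as a short Python sketch. It assumes the usual definition of QUORUM as floor(RF/2) + 1 and uses a hypothetical helper, first_satisfiable, that walks the ALL, QUORUM, ONE chain. It models only the arithmetic, not the driver's retry policy or the coordinator's behavior.

```python
# Responses required per consistency level for a given replication factor.
REQUIRED = {
    "ALL": lambda rf: rf,
    "QUORUM": lambda rf: rf // 2 + 1,
    "ONE": lambda rf: 1,
}


def first_satisfiable(rf, live_replicas, chain=("ALL", "QUORUM", "ONE")):
    """Return the first level in the chain that the live replicas can meet."""
    for level in chain:
        if live_replicas >= REQUIRED[level](rf):
            return level
    return None  # no level in the chain can be met


if __name__ == "__main__":
    print(first_satisfiable(rf=3, live_replicas=2))  # one node blocked
    print(first_satisfiable(rf=3, live_replicas=1))  # two nodes blocked
```

With RF=3 this gives QUORUM when one node is blocked and ONE when two are blocked, which is the outcome the reporter expected.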






[jira] [Commented] (CASSANDRA-15548) Keyspace creation succeeds even though not enough nodes are up

2020-02-05 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031087#comment-17031087
 ] 

YCozy commented on CASSANDRA-15548:
---

[~brandon.williams] Thanks for checking in.

It would be great if we could at least get a warning. Since we ask users to 
specify the RF during keyspace creation, a success without any warning implies 
that users can store data up to the configured RF, which is not the case here.
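
As a rough illustration of the kind of warning meant here, a minimal Python sketch follows. The function replication_warning and its message text are hypothetical and not taken from Cassandra; the sketch only compares the requested RF with the number of nodes the cluster currently knows about.

```python
from typing import Optional


def replication_warning(replication_factor: int, known_nodes: int) -> Optional[str]:
    """Return a warning string when the RF cannot be met, else None."""
    if replication_factor > known_nodes:
        return (
            f"replication_factor {replication_factor} is higher than the "
            f"{known_nodes} node(s) currently in the cluster"
        )
    return None


if __name__ == "__main__":
    # Two-node cluster, keyspace created with RF=3, as in the report.
    print(replication_warning(3, 2))
```

Keyspace creation could still succeed in this design; the point is only that the mismatch is surfaced to the user.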

 

> Keyspace creation succeeds even though not enough nodes are up
> --
>
> Key: CASSANDRA-15548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: YCozy
>Priority: Normal
>
> When testing Cassandra with network partitions, we find that keyspace 
> creation can succeed without any warning even if there are not enough nodes 
> to support the replication factor. Here are the steps to reproduce:
>  # Start a cluster w/ two nodes.
>  # Create a keyspace with replication factor of three.
>  # Notice that the creation succeeds without any warning.






[jira] [Updated] (CASSANDRA-15548) Keyspace creation succeeds even though not enough nodes are up

2020-02-05 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15548:
--
Summary: Keyspace creation succeeds even though not enough nodes are up  
(was: Keyspace creation suceeds even though not enough nodes are up)

> Keyspace creation succeeds even though not enough nodes are up
> --
>
> Key: CASSANDRA-15548
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: YCozy
>Priority: Normal
>
> When testing Cassandra with network partitions, we find that keyspace 
> creation can succeed without any warning even if there are not enough nodes 
> to support the replication factor. Here are the steps to reproduce:
>  # Start a cluster w/ two nodes.
>  # Create a keyspace with replication factor of three.
>  # Notice that the creation succeeds without any warning.






[jira] [Created] (CASSANDRA-15548) Keyspace creation suceeds even though not enough nodes are up

2020-02-05 Thread YCozy (Jira)
YCozy created CASSANDRA-15548:
-

 Summary: Keyspace creation suceeds even though not enough nodes 
are up
 Key: CASSANDRA-15548
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15548
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Schema
Reporter: YCozy


When testing Cassandra with network partitions, we find that keyspace creation 
can succeed without any warning even if there are not enough nodes to support 
the replication factor. Here are the steps to reproduce:
 # Start a cluster w/ two nodes.
 # Create a keyspace with replication factor of three.
 # Notice that the creation succeeds without any warning.






[jira] [Created] (CASSANDRA-15546) Operation timeout when creating a keyspace/table.

2020-02-04 Thread YCozy (Jira)
YCozy created CASSANDRA-15546:
-

 Summary: Operation timeout when creating a keyspace/table.
 Key: CASSANDRA-15546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15546
 Project: Cassandra
  Issue Type: Bug
Reporter: YCozy


When testing Cassandra with network partitions, we have observed the following 
failure from time to time:
 # Start a three-node cluster, say node1, node2, and node3.
 # Partition node3 from node1 and node2.
 # Use cqlsh to contact node1 to create a keyspace/table. cqlsh and node1 
run on the same host.
 # Notice that cqlsh fails with the following error:

{code:java}
:1:OperationTimedOut: errors={'127.0.0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1{code}






[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent

2019-11-25 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15437:
--
Description: 
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

 
{code:java}
ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node 
 java.lang.RuntimeException: Unable to stream hints since no live endpoints seen
  at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
  at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  at java.lang.Thread.run(Thread.java:748){code}
 

Since I didn't write any data, there is no hint to be sent. In this case, 
shouldn't the decommission continue?

  was:
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

 
{code:java}
ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 StorageService.java:4198 - Error while decommissioning node 
 java.lang.RuntimeException: Unable to stream hints since no live endpoints seen
  at org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
  at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  at java.lang.Thread.run(Thread.java:748){code}
 

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?


> Decommission fails with "Unable to stream hints since no live endpoints seen" 
> even if no hints need to be sent
> --
>
> Key: CASSANDRA-15437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15437
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: YCozy
>Priority: Normal
>
> Dear Cassandra developers, I was applying fault-injection to test Cassandra 
> and noticed the following behavior. I think this may be a bug. Please let me 
> know if I'm missing something.
>  
> Step to reproduce:
>  # Start a two node cluster (node1 & node2) using {{ccm}}.
>  # Add another node to the cluster (node3).
>  # Partition node3 from the other two nodes.
>  # Try to decommission node3 using {{nodetool decommission}}.
>  # Notice that the decommission failed with the following error log:
>  
> {code:java}
> ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
> StorageService.java:4198 - Error while decommissioning node 
>  java.lang.RuntimeException: Unable to stream hints since no live endpoints 
> seen
>   at 
> org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624

[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent

2019-11-25 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15437:
--
Description: 
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

 
{code:java}
ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
StorageService.java:4198 - Error while decommissioning node 
 java.lang.RuntimeException: Unable to stream hints since no live endpoints seen
  at 
org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
  at 
org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  at java.lang.Thread.run(Thread.java:748){code}
 

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?

  was:
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

```

ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
StorageService.java:4198 - Error while decommissioning node 
 {{java.lang.RuntimeException: Unable to stream hints since no live endpoints 
seen}}
 at 
org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 at java.lang.Thread.run(Thread.java:748)

```

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?


> Decommission fails with "Unable to stream hints since no live endpoints seen" 
> even if no hints need to be sent
> --
>
> Key: CASSANDRA-15437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15437
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: YCozy
>Priority: Normal
>
> Dear Cassandra developers, I was applying fault-injection to test Cassandra 
> and noticed the following behavior. I think this may be a bug. Please let me 
> know if I'm missing something.
>  
> Step to reproduce:
>  # Start a two node cluster (node1 & node2) using {{ccm}}.
>  # Add another node to the cluster (node3).
>  # Partition node3 from the other two nodes.
>  # Try to decommission node3 using {{nodetool decommission}}.
>  # Notice that the decommission failed with the following error log:
>  
> {code:java}
> ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
> StorageService.java:4198 - Error while decommissioning node 
>  java.lang.RuntimeException: Unable to stream hints since no live endpoints 
> seen
>   at 
> org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at 
> io.net

[jira] [Updated] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent

2019-11-25 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated CASSANDRA-15437:
--
Description: 
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

```

ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
StorageService.java:4198 - Error while decommissioning node 
 {{java.lang.RuntimeException: Unable to stream hints since no live endpoints 
seen}}
 at 
org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
 at 
org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 at java.lang.Thread.run(Thread.java:748)

```

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?

  was:
Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Step to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

{{ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
StorageService.java:4198 - Error while decommissioning node }}
{{java.lang.RuntimeException: Unable to stream hints since no live endpoints 
seen}}
{{ at 
org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)}}
{{ at 
org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)}}
{{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
{{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
{{ at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{ at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{ at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}}
{{ at java.lang.Thread.run(Thread.java:748)}}

 

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?


> Decommission fails with "Unable to stream hints since no live endpoints seen" 
> even if no hints need to be sent
> --
>
> Key: CASSANDRA-15437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15437
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership
>Reporter: YCozy
>Priority: Normal
>
> Dear Cassandra developers, I was applying fault-injection to test Cassandra 
> and noticed the following behavior. I think this may be a bug. Please let me 
> know if I'm missing something.
>  
> Step to reproduce:
>  # Start a two node cluster (node1 & node2) using {{ccm}}.
>  # Add another node to the cluster (node3).
>  # Partition node3 from the other two nodes.
>  # Try to decommission node3 using {{nodetool decommission}}.
>  # Notice that the decommission failed with the following error log:
> ```
> ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
> StorageService.java:4198 - Error while decommissioning node 
>  {{java.lang.RuntimeException: Unable to stream hints since no live endpoints 
> seen}}
>  at 
> org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)
>  at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at 
> io.netty

[jira] [Created] (CASSANDRA-15437) Decommission fails with "Unable to stream hints since no live endpoints seen" even if no hints need to be sent

2019-11-25 Thread YCozy (Jira)
YCozy created CASSANDRA-15437:
-

 Summary: Decommission fails with "Unable to stream hints since no 
live endpoints seen" even if no hints need to be sent
 Key: CASSANDRA-15437
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15437
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Membership
Reporter: YCozy


Dear Cassandra developers, I was applying fault-injection to test Cassandra and 
noticed the following behavior. I think this may be a bug. Please let me know 
if I'm missing something.

 

Steps to reproduce:
 # Start a two node cluster (node1 & node2) using {{ccm}}.
 # Add another node to the cluster (node3).
 # Partition node3 from the other two nodes.
 # Try to decommission node3 using {{nodetool decommission}}.
 # Notice that the decommission failed with the following error log:

{{ERROR [RMI TCP Connection(4)-127.0.0.1] 2019-11-25 22:45:27,716 
StorageService.java:4198 - Error while decommissioning node }}
{{java.lang.RuntimeException: Unable to stream hints since no live endpoints 
seen}}
{{ at 
org.apache.cassandra.service.StorageService.getPreferredHintsStreamTarget(StorageService.java:4281)}}
{{ at 
org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:156)}}
{{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
{{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
{{ at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{ at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{ at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}}
{{ at java.lang.Thread.run(Thread.java:748)}}

 

Since I didn't write any data, there is no hint to be sent. In this case, 
should the decommission continue?
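
The behavior being asked for can be sketched in a few lines of Python. The function names below are hypothetical, and the sketch does not reproduce the logic of StorageService.getPreferredHintsStreamTarget from the stack trace; it only shows a guard that skips target selection when there is nothing to stream. Picking the first endpoint in sorted order is an arbitrary choice made for the example.

```python
def choose_hints_target(live_endpoints):
    """Pick an endpoint to receive hints; fail if none is alive."""
    if not live_endpoints:
        raise RuntimeError("Unable to stream hints since no live endpoints seen")
    return sorted(live_endpoints)[0]  # arbitrary but deterministic choice


def transfer_hints_on_decommission(pending_hints, live_endpoints):
    """Return the chosen target, or None when there are no hints to send."""
    if not pending_hints:
        return None  # nothing to stream, so no live endpoint is needed
    return choose_hints_target(live_endpoints)


if __name__ == "__main__":
    # No data was written, so there are no hints; the partitioned node
    # sees no live endpoints, yet the transfer step is simply skipped.
    print(transfer_hints_on_decommission([], set()))
```

With this guard, the error is raised only when hints actually exist and no live endpoint can take them.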


