[jira] [Commented] (CASSANDRA-8352) Timeout Exception on Node Failure in Remote Data Center

2014-12-09 Thread Amit Singh Chowdhery (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239271#comment-14239271
 ] 

Amit Singh Chowdhery commented on CASSANDRA-8352:
-

We have upgraded to Cassandra 2.0.11 and are still facing the same problem.
Summary:
We have two 3-node clusters in two different DCs. If one or more nodes go 
down in one data center, roughly 5-10% of traffic fails on the other.
CL: LOCAL_QUORUM
RF=3
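
The expectation behind this report can be checked with quick quorum 
arithmetic. The sketch below is illustrative only: it assumes RF=3 applies 
per data center and that every local node holds a replica of the key (true 
for a 3-node DC with RF=3). The function names are hypothetical and are not 
part of Cassandra.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum_satisfiable(local_rf: int, local_nodes_down: int) -> bool:
    """True if enough local replicas remain alive to meet LOCAL_QUORUM.

    Assumption: every local node holds a replica, as in a 3-node DC with RF=3.
    Nodes in the remote DC do not appear in this calculation at all.
    """
    return local_rf - local_nodes_down >= quorum(local_rf)


# Setup from this comment: RF=3, CL=LOCAL_QUORUM.
print(quorum(3))                       # 2
print(local_quorum_satisfiable(3, 0))  # True
print(local_quorum_satisfiable(3, 1))  # True
print(local_quorum_satisfiable(3, 2))  # False
```

By this arithmetic a remote node failure should not reduce the number of 
local replicas available, which is why the observed failures look like a bug.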


> Timeout Exception on Node Failure in Remote Data Center
> ---
>
> Key: CASSANDRA-8352
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8352
> Project: Cassandra
> Issue Type: Bug
> Environment: Unix, Cassandra 2.0.3
> Reporter: Akhtar Hussain
> Labels: DataCenter, GEO-Red
>
> We have a Geo-red setup with 2 data centers of 3 nodes each. When we 
> bring down a single Cassandra node in DC2 with kill -9, 
> reads fail on DC1 with TimedOutException for a brief period (~15-20 
> sec). 
> Questions:
> 1. Why do reads fail on DC1 when a node in another DC, i.e. DC2, fails? 
> As we are using LOCAL_QUORUM for both reads and writes in DC1, a 
> request should return once 2 nodes in the local DC have replied instead of 
> timing out because of a node in the remote DC.
> 2. We want to make sure that no Cassandra requests fail in case of node 
> failures. We tried rapid read protection settings of ALWAYS/99percentile/10ms 
> as described in 
> http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2. 
> None of them helped. How can we ensure zero request failures when a node fails?
> 3. What is the right way to handle HTimedOutException in 
> Hector?
> 4. Please confirm whether we are using public and private hostnames as 
> expected.
> We are using Cassandra 2.0.3.
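
On question 3 in the quoted description, one common client-side pattern is 
to retry an idempotent read a bounded number of times with backoff. The 
sketch below is a generic illustration, not Hector's API: `RequestTimedOut` 
is a hypothetical stand-in for a timeout such as HTimedOutException, and 
`with_retries` and its parameters are assumed names.

```python
import time


class RequestTimedOut(Exception):
    """Hypothetical stand-in for a client-side timeout exception."""


def with_retries(operation, max_attempts=3, base_delay_s=0.05, sleep=time.sleep):
    """Call operation(), retrying on RequestTimedOut with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RequestTimedOut:
            if attempt == max_attempts:
                raise  # out of attempts: surface the timeout to the caller
            sleep(base_delay_s * 2 ** (attempt - 1))


# Usage: a simulated read that times out twice, then succeeds.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RequestTimedOut()
    return "row"


result = with_retries(flaky_read, sleep=lambda _: None)  # skip real sleeping
print(result, calls["n"])  # row 3
```

Retrying is only safe for operations that can be repeated without side 
effects, such as reads. It hides brief timeouts from the caller but does not 
address why they occur.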



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8352) Timeout Exception on Node Failure in Remote Data Center

2014-11-25 Thread Akhtar Hussain (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225828#comment-14225828
 ] 

Akhtar Hussain commented on CASSANDRA-8352:
---

Fine, we will test it on Cassandra 2.0.11 and share the results 
soon. :)






[jira] [Commented] (CASSANDRA-8352) Timeout Exception on Node Failure in Remote Data Center

2014-11-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225819#comment-14225819
 ] 

Jonathan Ellis commented on CASSANDRA-8352:
---

Here's how this works:

You test the new version to make sure it's something we haven't fixed already.  
Then we write a fix for the next new version.

Please don't reopen until you've done that.



