[ 
https://issues.apache.org/jira/browse/CASSANDRA-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217196#comment-13217196
 ] 

Pavel Yaskevich commented on CASSANDRA-3294:
--------------------------------------------

How about we assign each node in the ring a probability of being alive, 
starting from a uniform distribution? On each failure (e.g. an RPC or 
Gossiper communication error) we would decrease that node's probability by a 
constant factor, and on each successful communication we would increase it by 
another constant factor. That would let us sort the sub-group returned by 
SS.getLiveNaturalEndpoints(String, RingPosition) by alive probability and 
pick the endpoint with the highest value. What do you think? 
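
To make the idea concrete, here is a rough sketch of the bookkeeping. It is 
not Cassandra code: the class name AliveScores, its methods, and the 
constants (initial value 0.5, halve on failure, close half of the gap to 1.0 
on success) are all assumptions for illustration. In particular, the 
"increase by a constant factor" step is written as a move toward 1.0 so the 
value stays a valid probability; a plain multiplication would need a cap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Cassandra: tracks a per-endpoint
// "alive" score and sorts candidate endpoints by it.
class AliveScores {
    // Assumed constants; the proposal only says "constant factor".
    static final double INITIAL = 0.5;   // same starting value for every node
    static final double DECAY = 0.5;     // score is multiplied by this on failure
    static final double RECOVERY = 0.5;  // fraction of the gap to 1.0 closed on success

    private final Map<String, Double> scores = new HashMap<>();

    // Endpoints never seen before report the uniform initial value.
    public double get(String endpoint) {
        return scores.getOrDefault(endpoint, INITIAL);
    }

    // Called on an RPC/Gossiper communication error.
    public void onFailure(String endpoint) {
        scores.put(endpoint, get(endpoint) * DECAY);
    }

    // Called on a successful communication; the score stays within [0, 1].
    public void onSuccess(String endpoint) {
        double p = get(endpoint);
        scores.put(endpoint, p + (1.0 - p) * RECOVERY);
    }

    // Returns a copy of the given endpoints, highest score first.
    // The sort is stable, so ties keep their input order.
    public List<String> sortByScore(List<String> endpoints) {
        List<String> sorted = new ArrayList<>(endpoints);
        sorted.sort(Comparator.comparingDouble((String e) -> get(e)).reversed());
        return sorted;
    }
}
```

The list passed to sortByScore would be whatever 
SS.getLiveNaturalEndpoints(String, RingPosition) returns, with endpoints 
represented here as plain strings for brevity.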
                
> a node whose TCP connection is not up should be considered down for the 
> purpose of reads and writes
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3294
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3294
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>
> Cassandra fails to handle the most simple of cases intelligently - a process 
> gets killed and the TCP connection dies. I cannot see a good reason to wait 
> for a bunch of RPC timeouts and thousands of hung requests to realize that we 
> shouldn't be sending messages to a node when the only possible means of 
> communication is confirmed down. This is why one has to "disablegossip and 
> wait for a while" to restart a node on a busy cluster (especially without 
> CASSANDRA-2540 but that only helps under certain circumstances).
> A more generalized approach, whereby one e.g. weighs in the number of 
> currently outstanding RPC requests to a node, would likely take care of this 
> case as well. But until such a thing exists and works well, it seems prudent 
> to have the very common and controlled form of "failure" be handled better.
> Are there difficulties I'm not seeing?
> I can see that one may want to distinguish between considering something 
> "really down" (and e.g. fail a repair because it's down) from what I'm 
> talking about, so maybe there are different concepts (say one is "currently 
> unreachable" rather than "down") being conflated. But in the specific case of 
> sending reads/writes to a node we *know* we cannot talk to, it seems 
> unnecessarily detrimental.


        
