[ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032696#comment-13032696
 ] 

Jonathan Ellis commented on CASSANDRA-2034:
-------------------------------------------

bq. there is the potential to take us back to the Bad Old Days when HH could 
cause cascading failure

To elaborate, the scenario here is that we did a write that succeeded on some 
nodes but not others, so we need to write a local hint to replay to the 
down-or-slow nodes later. But those nodes being down-or-slow means load has 
increased on the rest of the cluster, and writing the extra hint will increase 
it further, possibly enough that other nodes will see this coordinator as 
down-or-slow too, and so on.

So I think what we want to do, with this option on, is to attempt the hint 
write but, if we can't do it in a reasonable time, throw back a 
TimedOutException, which is already our signal that "your cluster may be 
overloaded, you need to back off."

Specifically, we could add a separate executor here, with a blocking, capped 
queue. When we go to do a hint-after-failure, we enqueue the write, but if it 
is rejected because the queue is full we throw the TOE. Otherwise, we wait for 
the write and then return success to the client.
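
As a rough sketch of the executor idea above (not actual Cassandra code): a 
ThreadPoolExecutor over a bounded ArrayBlockingQueue with AbortPolicy rejects 
immediately when the queue is full, and that rejection is turned into a 
timeout-style exception. The names HintWriteSketch, HintTimedOutException and 
writeHint are hypothetical, and the bounded wait on the Future is an assumption 
about what "a reasonable time" could mean. The stand-in exception is unchecked 
only to keep the sketch short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HintWriteSketch {
    // Hypothetical stand-in for TimedOutException.
    public static class HintTimedOutException extends RuntimeException {
        public HintTimedOutException(String msg) { super(msg); }
    }

    private final ThreadPoolExecutor hintExecutor;
    private final long hintWaitMillis;

    public HintWriteSketch(int threads, int queueCapacity, long hintWaitMillis) {
        // Capped queue plus AbortPolicy: submit() throws
        // RejectedExecutionException as soon as the queue is full.
        this.hintExecutor = new ThreadPoolExecutor(
            threads, threads, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            new ThreadPoolExecutor.AbortPolicy());
        this.hintWaitMillis = hintWaitMillis;
    }

    // Enqueue the hint write, then wait for it before reporting success.
    public void writeHint(Runnable hintWrite) {
        Future<?> pending;
        try {
            pending = hintExecutor.submit(hintWrite);
        } catch (RejectedExecutionException e) {
            // Queue full: tell the client to back off.
            throw new HintTimedOutException("hint queue full");
        }
        try {
            pending.get(hintWaitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(false); // do not interrupt a write already running
            throw new HintTimedOutException("hint write too slow");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new HintTimedOutException("interrupted while waiting for hint");
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    public void shutdown() {
        hintExecutor.shutdownNow();
    }
}
```

Here a caller that sees HintTimedOutException would report a timeout to the 
client; a caller that returns normally knows the hint was written.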

The tricky part is that the queue needs to be large enough to handle load 
spikes but small enough that wait-for-success-post-enqueue is negligible 
compared to RpcTimeout. If we had different timeouts for writes than reads 
(which we don't -- CASSANDRA-959) then it might be nice to use, say, 80% of the 
timeout for the normal write, and reserve 20% for the hint phase.
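
To make the split concrete, here is a minimal sketch of the arithmetic. 
WriteTimeoutBudget is a hypothetical helper, not an existing class, and the 
10000 ms figure in the usage line is only an example value.

```java
public class WriteTimeoutBudget {
    public final long writeMillis; // budget for the normal write
    public final long hintMillis;  // remainder reserved for the hint phase

    public WriteTimeoutBudget(long rpcTimeoutMillis, int writePercent) {
        if (rpcTimeoutMillis <= 0 || writePercent <= 0 || writePercent >= 100) {
            throw new IllegalArgumentException("need timeout > 0 and 0 < writePercent < 100");
        }
        // Integer arithmetic; the hint phase gets whatever is left over,
        // so the two budgets always add up to the full timeout.
        this.writeMillis = rpcTimeoutMillis * writePercent / 100;
        this.hintMillis = rpcTimeoutMillis - this.writeMillis;
    }
}
```

For example, new WriteTimeoutBudget(10000, 80) gives 8000 ms for the write and 
2000 ms for the hint phase.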

> Make Read Repair unnecessary when Hinted Handoff is enabled
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-2034
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling 
> HH means RR/AES will have less work to do, but you can't disable RR entirely 
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the 
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
> check the responseHandler write acks and write local hints for any missing 
> targets.
> This would make disabling RR when HH is enabled a much more reasonable 
> option, which has a huge impact on read throughput.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
