[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout

Christian Spriegel (JIRA) Mon, 22 Sep 2014 07:28:44 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143247#comment-14143247
 ]


Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Thanks for your feedback.

[~slebresne], [~kohlisankalp]: I attached a patch for Cassandra 2.1 in which I 
implemented remote failure handling for reads and range-reads.

Using a 3-node ccm cluster, I tested remote and local read failures. Both the 
CLI and CQLSH now return instantly instead of waiting for timeouts.

Any feedback? Could this be merged into 2.1? Please let me know if the patch 
needs improvement.

I guess the next steps would be to implement callbacks for writes, truncates, 
etc.

> TombstoneOverwhelmingException should not wait for timeout
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt
>
>
> *Issue*
> When you have TombstoneOverwhelmingExceptions occurring in queries, this will 
> cause the query to be simply dropped on every data-node, but no response is 
> sent back to the coordinator. Instead the coordinator waits for the specified 
> read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application 
> is waiting for the timeout interval for every request. Therefore, if our 
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later) 
> our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when 
> they run into a TombstoneOverwhelmingException. Then the coordinator does not 
> have to wait for the timeout-interval.
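
To illustrate the proposed behaviour, here is a minimal Java sketch of a 
coordinator-side callback that fails as soon as enough replicas have reported 
an error, rather than waiting for the read timeout. It is not taken from the 
attached patch: the class FailFastReadCallback, its methods and the 
blockFor/totalReplicas parameters are hypothetical names chosen for this 
example, and Cassandra's real messaging classes are not used.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical coordinator-side callback; NOT code from the attached patch.
final class FailFastReadCallback {
    private final int blockFor;       // responses required by the consistency level
    private final int totalReplicas;  // replicas the read was sent to
    private final AtomicInteger successes = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    FailFastReadCallback(int blockFor, int totalReplicas) {
        this.blockFor = blockFor;
        this.totalReplicas = totalReplicas;
    }

    // Called when a replica returns data.
    void onResponse() {
        if (successes.incrementAndGet() >= blockFor) {
            done.complete(null);
        }
    }

    // Called when a replica reports an error, e.g. a TombstoneOverwhelmingException.
    void onFailure(String reason) {
        // If the remaining replicas can no longer satisfy blockFor, give up immediately.
        if (totalReplicas - failures.incrementAndGet() < blockFor) {
            done.completeExceptionally(
                new IllegalStateException("read failed on replica: " + reason));
        }
    }

    // Returns once blockFor responses arrived; throws early on failure,
    // or throws TimeoutException only if replicas stay silent.
    void await(long timeoutMs) throws Exception {
        try {
            done.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            throw (Exception) e.getCause();
        }
    }
}
```

With blockFor = 2 and totalReplicas = 3, two calls to onFailure make await 
throw right away, so the timeout only matters when replicas do not answer at 
all.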



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
