[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout

Sylvain Lebresne (JIRA) Tue, 21 Oct 2014 03:55:52 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178262#comment-14178262
 ]


Sylvain Lebresne commented on CASSANDRA-7886:
---------------------------------------------

bq. I assume you worry about clients not being able to handle the new code

Yes.

bq. In my opinion any client-code that does not have a default-case should be 
punished. So I would not hesitate to add it

Allow me to disagree. Even if drivers have a default case, they will still not 
know what that new exception code is about, so they will likely throw some 
generic "ShouldNotHappen" exception, which almost surely the client hasn't 
taken into account (or at least not in the same way they've taken a timeout exception 
into account, which is what is thrown currently). There's a reason we version 
the protocol and it's so that clients can have the assurance that we don't 
change anything from under them. If we fail at that, we are the ones who 
should be punished.
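
To make that concrete, here is a rough sketch of what a driver with a default case would do. All class and function names are hypothetical, not taken from any real driver, and 0x1300 is only a placeholder for whatever new code would be added; 0x1200 is the existing read timeout code.

```python
class DriverError(Exception):
    """Base class for errors surfaced by this hypothetical driver."""


class ReadTimeout(DriverError):
    """The error applications already expect and handle today."""


class UnexpectedServerError(DriverError):
    """Generic fallback produced by the driver's default case."""


# Error codes this driver version knows about (0x1200 = read timeout).
KNOWN_ERRORS = {0x1200: ReadTimeout}

# Placeholder for a new error code the driver has never heard of (assumption).
NEW_CODE = 0x1300


def decode_error(code, message):
    """Map a server error code to an exception, with a default case."""
    exc_type = KNOWN_ERRORS.get(code)
    if exc_type is None:
        # The default case: the driver cannot know what the code means.
        return UnexpectedServerError("unknown error code 0x%04X: %s" % (code, message))
    return exc_type(message)


def application_read(server_code):
    """Application code written against the existing contract."""
    try:
        raise decode_error(server_code, "read did not complete")
    except ReadTimeout:
        # The only failure mode the application planned for.
        return "retried"
```

Here `application_read(0x1200)` takes the retry path, while `application_read(NEW_CODE)` lets `UnexpectedServerError` escape a handler that was only written for timeouts, even though the driver itself did have a default case.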

bq. I assume with CQL 4 (CASSANDRA-8043) clean error-code handling and additional 
fields will be implemented for read_failures?

Yes, and I'm saying that such handling should be part of the patch (but please 
don't call it "CQL 4" or you'll confuse everyone: it's just version 4 of the 
binary protocol, not of the language).
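
One way such handling could look is to gate the new error on the negotiated protocol version, so that older clients keep seeing the timeout they already handle. This is only a sketch under assumptions: the function name is hypothetical and 0x1300 is an assumed value for the new code, not something decided in this ticket.

```python
READ_TIMEOUT = 0x1200  # existing read timeout error code
READ_FAILURE = 0x1300  # assumed value for the new error code


def error_code_for_client(protocol_version, replica_reported_failure):
    """Pick the error code to send for a read that did not succeed.

    Clients that negotiated protocol version 4 or later get the new code;
    everyone else keeps getting a read timeout, exactly as before.
    """
    if replica_reported_failure and protocol_version >= 4:
        return READ_FAILURE
    return READ_TIMEOUT
```

With this, `error_code_for_client(3, True)` still returns the timeout code, so nothing changes under a client that negotiated version 3.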



> TombstoneOverwhelmingException should not wait for timeout
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt
>
>
> *Issue*
> When you have TombstoneOverwhelmingExceptions occurring in queries, this will 
> cause the query to be simply dropped on every data-node, but no response is 
> sent back to the coordinator. Instead the coordinator waits for the specified 
> read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application 
> is waiting for the timeout interval for every request. Therefore, if our 
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later) 
> our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when 
> they run into a TombstoneOverwhelmingException. Then the coordinator does not 
> have to wait for the timeout-interval.
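
The difference between the two behaviours described in the issue can be sketched as follows. This is a toy model, not Cassandra code: the queue stands in for the coordinator's response channel, and the function name and message shapes are made up for illustration.

```python
import queue


def await_replica(responses, timeout_s):
    """Wait for one replica response, as a coordinator would.

    `responses` is a queue of (kind, payload) tuples, where kind is
    "data" or "error". Returns "ok", "failure" or "timeout".
    """
    try:
        kind, _payload = responses.get(timeout=timeout_s)
    except queue.Empty:
        # The replica dropped the query silently: the full timeout elapses.
        return "timeout"
    # An explicit error message lets the coordinator return immediately.
    return "failure" if kind == "error" else "ok"


# Replica that reports the problem: the coordinator fails fast.
reported = queue.Queue()
reported.put(("error", "tombstone threshold exceeded"))
fast_result = await_replica(reported, timeout_s=0.05)

# Replica that stays silent: the coordinator waits out the timeout.
silent = queue.Queue()
slow_result = await_replica(silent, timeout_s=0.05)
```

In the first case the call returns as soon as the error message is available; in the second it blocks for the whole timeout, which is the behaviour the issue complains about.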



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
