[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178262#comment-14178262 ]
Sylvain Lebresne commented on CASSANDRA-7886:
---------------------------------------------

bq. I assume you worry about clients not being able to handle the new code

Yes.

bq. In my opinion any client-code that does not have a default-case should be punished. So I would not hesitate to add it

Allow me to disagree. Even if drivers have a default case, they will still not know what that new exception code is about, so they will likely throw some generic "ShouldNotHappen" exception, which the client almost surely hasn't taken into account (or at least not in the same way they've taken a timeout exception into account, which is what is thrown currently). There's a reason we version the protocol: it's so that clients can have the assurance that we don't change anything out from under them. If we fail at that, we are the ones who should be punished.

bq. I assume with CQL 4 (CASSANDRA-8043) a clean code handling and additional fields will be implemented for read_failures?

Yes, and I'm saying that such handling should be part of the patch (but please don't call it "CQL 4" or you'll confuse everyone: it's just version 4 of the binary protocol, not of the language).

> TombstoneOverwhelmingException should not wait for timeout
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt
>
>
> *Issue*
> When TombstoneOverwhelmingExceptions occur in queries, the query is simply
> dropped on every data node, but no response is sent back to the coordinator.
> Instead, the coordinator waits for the specified read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application
> is waiting for the timeout interval for every request. Therefore, if our
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later)
> our entire application cluster goes down :-(
>
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when
> they run into a TombstoneOverwhelmingException. Then the coordinator does not
> have to wait for the timeout interval.
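
As a rough illustration of the versioning argument in the comment above, the sketch below picks the error code sent to a client from the protocol version that client negotiated. Clients on older versions keep getting the read-timeout code they already handle, and only clients on version 4 or later see a dedicated read-failure code. The class name, the method name, and the 0x1300 value for the new code are assumptions made for this example; they are not taken from the patch under discussion.

```java
/**
 * Hypothetical sketch: choose the error code for a failed read based on the
 * binary protocol version negotiated with the client.
 */
final class ReadErrorCodes {
    // Read-timeout code that existing clients already know how to handle.
    static final int READ_TIMEOUT = 0x1200;

    // Assumed value for a new, dedicated read-failure code (illustration only).
    static final int READ_FAILURE = 0x1300;

    // The comment above proposes introducing the new handling with protocol version 4.
    static final int FIRST_VERSION_WITH_READ_FAILURE = 4;

    private ReadErrorCodes() {}

    /**
     * Returns the code to send for a read that failed on the replicas.
     * Older clients never see a code that their protocol version does not define.
     */
    static int codeForFailedRead(int clientProtocolVersion) {
        return clientProtocolVersion >= FIRST_VERSION_WITH_READ_FAILURE
                ? READ_FAILURE
                : READ_TIMEOUT;
    }
}
```

With this shape, the coordinator can still stop waiting as soon as a replica reports the failure; the version check only affects how that failure is reported to the client.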