[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261460#comment-14261460 ]
Tyler Hobbs commented on CASSANDRA-7886:
----------------------------------------

bq. Regarding TOE: Currently I throw TOEs as exceptions and they get logged just like any other exception. I am not sure if this is desirable and would like to hear your feedback. I think we have the following options:

bq. Leave as it is in v5, meaning TOEs get logged with stacktraces.

Hmm, I forgot that with the previous setup, we wouldn't have stacktraces logged for TOEs under normal circumstances.

bq. Add catch blocks where necessary and log it in a user-friendly way. But it might be in many places. Also in this case I would prefer making TOE a checked exception. Imho TOE should not be unchecked.

I believe TOEs should remain unchecked. They are closer in nature to an IOError than to something that calling methods should explicitly account for. Making them checked would also add a lot of noise to the entire read path.

bq. Add TOE logging to C* default exception handler. (I did not investigate yet, but I assume there is an exception handler)

We do have an unhandled exception handler (in {{CassandraDaemon}}), but I'm not sure that's the best solution either. It might be okay to suppress stacktraces for TOEs on the normal read path, but in unexpected cases (like, say, dealing with hints or other system tables internally) we would want to see the stacktrace. Unfortunately we can't reliably distinguish the two at this level.

bq. Leave it as it was before

I think it's a toss-up between this (catching TOEs in a few places and suppressing) and always allowing stacktraces to be logged for TombstoneOverwhelmingExceptions.
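
To make the trade-off concrete, here is a rough sketch of the "catch in a few places and suppress" option. It is not Cassandra code: {{ToeLoggingSketch}}, {{runClientRead}}, {{runInternalTask}}, the in-memory {{LOG}} list and the nested exception class are all hypothetical stand-ins invented for illustration. It assumes the exception stays unchecked, that the normal read path catches it and logs one line, and that any other path lets it reach a catch-all handler that logs the full stacktrace.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

final class ToeLoggingSketch {
    /** Hypothetical stand-in for the real exception; unchecked, so callers need no throws clause. */
    static final class TombstoneOverwhelmingException extends RuntimeException {
        TombstoneOverwhelmingException(String message) {
            super(message);
        }
    }

    /** In-memory log (an assumption) so the example needs no logging framework. */
    static final List<String> LOG = new ArrayList<>();

    /** Normal read path: a TOE is expected here, so log one line without a stacktrace. */
    static boolean runClientRead(Runnable read) {
        try {
            read.run();
            return true;
        } catch (TombstoneOverwhelmingException e) {
            LOG.add("WARN read aborted: " + e.getMessage());
            return false;
        }
    }

    /** Internal path (e.g. hints): nothing is caught locally, so the catch-all logs the stacktrace. */
    static void runInternalTask(Runnable task) {
        try {
            task.run();
        } catch (RuntimeException e) {
            StringWriter trace = new StringWriter();
            e.printStackTrace(new PrintWriter(trace, true));
            LOG.add("ERROR unhandled exception: " + trace);
        }
    }

    public static void main(String[] args) {
        Runnable overwhelmed = () -> {
            throw new TombstoneOverwhelmingException("scanned too many tombstones");
        };
        runClientRead(overwhelmed);   // logged as a single WARN line
        runInternalTask(overwhelmed); // logged with the full stacktrace
        LOG.forEach(System.out::println);
    }
}
```

The point of the sketch is that the decision to suppress the stacktrace is made at the call site, where the context is known, rather than in a global handler that cannot tell a client read from an internal one.
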

> Coordinator should not wait for read timeouts when replicas hit Exceptions
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>              Labels: protocolv4
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt,
> 7886_v4_trunk.txt, 7886_v5_trunk.txt
>
>
> *Issue*
> When you have TombstoneOverwhelmingExceptions occurring in queries, this will
> cause the query to be simply dropped on every data-node, but no response is
> sent back to the coordinator. Instead the coordinator waits for the specified
> read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application
> is waiting for the timeout interval for every request. Therefore, if our
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later)
> our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when
> they run into a TombstoneOverwhelmingException. Then the coordinator does not
> have to wait for the timeout-interval.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)