[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246592#comment-14246592 ]
Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

Hi [~thobbs],

sorry I kept you waiting for so long.

{quote}Instead of using Unavailable when the protocol version is less than 4, use ReadTimeout. Unavailable signals that some of the replicas are considered to be down, which is not the case here. Plus, ReadTimeout is the error that is currently returned in these circumstances.{quote}
Makes sense. I changed Unavailable to ReadTimeout for CQL3 and Thrift.

{quote}In ErrorMessage.encodedSize(), there's some commented out code for READ_FAILURE handling.{quote}
The commented code was meant as preparation for WriteFailureExceptions. Does it perhaps make sense to fully add WriteFailureException? As a follow-up ticket, we could then implement it for the different writes. Or do you want me to get rid of it?

{quote}Instead of catching and ignoring TombstoneOverwhelmingException in multiple places, I suggest you move the logged error message into the TOE message and let it propagate (and be logged) like any other exception.{quote}
Just to make sure that we don't touch anything new here: TOEs are already logged inside SliceQueryFilter.collectReducedColumns. I simply took this catch block from the ReadVerbHandler/RangeSliceVerbHandler and put it into StorageProxy/MessageDeliveryTask. I don't like that either, but I did not want to touch it. Do you still want me to change it?

{quote}Can you update docs/native_protocol_v4.spec with these changes? You can look at the previous specs to see examples of the "changes from the previous version" section{quote}
Ok. Should we also add WriteFailures?

{quote}In StorageProxy, the unavailables counter should not be incremented for read failures. I suggest creating a new, separate failure counter.{quote}
Done.

{quote}Also in StorageProxy, there's now quite a bit of code duplication around building error messages for ReadTimeoutExceptions and ReadFailureExceptions. Can you condense those somewhat?{quote}
I merged ReadTimeoutException|ReadFailureException into a single catch block.

I also added the last cell-name to the TOE, so that an administrator can get an estimate of where to look for the tombstones. This doesn't really match the ticket's new name, but is related to my original issue :-)

Overall, one question remains from my side: Should I also prepare WriteFailureExceptions? I could (as a follow-up ticket) add these to the write-codepath.

> Coordinator should not wait for read timeouts when replicas hit Exceptions
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>              Labels: protocolv4
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt
>
>
> *Issue*
> When you have TombstoneOverwhelmingExceptions occurring in queries, this will
> cause the query to be simply dropped on every data-node, but no response is
> sent back to the coordinator. Instead the coordinator waits for the specified
> read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application
> is waiting for the timeout interval for every request. Therefore, if our
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later)
> our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when
> they run into a TombstoneOverwhelmingException. Then the coordinator does not
> have to wait for the timeout-interval.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)