[ 
https://issues.apache.org/jira/browse/CASSANDRA-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-16616:
---------------------------------------
    Reviewers: Benjamin Lerer

> Harden internode message resource limit accounting against serialization 
> failures
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16616
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Internode
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>             Fix For: 4.0-rc
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the internode messaging exception recovery code fails and is unable to 
> correctly adjust the resource limits for an OutboundConnection, it affects 
> the other connection types sharing the same OutboundConnections so that any 
> of the connections could hit {{assert using >= 0;}} in
> {{org.apache.cassandra.net.ResourceLimits.Concurrent#release}}.
> While it is possible to modify all of the outbound connection code to 
> re-initialize all of the connections with a correct limit, the effort to test 
> and maintain the recovery code seems too high for something that should 
> "never happen" (except it did once, which is why it needs hardening).  The 
> safer option is to kill the JVM and have whatever external monitoring is in 
> place restart the instance in a known good state.
> Additionally, the logging for dropped outbound messages that have expired or 
> are unserializable takes place after the recovery handling logic. If the 
> recovery logic itself throws an exception, the message is never logged for 
> future diagnosis. Logging should take place first, followed by releasing 
> capacity and handling the expiration or serialization failure.
> Discovered on a branch modified for testing that threw an exception in the 
> Verb.serializeSize method.
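
The ordering described above (log the drop first, then release capacity, and treat a failed release as fatal) can be sketched roughly as follows. This is an illustrative sketch only, not the actual patch: DropHandlingSketch, Limit, SizeFunction and onDropped are hypothetical names, and the "fatal" path only sets a flag where a real node would stop the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names are Cassandra's actual classes.
final class DropHandlingSketch {

    /** Minimal capacity counter: release() throws if the counter goes negative. */
    static final class Limit {
        private final AtomicLong using = new AtomicLong();

        void allocate(long bytes) { using.addAndGet(bytes); }

        void release(long bytes) {
            long remaining = using.addAndGet(-bytes);
            if (remaining < 0)
                throw new IllegalStateException("limit accounting went negative: " + remaining);
        }

        long using() { return using.get(); }
    }

    /** Stand-in for whatever computes a message's serialized size; it may throw. */
    interface SizeFunction { long sizeOf(String message); }

    final Limit limit = new Limit();
    final List<String> log = new ArrayList<>();
    boolean fatalTriggered = false;

    void onDropped(String message, String reason, SizeFunction sizer) {
        // 1. Log first, so the drop is recorded even if the steps below fail.
        log.add("dropping message '" + message + "': " + reason);
        try {
            // 2. Then release the capacity that was reserved for the message.
            limit.release(sizer.sizeOf(message));
        } catch (RuntimeException e) {
            // 3. The accounting can no longer be trusted. A real node would stop
            //    the JVM at this point; the sketch only records the condition.
            log.add("fatal: unable to release capacity: " + e);
            fatalTriggered = true;
        }
    }
}
```

With this ordering, a size computation that throws still leaves a "dropping message" entry in the log before the fatal path is taken.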



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

