[ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339863#comment-16339863
 ] 

Jeff Jirsa commented on CASSANDRA-13929:
----------------------------------------

I chatted offline with Scott (who is far more familiar with the netty project 
than I am), and he noted that they've fixed a few recent bugs in that area of 
the code that COULD resolve leaks. I'm not confident this is actually a leak, 
as opposed to the recycler working as intended (but needing to be tuned), but 
perhaps we can consider bumping netty anyway. In trunk, at least, this is an 
easy decision, and it would be interesting to find out whether netty 4.1.20 
fixes the behavior you see in unpatched 3.11.1, [~tsteinmaurer].

 

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13929
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Thomas Steinmaurer
>            Priority: Major
>             Fix For: 3.11.x
>
>         Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png
>
>
> Separately from CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * Heap utilization increased after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus the leak is most likely 
> just more visible now that CASSANDRA-13754 is fixed
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
>
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final), in 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_:
> {code}
>         public void recycle()
>         {
>             if (recycleHandle != null)
>             {
>                 this.cleanup();
>                 builderRecycler.recycle(this, recycleHandle);
>                 recycleHandle = null; // ADDED
>             }
>         }
> {code}
> Patched a single node in our loadtest cluster with this change, and after 
> ~ 10 hours of uptime there is no sign of the previously offending class in 
> MAT anymore => cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say whether this has any other side effects, but I doubt it.
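
For readers who want to see the shape of the change outside of Cassandra, here is a minimal, self-contained sketch of the pattern in the quoted patch. It does not use netty: PooledBuilder, POOL, and the Consumer-based handle are hypothetical stand-ins for BTree.Builder, the recycler's stack, and the recycle handle, so it shows only the reference handling, not the real Recycler behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical builder that mirrors the shape of the patched recycle() method.
final class PooledBuilder {
    // Stand-in for the recycler's stack (an assumption; not netty's API).
    static final Deque<PooledBuilder> POOL = new ArrayDeque<>();

    // Non-final, as in the patch, so it can be cleared after recycling.
    private Consumer<PooledBuilder> recycleHandle = POOL::push;
    private int count;

    void add() { count++; }
    int count() { return count; }
    boolean hasHandle() { return recycleHandle != null; }

    void recycle() {
        if (recycleHandle != null) {
            count = 0;                   // stands in for cleanup()
            recycleHandle.accept(this);  // hand the instance back to the pool
            recycleHandle = null;        // the added line: drop the handle
        }
    }
}

final class RecycleSketch {
    public static void main(String[] args) {
        PooledBuilder b = new PooledBuilder();
        b.add();
        b.recycle();
        b.recycle(); // no-op: the handle was cleared by the first call
        System.out.println("pooled instances: " + PooledBuilder.POOL.size());
        System.out.println("handle still set: " + b.hasHandle());
    }
}
```

In this sketch, clearing the handle makes a second recycle() call a no-op and removes the builder's reference to the pool. It also means an instance taken back out of POOL cannot be returned to it again. Whether the same holds for the real Recycler, and whether it matters there, is not something the sketch can show.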


