[ https://issues.apache.org/jira/browse/HBASE-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631708#comment-16631708 ]
Hudson commented on HBASE-21228:
--------------------------------

Results for branch branch-1.4
	[build #482 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/482/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/482//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/482//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/482//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Memory leak since AbstractFSWAL caches Thread object and never clean later
> --------------------------------------------------------------------------
>
>                 Key: HBASE-21228
>                 URL: https://issues.apache.org/jira/browse/HBASE-21228
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 2.1.0, 2.0.2, 1.4.7
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 1.3.3, 1.2.8, 1.4.8, 2.1.1, 2.0.3
>
>         Attachments: HBASE-21228.branch-2.0.001.patch, HBASE-21228.branch-2.0.002.patch, HBASE-21228.branch-2.0.003.patch
>
>
> In AbstractFSWAL (FSHLog in branch-1), we have a map that caches threads and their SyncFutures.
> {code:java}
> /**
>  * Map of {@link SyncFuture}s keyed by Handler objects. Used so we reuse SyncFutures.
>  * <p>
>  * TODO: Reuse FSWALEntry's rather than create them anew each time as we do SyncFutures here.
>  * <p>
>  * TODO: Add a FSWalEntry and SyncFuture as thread locals on handlers rather than have them get
>  * them from this Map?
>  */
> private final ConcurrentMap<Thread, SyncFuture> syncFuturesByHandler;
> {code}
> A colleague of mine found a memory leak caused by this map.
> Every thread that writes to the WAL is cached in this map, and nothing removes a thread from the map, even after the thread is dead.
> In one of our customers' clusters, we noticed that even when there were no requests, the heap of the RS was almost full and CMS GC was triggered every second.
> We dumped the heap and found more than 30,000 threads in the Terminated state, all of them cached in the map above. Everything referenced by these threads was leaked. Most of the threads are:
> 1. PostOpenDeployTasksThread, which writes the Open Region marker to the WAL.
> 2. hconnection-0x1f838e31-shared--pool threads, which are used for short-circuit index writes (Phoenix); the WAL is written and synced in these threads.
> 3. Index writer threads (Phoenix), which are referenced by RegionCoprocessorHost$RegionEnvironment, then by HRegion, and finally by PostOpenDeployTasksThread.
> We should turn this map into a thread-local one and let the JVM GC the terminated threads for us.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
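
For readers following the issue, here is a minimal sketch of the leak pattern described above and of the thread-local alternative the description proposes. It is not the attached patch: the class name SyncFutureCacheSketch, the FakeSyncFuture stand-in, and all method names are hypothetical and exist only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; none of these names come from HBase.
class SyncFutureCacheSketch {
    /** Stand-in for HBase's SyncFuture (assumption: only its identity matters here). */
    static final class FakeSyncFuture {}

    // Leaky pattern: the map holds a strong reference to every Thread that ever
    // asked for a future, and nothing removes the entry when the thread dies.
    static final ConcurrentMap<Thread, FakeSyncFuture> byHandler = new ConcurrentHashMap<>();

    // Thread-local pattern: the cached value is owned by the thread itself, so it
    // becomes unreachable together with the thread once the thread terminates.
    static final ThreadLocal<FakeSyncFuture> cached =
        ThreadLocal.withInitial(FakeSyncFuture::new);

    static FakeSyncFuture getFromMap() {
        return byHandler.computeIfAbsent(Thread.currentThread(), t -> new FakeSyncFuture());
    }

    static FakeSyncFuture getFromThreadLocal() {
        return cached.get();
    }

    /**
     * Runs {@code count} short-lived threads through the map-based cache and
     * returns how many TERMINATED threads the map still references afterwards.
     */
    static int terminatedThreadsHeldByMap(int count) {
        byHandler.clear();
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            threads[i] = new Thread(SyncFutureCacheSketch::getFromMap);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        int held = 0;
        for (Thread t : byHandler.keySet()) {
            if (t.getState() == Thread.State.TERMINATED) {
                held++;
            }
        }
        return held;
    }

    public static void main(String[] args) {
        System.out.println("terminated threads still held by map: "
            + terminatedThreadsHeldByMap(100));
        System.out.println("thread-local value reused on same thread: "
            + (getFromThreadLocal() == getFromThreadLocal()));
    }
}
```

In this sketch every dead worker stays reachable through the map's keys, along with anything the Thread object references. With the ThreadLocal variant no shared structure holds the Thread, so a terminated thread and its cached value can be collected.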