[ https://issues.apache.org/jira/browse/HADOOP-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Chen reassigned HADOOP-14523:
----------------------------------

    Assignee: Misha Dmitriev

> OpensslAesCtrCryptoCodec.finalize() holds excessive amounts of memory
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-14523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14523
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>
> I recently analyzed JVM heap dumps from Hive running a big workload. Two
> excerpts from the analysis done with jxray (www.jxray.com) are given below.
> It turns out that nearly half of the live memory is taken by objects
> awaiting finalization, and the biggest offender among them is the class
> OpensslAesCtrCryptoCodec:
> {code}
> 401,189K (39.7%) (1 of sun.misc.Cleaner)
>   <-- Java Static: sun.misc.Cleaner.first
> 400,572K (39.6%) (14001 of org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager, java.util.jar.JarFile etc.)
>   <-- j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <-- sun.misc.Cleaner.next <-- sun.misc.Cleaner.{next} <-- Java Static: sun.misc.Cleaner.first
> 270,673K (26.8%) (2138 of org.apache.hadoop.mapred.JobConf)
>   <-- org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec.conf <-- j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <-- sun.misc.Cleaner.next <-- sun.misc.Cleaner.{next} <-- Java Static: sun.misc.Cleaner.first
> ---------------------
> 102,232K (10.1%) (1 of j.l.r.Finalizer)
>   <-- Java Static: java.lang.ref.Finalizer.unfinalized
> 101,676K (10.1%) (8613 of org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, java.util.zip.ZipFile$ZipFileInflaterInputStream, org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager etc.)
>   <-- j.l.r.Finalizer.referent <-- j.l.r.Finalizer.{next} <-- Java Static: java.lang.ref.Finalizer.unfinalized
> {code}
> This heap dump was taken using 'jmap -dump:live', which forces the JVM to
> run a full GC before dumping the heap.
> So we are already looking at the heap right after GC, and yet all these
> unfinalized objects are there. I think this happens because the JVM always
> runs only one finalization thread, and thus the queue of objects that need
> finalization may get processed too slowly. My understanding is that
> finalization works as follows:
> 1. When GC runs, it discovers that an object x that overrides finalize() is
> unreachable.
> 2. x is added to the finalization queue. So technically x is still
> reachable, it occupies memory, and _all the objects that it references stay
> in memory as well_.
> 3. The finalization thread processes objects from the finalization queue
> serially, so x may stay in memory for a long time.
> 4. x.finalize() is invoked, then x is made unreachable. If x stayed in
> memory for a long time, it's now in the Old Gen of the heap, so only a full
> GC can clean it up.
> 5. When a full GC finally occurs, x gets cleaned up.
> So finalization is formally reliable, but in practice it's quite possible
> that a lot of unreachable but unfinalized objects flood the memory. I guess
> we are seeing all these OpensslAesCtrCryptoCodec objects when they are in
> phase 3 above. And the really bad thing is that these objects in turn keep
> a whole lot of other stuff in memory, in particular JobConf objects. Such a
> JobConf has nothing to do with finalization, yet the GC cannot release it
> until the corresponding OpensslAesCtrCryptoCodec is gone.
> Here is the OpensslAesCtrCryptoCodec.finalize() method with my comments:
> {code}
>   protected void finalize() throws Throwable {
>     try {
>       Closeable r = (Closeable) this.random;
>       r.close();  // Relevant only when (random instanceof OsSecureRandom == true)
>     } catch (ClassCastException e) {
>     }
>     super.finalize();  // Not needed, no finalize() in superclasses
>   }
> {code}
> So finalize() in this class, which may keep a whole tree of objects in
> memory, is relevant only when this codec is configured to use the
> OsSecureRandom class.
> The latter reads random bytes from the configured file, and needs
> finalization to close the input stream associated with that file.
> The suggested fix is to remove finalize() from OpensslAesCtrCryptoCodec and
> add it to the only class from this "family" that really needs it,
> OsSecureRandom. That will ensure that only OsSecureRandom objects (if/when
> they are used) stay in memory awaiting finalization, and no other,
> irrelevant objects.
> Note that this solution means that streams are still closed lazily. This,
> in principle, may cause its own problems. So the most reliable fix would be
> to call OsSecureRandom.close() explicitly when it's not needed anymore. But
> the above fix is a necessary first step anyway: it will remove the most
> acute memory problem and will not make anything else worse than it
> currently is.
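
For illustration only, here is a minimal sketch of the ownership pattern the description argues for. It is not the Hadoop code and not the actual patch: SketchRandom and SketchCodec are hypothetical stand-ins for OsSecureRandom and OpensslAesCtrCryptoCodec, a plain Object stands in for the JobConf, and a ByteArrayInputStream stands in for the configured random file. The assumption is simply that the class owning the stream carries the cleanup (an explicit close(), with finalize() only as a lazy safety net), while the codec has no finalize() at all, so the finalizer queue can never pin the codec or its configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

public class FinalizeSketch {

    /** Hypothetical stand-in for OsSecureRandom: the only class that owns a stream. */
    static class SketchRandom implements Closeable {
        private final InputStream stream;
        private boolean closed;

        SketchRandom(InputStream stream) {
            this.stream = stream;
        }

        synchronized int nextByte() throws IOException {
            if (closed) {
                throw new IOException("random source already closed");
            }
            return stream.read();
        }

        /** Explicit, idempotent cleanup: the preferred path. */
        @Override
        public synchronized void close() throws IOException {
            if (!closed) {
                closed = true;
                stream.close();
            }
        }

        synchronized boolean isClosed() {
            return closed;
        }

        /**
         * Lazy safety net only. Because it lives here, an unreachable
         * instance awaiting finalization retains just itself and its stream.
         * finalize() is deprecated in newer JDKs, hence the suppression.
         */
        @Override
        @SuppressWarnings({"deprecation", "removal"})
        protected void finalize() throws Throwable {
            close();
        }
    }

    /** Hypothetical stand-in for the codec: holds a large conf, has no finalize(). */
    static class SketchCodec implements Closeable {
        private final Object conf;          // stands in for the JobConf
        private final SketchRandom random;

        SketchCodec(Object conf, SketchRandom random) {
            this.conf = conf;
            this.random = random;
        }

        int nextByte() throws IOException {
            return random.nextByte();
        }

        /** Delegates to the resource owner instead of relying on finalization. */
        @Override
        public void close() throws IOException {
            random.close();
        }
    }

    public static void main(String[] args) throws IOException {
        SketchRandom random =
            new SketchRandom(new ByteArrayInputStream(new byte[] {7, 8, 9}));
        // try-with-resources closes the codec, and through it the stream.
        try (SketchCodec codec = new SketchCodec(new Object(), random)) {
            System.out.println(codec.nextByte());   // prints 7
        }
        System.out.println(random.isClosed());      // prints true
    }
}
```

With this shape, a caller that forgets close() still gets the stream closed eventually, but the object waiting on the finalizer queue is the small stream holder rather than the codec and everything it references.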