[ https://issues.apache.org/jira/browse/HBASE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206325#comment-13206325 ]
Mikhail Bautin commented on HBASE-5387:
---------------------------------------

@Lars: where are we creating a new GzipCodec frequently? We only instantiate ReusableStreamGzipCodec once, in the following block:

{code:title=Compression.java}
    GZ("gz") {
      private transient GzipCodec codec;

      @Override
      DefaultCodec getCodec(Configuration conf) {
        if (codec == null) {
          codec = new ReusableStreamGzipCodec();
          codec.setConf(new Configuration(conf));
        }
        return codec;
      }
    },
{code}

What we are creating less frequently than before is the compressing output stream (a subclass of GZIPOutputStream with the associated native data structure): once per HFile writer with the patch, versus once for every HFile block previously.

> Reuse compression streams in HFileBlock.Writer
> ----------------------------------------------
>
>                 Key: HBASE-5387
>                 URL: https://issues.apache.org/jira/browse/HBASE-5387
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 0.94.0
>            Reporter: Mikhail Bautin
>            Assignee: Mikhail Bautin
>            Priority: Critical
>             Fix For: 0.94.0
>
>         Attachments: D1719.1.patch, D1719.2.patch,
> Fix-deflater-leak-2012-02-10_18_48_45.patch,
> Fix-deflater-leak-2012-02-11_17_13_10.patch
>
>
> We need to reuse compression streams in HFileBlock.Writer instead of
> allocating them every time. The motivation is that when using Java's built-in
> implementation of Gzip, we allocate a new GZIPOutputStream object and an
> associated native data structure every time we create a compression stream.
> The native data structure is only deallocated in the finalizer. This is one
> suspected cause of recent TestHFileBlock failures on Hadoop QA:
> https://builds.apache.org/job/HBase-TRUNK/2658/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/.
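
For illustration only, here is a minimal sketch of the "allocate once per writer, reset per block" idea described in the comment. It is not the code from the patch: the class name ReusableBlockCompressor and its methods are hypothetical, and it uses java.util.zip.Deflater directly (producing zlib-format output rather than gzip) instead of ReusableStreamGzipCodec. The point is only that the native zlib state is created once, reset between blocks, and released explicitly with end() rather than left to a finalizer.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration of per-writer compressor reuse; not HBase code. */
public class ReusableBlockCompressor implements AutoCloseable {
    // One Deflater (and its native zlib state) for the lifetime of the writer.
    private final Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
    private final byte[] buf = new byte[4096];

    /** Compresses one block as an independent zlib stream, reusing the Deflater. */
    public byte[] compressBlock(byte[] block) {
        deflater.reset();              // discard any state left from the previous block
        deflater.setInput(block);
        deflater.finish();             // this block is the entire input of the stream
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Releases the native memory deterministically instead of in a finalizer. */
    @Override
    public void close() {
        deflater.end();
    }

    /** Helper used only to check the round trip in this sketch. */
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] tmp = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(tmp);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(tmp, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed block", e);
        } finally {
            inflater.end();
        }
    }
}
```

A writer would hold one such object, call compressBlock() for each block, and call close() when the file is finished, so that only one native structure exists per writer regardless of how many blocks are written.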