[ 
https://issues.apache.org/jira/browse/HBASE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206325#comment-13206325
 ] 

Mikhail Bautin commented on HBASE-5387:
---------------------------------------

@Lars: where are we creating a new GzipCodec frequently? We only instantiate 
ReusableStreamGzipCodec once in the following block:

{code:title=Compression.java}
    GZ("gz") {
      private transient GzipCodec codec;

      @Override
      DefaultCodec getCodec(Configuration conf) {
        if (codec == null) {
          codec = new ReusableStreamGzipCodec();
          codec.setConf(new Configuration(conf));
        }

        return codec;
      }
    },
{code}

What the patch creates less frequently is the compressing output stream (a 
subclass of GZIPOutputStream, together with its associated native data 
structure): once per HFile writer with the patch, versus once per HFile 
block previously.

                
> Reuse compression streams in HFileBlock.Writer
> ----------------------------------------------
>
>                 Key: HBASE-5387
>                 URL: https://issues.apache.org/jira/browse/HBASE-5387
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.0
>            Reporter: Mikhail Bautin
>            Assignee: Mikhail Bautin
>            Priority: Critical
>             Fix For: 0.94.0
>
>         Attachments: D1719.1.patch, D1719.2.patch, 
> Fix-deflater-leak-2012-02-10_18_48_45.patch, 
> Fix-deflater-leak-2012-02-11_17_13_10.patch
>
>
> We need to reuse compression streams in HFileBlock.Writer instead of 
> allocating them every time. The motivation is that when using Java's built-in 
> implementation of Gzip, we allocate a new GZIPOutputStream object and an 
> associated native data structure every time we create a compression stream. 
> The native data structure is only deallocated in the finalizer. This is one 
> suspected cause of recent TestHFileBlock failures on Hadoop QA: 
> https://builds.apache.org/job/HBase-TRUNK/2658/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/.


        
