wangchao created HADOOP-12619:
---------------------------------

             Summary: Native memory leaks in CompressorStream
                 Key: HADOOP-12619
                 URL: https://issues.apache.org/jira/browse/HADOOP-12619
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: wangchao


The constructor of org.apache.hadoop.io.compress.CompressorStream takes an 
org.apache.hadoop.io.compress.Compressor object to compress bytes, but the 
stream never invokes the compressor's end method when its close method is 
called. This can leak native memory if the compressor is used only by this 
CompressorStream object.
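
For comparison (plain JDK, not Hadoop code), java.util.zip.Deflater exposes the same two-step contract: finish() only marks the end of the input, while end() is what actually releases the native zlib memory. A minimal stand-alone sketch:

```java
import java.util.zip.Deflater;

public class DeflaterEndDemo {
    public static void main(String[] args) {
        byte[] input = "some bytes to compress, some bytes to compress".getBytes();
        byte[] output = new byte[256];

        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();                 // end of input, like Compressor#finish()
        int compressedLen = deflater.deflate(output);
        System.out.println("compressed " + input.length + " -> " + compressedLen + " bytes");

        // Without this call the native zlib state lingers until (if ever) the
        // object is garbage collected -- the same shape as this leak.
        deflater.end();
    }
}
```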

I found this when setting up a Flume agent with gzip compression: the native 
memory grows slowly and is never reclaimed. GzipCodec creates the compressor 
like this:

{code}
  @Override
  public CompressionOutputStream createOutputStream(OutputStream out) 
    throws IOException {
    return (ZlibFactory.isNativeZlibLoaded(conf)) ?
               new CompressorStream(out, createCompressor(),
                                    conf.getInt("io.file.buffer.size", 4*1024)) :
               new GzipOutputStream(out);
  }

  @Override
  public Compressor createCompressor() {
    return (ZlibFactory.isNativeZlibLoaded(conf))
      ? new GzipZlibCompressor(conf)
      : null;
  }
{code}

The relevant methods of CompressorStream are

{code}
  @Override
  public void close() throws IOException {
    if (!closed) {
      finish();
      out.close();
      closed = true;
    }
  }

  @Override
  public void finish() throws IOException {
    if (!compressor.finished()) {
      compressor.finish();
      while (!compressor.finished()) {
        compress();
      }
    }
  }
{code}

Nothing ever calls the compressor's end method, so its native resources are never released.
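
One possible direction for a fix (just a sketch, not a committed patch, illustrated with the JDK's Deflater since Hadoop's Compressor follows the same lifecycle) is for the stream to end a compressor it owns once close() has finished the data:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Sketch of the pattern CompressorStream#close() is missing: after finishing
// the stream, release the native-backed compressor it owns.
public class OwningDeflaterStream extends DeflaterOutputStream {
    private boolean closed = false;

    public OwningDeflaterStream(OutputStream out) {
        super(out, new Deflater());   // this stream owns its compressor
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            try {
                super.close();        // finish() + out.close(), as CompressorStream does
            } finally {
                def.end();            // the missing step: free the native zlib memory
                closed = true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OwningDeflaterStream s = new OwningDeflaterStream(sink)) {
            s.write("hello hello hello".getBytes());
        }
        System.out.println("wrote " + sink.size() + " compressed bytes");
    }
}
```

A real Hadoop fix would also have to handle compressors obtained from CodecPool, which should be returned to the pool rather than ended; the sketch assumes the stream exclusively owns its compressor.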




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
