Zhikai Hu created HDFS-17510:
--------------------------------

             Summary: Change of Codec configuration does not work
                 Key: HDFS-17510
                 URL: https://issues.apache.org/jira/browse/HDFS-17510
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: compress
            Reporter: Zhikai Hu
In one of my projects, I need to adjust the compression level dynamically for different files. However, I found that in most cases the new compression level does not take effect as expected; the old compression level continues to be used. Here is the relevant code snippet:

    ZStandardCodec zStandardCodec = new ZStandardCodec();
    zStandardCodec.setConf(conf);
    conf.set("io.compression.codec.zstd.level", "5"); // level may change dynamically
    conf.set("io.compression.codec.zstd", zStandardCodec.getClass().getName());

    writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(sequenceFilePath),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(BytesWritable.class),
        SequenceFile.Writer.compression(CompressionType.BLOCK));

The reason is that SequenceFile.Writer.init() calls CodecPool.getCompressor(codec, null) to obtain a compressor. If the compressor is a reused instance, the configuration is never applied, because it is passed as null:

    public static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
      Compressor compressor = borrow(compressorPool, codec.getCompressorType());
      if (compressor == null) {
        compressor = codec.createCompressor();
        LOG.info("Got brand-new compressor ["+codec.getDefaultExtension()+"]");
      } else {
        compressor.reinit(conf); // conf is null here
        if (LOG.isDebugEnabled()) {
          LOG.debug("Got recycled compressor");
        }
      }
      return compressor;
    }

Please also refer to my unit test to reproduce the bug. To address this bug, I modified the code to ensure that the configuration is read back from the codec when a compressor is reused.
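To see the stale-configuration pattern in isolation, here is a minimal, self-contained sketch. The Conf, Codec, Compressor, and CodecPoolSketch classes below are hypothetical stand-ins, not Hadoop's actual implementations; they only model how a pooled compressor keeps its old level when reinit() receives null, and how falling back to the codec's own configuration on reuse (the approach described above) picks up the new level.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration.
class Conf {
    private final Map<String, String> props = new HashMap<>();
    void set(String k, String v) { props.put(k, v); }
    String get(String k, String dflt) { return props.getOrDefault(k, dflt); }
}

// Hypothetical compressor: reinit() with a null conf silently keeps old settings.
class Compressor {
    int level = 3; // assumed default level
    void reinit(Conf conf) {
        if (conf == null) return; // null conf -> stale settings survive
        level = Integer.parseInt(conf.get("io.compression.codec.zstd.level", "3"));
    }
}

// Hypothetical codec that remembers the conf given to setConf().
class Codec {
    private Conf conf;
    void setConf(Conf conf) { this.conf = conf; }
    Conf getConf() { return conf; }
    Compressor createCompressor() {
        Compressor c = new Compressor();
        c.reinit(conf); // brand-new instances do see the codec's conf
        return c;
    }
}

public class CodecPoolSketch {
    static final Deque<Compressor> pool = new ArrayDeque<>();

    // Buggy pattern: the caller passes conf as null, so a reused
    // instance is "reinitialized" with nothing and keeps its old level.
    static Compressor getCompressor(Codec codec, Conf conf) {
        Compressor c = pool.poll();
        if (c == null) {
            c = codec.createCompressor();
        } else {
            c.reinit(conf); // conf == null here
        }
        return c;
    }

    // Fixed pattern: on reuse, fall back to the codec's own configuration.
    static Compressor getCompressorFixed(Codec codec, Conf conf) {
        Compressor c = pool.poll();
        if (c == null) {
            c = codec.createCompressor();
        } else {
            c.reinit(conf != null ? conf : codec.getConf());
        }
        return c;
    }

    public static void main(String[] args) {
        Codec codec = new Codec();
        Conf conf = new Conf();
        conf.set("io.compression.codec.zstd.level", "5");
        codec.setConf(conf);

        // First use: brand-new compressor picks up level 5, then goes back to the pool.
        Compressor first = getCompressor(codec, null);
        System.out.println("first use level = " + first.level);        // 5
        pool.push(first);

        // Level changes dynamically; the reused instance does not see it.
        conf.set("io.compression.codec.zstd.level", "9");
        Compressor reused = getCompressor(codec, null);
        System.out.println("buggy reuse level = " + reused.level);     // still 5
        pool.push(reused);

        Compressor fixed = getCompressorFixed(codec, null);
        System.out.println("fixed reuse level = " + fixed.level);      // 9
    }
}
```

Running main() shows the reused compressor stuck at level 5 under the buggy pattern, while the fixed variant observes the updated level 9.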