[ https://issues.apache.org/jira/browse/HADOOP-4918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659730#action_12659730 ]
Zheng Shao commented on HADOOP-4918:
------------------------------------
I saw this piece of code in TestCodec.java.
Unfortunately, SequenceFile.BlockCompressWriter does not call close() on
deflateOut for each block. As a result, the BZip2 codec does not work with it.
{code}
// Necessary to close the stream for BZip2 Codec to write its final output.
// Flush is not enough.
deflateOut.close();
{code}
We will probably need to modify BZip2 Codec to make this work.
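To illustrate the flush-versus-close point, here is a minimal sketch. It uses java.util.zip.GZIPOutputStream from the JDK as a stand-in, not the Hadoop BZip2Codec or SequenceFile code, and the class and method names (FlushVsClose, compress, canDecompress) are hypothetical. The assumption is only that BZip2 behaves similarly in that the stream's final output is written on close.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Stand-in demo: GZIP (JDK) is used here instead of BZip2 (Hadoop).
class FlushVsClose {

    // Compress data and always flush; close the stream only if asked to.
    static byte[] compress(byte[] data, boolean close) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            GZIPOutputStream out = new GZIPOutputStream(sink);
            out.write(data);
            out.flush();          // does not finish the compressed stream
            if (close) {
                out.close();      // writes the final block and trailer
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true only if the bytes decompress fully to the expected length.
    static boolean canDecompress(byte[] compressed, int expectedLength) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[1024];
            int total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            return total == expectedLength;
        } catch (IOException e) {
            // A truncated stream surfaces as an IOException (typically EOFException).
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = "one block of records".getBytes();
        System.out.println("flush only: "
            + canDecompress(compress(data, false), data.length));
        System.out.println("flush + close: "
            + canDecompress(compress(data, true), data.length));
    }
}
```

With the stand-in codec, the flush-only output should fail to decompress while the closed stream round-trips, which is the same symptom a per-block writer would see if it only flushed deflateOut.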
> Make bzip2 work with SequenceFile
> ---------------------------------
>
> Key: HADOOP-4918
> URL: https://issues.apache.org/jira/browse/HADOOP-4918
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Zheng Shao
> Attachments: TestSequenceFileBZip.java
>
>
> Somehow bzip2 does not work with SequenceFile:
> {code}
> String codec = "org.apache.hadoop.io.compress.BZip2Codec";
> SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, new Path(output),
>     reader.getKeyClass(), reader.getValueClass(), CompressionType.BLOCK,
>     (CompressionCodec) Class.forName(codec).newInstance());
> {code}
> The stack trace is here:
> {noformat}
> java.lang.UnsupportedOperationException
> at org.apache.hadoop.io.compress.BZip2Codec.getCompressorType(BZip2Codec.java:80)
> at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:98)
> at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:914)
> at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:329)
> at org.apache.hadoop.mapred.TestSequenceFileBZip.main(TestSequenceFileBZip.java:43)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> {noformat}