[ http://issues.apache.org/jira/browse/HADOOP-441?page=all ]
Arun C Murthy updated HADOOP-441:
---------------------------------

    Attachment: codec20060831.patch

Here are the updated interfaces incorporating Doug's comments:
a) The Compression{Input|Output}Streams extend {Input|Output}Stream.
b) Only CompressionInputStream.read(byte[], int, int) and CompressionOutputStream.write(byte[], int, int) are made 'abstract', to be safe; no other interfaces are touched (see the sketch at the end of this message).

> SequenceFile should support 'custom compressors'
> ------------------------------------------------
>
>                 Key: HADOOP-441
>                 URL: http://issues.apache.org/jira/browse/HADOOP-441
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: io
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.6.0
>
>         Attachments: codec.patch, codec20060831.patch, codec_updated_interfaces_20060830.patch
>
>
> SequenceFiles should support 'custom compressors' which can be specified by the user when the file is created.
> Readily available packages for gzip and zip (java.util.zip) are among the obvious choices to support. Of course, there will be hooks so that other compressors can be added in the future, as long as there is a way to construct (input/output) streams on top of the compressor/decompressor.
> The 'classname' of the 'custom compressor/decompressor' could be stored in the header of the SequenceFile, which SequenceFile.Reader can then use to figure out the appropriate 'decompressor'. Thus I propose we add constructors to SequenceFile.Writer which take the 'classname' of the compressor's input/output stream classes (e.g. DeflaterOutputStream/InflaterInputStream or GZIPOutputStream/GZIPInputStream); this acts as the hook for future compressors/decompressors.
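
For illustration only, here is a minimal sketch of the stream shape described in (a) and (b) above. This is not the attached patch; the package-private layout, field names, and constructors are assumptions. Only the bulk read/write methods are abstract, matching the comment.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public abstract class CompressionOutputStream extends OutputStream {
        protected final OutputStream out;   // underlying raw stream -- assumption

        protected CompressionOutputStream(OutputStream out) {
            this.out = out;
        }

        // The only method made abstract on the output side.
        public abstract void write(byte[] b, int off, int len) throws IOException;

        // Single-byte write expressed in terms of the abstract bulk write.
        public void write(int b) throws IOException {
            write(new byte[] { (byte) b }, 0, 1);
        }
    }

    abstract class CompressionInputStream extends InputStream {
        protected final InputStream in;     // underlying raw stream -- assumption

        protected CompressionInputStream(InputStream in) {
            this.in = in;
        }

        // The only method made abstract on the input side.
        public abstract int read(byte[] b, int off, int len) throws IOException;

        // Single-byte read expressed in terms of the abstract bulk read.
        public int read() throws IOException {
            byte[] one = new byte[1];
            int n = read(one, 0, 1);
            return (n == -1) ? -1 : (one[0] & 0xff);
        }
    }

A concrete codec (for example one backed by java.util.zip) would then subclass these two classes and implement just the bulk read and write.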
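
The quoted issue description also sketches the other half of the proposal: SequenceFile.Writer records the stream classname in the header, and the reader rebuilds the decompressing stream reflectively. Below is a hypothetical sketch of that lookup; the factory class, method name, and the assumption that every candidate class exposes an InputStream-taking constructor are illustrative, not taken from the patch.

    import java.io.InputStream;
    import java.lang.reflect.Constructor;

    public class DecompressorStreamFactory {

        // 'className' is what the writer would have stored in the header, e.g.
        // "java.util.zip.GZIPInputStream" or "java.util.zip.InflaterInputStream".
        public static InputStream openDecompressor(String className, InputStream raw)
                throws Exception {
            Class<?> streamClass = Class.forName(className);
            // Assumes the stream class has an InputStream-taking constructor,
            // which holds for the java.util.zip classes named in the issue.
            Constructor<?> ctor = streamClass.getConstructor(InputStream.class);
            return (InputStream) ctor.newInstance(raw);
        }
    }

For example, openDecompressor("java.util.zip.GZIPInputStream", rawFileStream) would hand back a stream that decompresses the file data as it is read.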