[jira] Updated: (HADOOP-6837) Support for LZMA compression

Nicholas Carlini (JIRA) Mon, 26 Jul 2010 12:26:48 -0700

     [ 
https://issues.apache.org/jira/browse/HADOOP-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Nicholas Carlini updated HADOOP-6837:
-------------------------------------

    Attachment: hadoop-6349-2.patch

Attached an updated patch.

Fixed the checksum mismatch. The decompressor could run out of input after 
reading the header bytes without noticing when the block ID was 1. If the 
input held more than 16 but fewer than 26 bytes and the block ID was 1, it 
went ahead and used whatever happened to be in the buffer at the time.
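
The length check described above might look roughly like the sketch below. 
The two byte counts come from this comment (16 and 26). The class name, the 
method name, and the position of the block ID within the header are 
hypothetical and are not taken from the patch.

```java
/** Hypothetical sketch of the header length check; not the patch's code. */
final class BlockHeaderCheck {
    static final int HEADER_BYTES = 16;     // byte count mentioned in the comment
    static final int BLOCK_ID_1_BYTES = 26; // bytes needed when the block ID is 1
    static final int BLOCK_ID_OFFSET = 0;   // assumption: ID is the first header byte

    /** Returns true only if the buffer holds every byte the header requires. */
    static boolean hasCompleteHeader(byte[] buf, int off, int len) {
        if (len < HEADER_BYTES) {
            return false; // not even the fixed part of the header is present
        }
        int blockId = buf[off + BLOCK_ID_OFFSET] & 0xff;
        if (blockId == 1 && len < BLOCK_ID_1_BYTES) {
            return false; // the case that previously went unnoticed
        }
        return true;
    }
}
```

When the check fails, the caller would wait for more input instead of 
decoding stale buffer contents.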

Fixed a bug in the decompressor where it would incorrectly indicate it was 
finished if no input was left at the end of decompressing a block and 
decompress() was then called again (reproducible with TestCodec seeds 
1333275328, 2011623221, -1402938700, or -1990280158, generating 50,000 
records). In fact, the decompressor never returns finished now. It should 
report finished only if it somehow knows the end of the stream has been 
reached, and it has no way of knowing that. Previously it just guessed that 
it was done once it had read all the bytes it currently had, which is not 
the case.

Implemented getRemaining().
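
As a rough illustration of the behavior described above, here is a minimal 
sketch. The method names mirror Hadoop's Decompressor interface, but the 
class itself is hypothetical, is not the patch's code, and copies bytes 
through unchanged in place of real LZMA decoding.

```java
/** Hypothetical sketch; only the method names follow Hadoop's Decompressor. */
final class SketchDecompressor {
    private byte[] in = new byte[0];
    private int inOff = 0;
    private int inLen = 0;

    void setInput(byte[] b, int off, int len) {
        in = b;
        inOff = off;
        inLen = len;
    }

    /** Stand-in for real decompression: copies buffered bytes to the output. */
    int decompress(byte[] out, int off, int len) {
        int n = Math.min(len, inLen);
        System.arraycopy(in, inOff, out, off, n);
        inOff += n;
        inLen -= n;
        return n;
    }

    /** An empty input buffer means "give me more", not "end of stream". */
    boolean needsInput() {
        return inLen == 0;
    }

    /** Never guesses end of stream from an exhausted buffer. */
    boolean finished() {
        return false;
    }

    /** Number of buffered input bytes not yet consumed. */
    int getRemaining() {
        return inLen;
    }
}
```

With this shape, running out of buffered input only flips needsInput(), and 
the caller decides when the stream has really ended.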

Removed iOff from both the compressor and decompressor. It was initialized to 
zero and was only ever reassigned to 0 after that, so it never held any other 
value.

Modified TestCodec to accept a seed as an argument.
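
For illustration, taking a seed from the command line so a failing run can be 
replayed could look like the sketch below. The class and method names are 
hypothetical, and this is not the actual TestCodec change.

```java
import java.util.Random;

/** Hypothetical sketch of seed handling; not the actual TestCodec code. */
final class SeedArg {
    /** Uses args[0] as the seed if present, otherwise picks a fresh one. */
    static long parseSeed(String[] args) {
        return args.length > 0 ? Long.parseLong(args[0]) : new Random().nextLong();
    }

    /** Generates n pseudo-random bytes; the same seed gives the same bytes. */
    static byte[] generate(long seed, int n) {
        byte[] data = new byte[n];
        new Random(seed).nextBytes(data);
        return data;
    }
}
```

Printing the chosen seed at the start of each run makes it possible to rerun 
a failure with the same data.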

Removed the rest of the carriage returns.


I will be adding a native version over the next few days and will upload that 
patch when it's done.

> Support for LZMA compression
> ----------------------------
>
>                 Key: HADOOP-6837
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6837
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>            Reporter: Nicholas Carlini
>            Assignee: Nicholas Carlini
>         Attachments: hadoop-6349-2.patch, HADOOP-6837-lzma-1-20100722.patch, 
> HADOOP-6837-lzma-c-20100719.patch, HADOOP-6837-lzma-java-20100623.patch
>
>
> Add support for LZMA (http://www.7-zip.org/sdk.html) compression, which 
> generally achieves higher compression ratios than both gzip and bzip2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
