[ https://issues.apache.org/jira/browse/HADOOP-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Heidemann resolved HADOOP-1823.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.19.0
     Release Note: bzip2 is provided as a codec in 0.19.0; see https://issues.apache.org/jira/browse/HADOOP-3646

> want InputFormat for bzip2 files
> --------------------------------
>
>                 Key: HADOOP-1823
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1823
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Doug Cutting
>             Fix For: 0.19.0
>
>         Attachments: bzip2.jar
>
>
> Unlike gzip, the bzip2 file format supports splitting.  Compression is by 
> blocks (900k by default), and blocks are separated by a synchronization marker 
> (a 48-bit approximation of Pi).  This would permit very large compressed 
> files to be split into multiple map tasks, which is not currently possible 
> unless using a Hadoop-specific file format.
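
To illustrate the synchronization marker the description refers to, here is a minimal Python sketch that scans a bzip2 stream for block boundaries. It is not the Hadoop codec from HADOOP-3646; the function name find_block_offsets is hypothetical. The marker value 0x314159265359 is the bzip2 block-header magic (the leading digits of Pi). Blocks are not byte-aligned, so the scan works bit by bit, and a match inside compressed payload is possible in principle, so a real splitter would still need to validate each candidate.

```python
import bz2
from typing import List

# bzip2 block-header magic: the digits 3.14159265359 packed into 48 bits.
BLOCK_MAGIC = 0x314159265359
MAGIC_BITS = 48
MASK = (1 << MAGIC_BITS) - 1


def find_block_offsets(data: bytes) -> List[int]:
    """Return the bit offset of every 48-bit block marker found in data.

    bzip2 packs bits most-significant-bit first and does not align
    blocks to byte boundaries, so the stream is scanned one bit at a time.
    """
    offsets = []
    window = 0
    for bit_index in range(len(data) * 8):
        byte = data[bit_index >> 3]
        bit = (byte >> (7 - (bit_index & 7))) & 1
        # Slide a 48-bit window over the stream.
        window = ((window << 1) | bit) & MASK
        if bit_index >= MAGIC_BITS - 1 and window == BLOCK_MAGIC:
            offsets.append(bit_index - MAGIC_BITS + 1)
    return offsets


if __name__ == "__main__":
    compressed = bz2.compress(b"hello, hadoop")
    # The first block follows the 4-byte "BZh<level>" stream header.
    print(find_block_offsets(compressed))
```

A splitter could hand each map task the range between two consecutive offsets, which is the idea behind the request above; decoding from a mid-stream offset additionally needs the stream header's block size and bit-level realignment, which this sketch leaves out.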

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
