[jira] Commented: (MAPREDUCE-469) Support concatenated gzip and bzip2 files

David Ciemiewicz (JIRA) Tue, 15 Jun 2010 15:55:52 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879160#action_12879160
 ]


David Ciemiewicz commented on MAPREDUCE-469:
--------------------------------------------

Greg, I have yet to encounter ANYONE who doesn't consider this a bug, because 
all cited references EXPECT concatenated files to work because they work in ALL 
OTHER cited instances, including GNU tools, web browsers, etc.  Can you think of 
a single instance where it would be the right thing to stop reading a 
concatenated file after the first part is read, ignoring all other concatenated 
parts?  Forgive me, but suggesting that we keep the existing behavior seems 
absurd because I cannot think of a single case where this would be the right 
thing to do.


> Support concatenated gzip and bzip2 files
> -----------------------------------------
>
>                 Key: MAPREDUCE-469
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-469
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Tom White
>            Assignee: Greg Roelofs
>         Attachments: grr-hadoop-common.dif.20100614c, 
> grr-hadoop-mapreduce.dif.20100614c
>
>
> When running MapReduce with concatenated gzip files as input only the first 
> part is read, which is confusing, to say the least. Concatenated gzip is 
> described in http://www.gnu.org/software/gzip/manual/gzip.html#Advanced-usage 
> and in http://www.ietf.org/rfc/rfc1952.txt. (See original report at 
> http://www.nabble.com/Problem-with-Hadoop-and-concatenated-gzip-files-to21383097.html)
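
For illustration only, the difference between the two behaviors described 
above can be sketched outside Hadoop. This is a minimal Python sketch, not 
the Hadoop codec and not the attached patch; the function names are 
hypothetical. It assumes the standard-library gzip and zlib modules: 
gzip.decompress reads every member of a concatenated stream, while a single 
zlib decompressor stops at the end of the first member and leaves the rest 
in unused_data.

```python
import gzip
import zlib

# 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def read_all_members(data: bytes) -> bytes:
    """Decompress every gzip member, as the gzip tool does."""
    return gzip.decompress(data)


def read_first_member_only(data: bytes) -> bytes:
    """Decompress only the first member (the behavior reported as confusing)."""
    decomp = zlib.decompressobj(GZIP_WBITS)
    return decomp.decompress(data)


def read_members_manually(data: bytes) -> bytes:
    """Loop over members by restarting a decompressor on the leftover bytes."""
    parts = []
    while data:
        decomp = zlib.decompressobj(GZIP_WBITS)
        parts.append(decomp.decompress(data))
        parts.append(decomp.flush())
        # Bytes after the end of this member belong to the next one.
        data = decomp.unused_data
    return b"".join(parts)


# Two independently compressed parts, joined byte for byte (like `cat a.gz b.gz`).
concatenated = gzip.compress(b"part one\n") + gzip.compress(b"part two\n")

if __name__ == "__main__":
    print(read_all_members(concatenated))
    print(read_first_member_only(concatenated))
    print(read_members_manually(concatenated))
```

The manual loop is the shape a fix would presumably take in any reader that 
wraps a single-member decompressor: when one member ends and input remains, 
start a new decompressor on the remaining bytes.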

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
