[ https://issues.apache.org/jira/browse/PIG-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Laukik Chitnis updated PIG-1304:
--------------------------------

    Attachment: patch-PIG-1304-1

> Fail underlying M/R jobs when concatenated gzip and bz2 files are provided as input
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-1304
>                 URL: https://issues.apache.org/jira/browse/PIG-1304
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Laukik Chitnis
>             Fix For: 0.9.0
>
>         Attachments: patch-PIG-1304-1
>
>
> I have the following bzipped text files (\t = <TAB>):
> {code}
> $ bzcat A.txt.bz2
> 1\ta
> 2\taa
> $ bzcat B.txt.bz2
> 1\tb
> 2\tbb
> $ cat *.bz2 > test/mymerge.bz2
> $ bzcat test/mymerge.bz2
> 1\ta
> 2\taa
> 1\tb
> 2\tbb
> $ hadoop fs -put test/mymerge.bz2 /user/viraj
> {code}
> I now write a Pig script to print the contents of the bz2 file.
> {code}
> A = load '/user/viraj/bzipgetmerge/mymerge.bz2' using PigStorage();
> dump A;
> {code}
> I get only the records from the first of the bz2 files that I concatenated.
> (1,a)
> (2,aa)
> My M/R jobs do not fail or throw any warning about this; they just silently drop
> records. Is there a way to throw a warning or fail the underlying Map job? Can
> it be done in the Bzip2TextInputFormat class in Pig?
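
The behaviour described in the issue can be reproduced outside Pig. The following is a minimal Python sketch, not the attached patch and not Pig's Bzip2TextInputFormat: it uses the standard `bz2` module to show that a single-stream decoder stops after the first stream, and how a pre-check could count the streams in a file so a job can warn or fail. The function name `count_bz2_streams` is hypothetical, and the sketch reads the whole input into memory, which is only reasonable for small files.

```python
import bz2


def count_bz2_streams(data: bytes) -> int:
    """Count the independent bz2 streams stored back to back in `data`."""
    streams = 0
    remaining = data
    while remaining:
        # A BZ2Decompressor handles exactly one stream; anything after
        # the end of that stream is left in `unused_data`.
        decompressor = bz2.BZ2Decompressor()
        decompressor.decompress(remaining)
        if not decompressor.eof:
            raise ValueError("truncated bz2 stream")
        streams += 1
        remaining = decompressor.unused_data
    return streams


# Rebuild the scenario from the issue: two bzipped files concatenated.
a = bz2.compress(b"1\ta\n2\taa\n")
b = bz2.compress(b"1\tb\n2\tbb\n")
merged = a + b

# A single-stream reader sees only the records of the first file.
first_only = bz2.BZ2Decompressor().decompress(merged)

# A pre-check can flag the file before any records are silently dropped.
if count_bz2_streams(merged) > 1:
    print("warning: input contains concatenated bz2 streams")
```

A check like this could run before the file is uploaded to HDFS. Whether the equivalent test belongs inside the input format itself is the question the issue raises.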