[ https://issues.apache.org/jira/browse/PIG-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2909:
----------------------------

       Resolution: Fixed
    Fix Version/s: 0.11
           Status: Resolved  (was: Patch Available)

Patch 2 plus new tests checked in. Thanks Cheolsoo.

> Add a new option for ignoring corrupted files to AvroStorage load func
> ----------------------------------------------------------------------
>
>                 Key: PIG-2909
>                 URL: https://issues.apache.org/jira/browse/PIG-2909
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>    Affects Versions: 0.10.0
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.11
>
>         Attachments: PIG-2909-2.patch, PIG-2909-avro_test_files.tar.gz, PIG-2909.patch
>
>
> Currently, AvroStorage load fails with AvroRuntimeException when encountering corrupted input files. For example,
> {code}
> ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:283)
> {code}
> But it is not always desirable to fail the Pig job for bad files. It is sometimes more useful to skip them and continue.