[ https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239331#comment-13239331 ]

Russell Jurney commented on PIG-2614:
-------------------------------------

Currently, it still dies after I apply the patch and set thresholds. 


java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 64
        at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:275)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:194)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
        at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readRecord(PigAvroDatumReader.java:67)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
        at org.apache.pig.piggybank.storage.avro.PigAvroRecordReader.getCurrentValue(PigAvroRecordReader.java:80)
        at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:273)
        ... 7 more

                
> AvroStorage crashes on LOADING a single bad error
> -------------------------------------------------
>
>                 Key: PIG-2614
>                 URL: https://issues.apache.org/jira/browse/PIG-2614
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10, 0.11
>            Reporter: Russell Jurney
>              Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
>             Fix For: 0.10, 0.11
>
>         Attachments: PIG-2614_0.patch
>
>
> AvroStorage dies when a single bad record exists, such as one with missing 
> fields.  This is very bad on 'big data,' where bad records are inevitable.  
> See the discussion at
> http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> for more theory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
