[ 
https://issues.apache.org/jira/browse/AVRO-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019552#comment-13019552
 ] 

ey-chih chow commented on AVRO-792:
-----------------------------------

Thanks.  I tested the third patch in our environment.  Unfortunately, it 
did not fix the problem.  The trace from our VM follows.

===============================================================================================================================

cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar 
com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 
etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:18:14 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:18:14 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:18:15 INFO mapred.JobClient: Running job: job_201104081805_0001
11/04/12 10:18:16 INFO mapred.JobClient:  map 0% reduce 0%
11/04/12 10:18:28 INFO mapred.JobClient:  map 20% reduce 0%
11/04/12 10:18:29 INFO mapred.JobClient:  map 40% reduce 0%
11/04/12 10:18:35 INFO mapred.JobClient:  map 80% reduce 0%
11/04/12 10:18:39 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:18:43 INFO mapred.JobClient:  map 100% reduce 26%
11/04/12 10:18:46 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0001_r_000000_0, Status : FAILED
11/04/12 10:18:47 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:18:57 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0001_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at 
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at 
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
        at 
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
        at 
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:19:05 INFO mapred.JobClient:  map 100% reduce 26%
11/04/12 10:19:08 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0001_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at 
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at 
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
        at 
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
        at 
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:19:10 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:19:22 INFO mapred.JobClient: Job complete: job_201104081805_0001
11/04/12 10:19:22 INFO mapred.JobClient: Counters: 31
11/04/12 10:19:22 INFO mapred.JobClient:   
com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_EVENT=249
11/04/12 10:19:22 INFO mapred.JobClient:     REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient:     PC_REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient:   Job Counters 
11/04/12 10:19:22 INFO mapred.JobClient:     Launched reduce tasks=4
11/04/12 10:19:22 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=34290
11/04/12 10:19:22 INFO mapred.JobClient:     Total time spent by all reduces 
waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient:     Total time spent by all maps 
waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient:     Launched map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient:     Data-local map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient:     Failed reduce tasks=1
11/04/12 10:19:22 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=53407
11/04/12 10:19:22 INFO mapred.JobClient:   
com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_SERVER=222
11/04/12 10:19:22 INFO mapred.JobClient:     PLUS_CLIENT=28
11/04/12 10:19:22 INFO mapred.JobClient:   FileSystemCounters
11/04/12 10:19:22 INFO mapred.JobClient:     HDFS_BYTES_READ=472855
11/04/12 10:19:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1164803
11/04/12 10:19:22 INFO mapred.JobClient:   
com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_AFAM=133
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NULL_VALUE=109
11/04/12 10:19:22 INFO mapred.JobClient:     DISCARDED_EVENTS=1058
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_PUBL=112
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_NO_ASKU=225
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_EMPTY_MAP=182
11/04/12 10:19:22 INFO mapred.JobClient:     ERR_OTHER=45
11/04/12 10:19:22 INFO mapred.JobClient:   Map-Reduce Framework
11/04/12 10:19:22 INFO mapred.JobClient:     Combine output records=0
11/04/12 10:19:22 INFO mapred.JobClient:     Map input records=1281
11/04/12 10:19:22 INFO mapred.JobClient:     Spilled Records=205
11/04/12 10:19:22 INFO mapred.JobClient:     Map output bytes=41281
11/04/12 10:19:22 INFO mapred.JobClient:     Map input bytes=468793
11/04/12 10:19:22 INFO mapred.JobClient:     Combine input records=0
11/04/12 10:19:22 INFO mapred.JobClient:     Map output records=205
11/04/12 10:19:22 INFO mapred.JobClient:     SPLIT_RAW_BYTES=889
11/04/12 10:19:22 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar 
com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 
etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:30:33 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:30:34 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:30:34 INFO mapred.JobClient: Running job: job_201104081805_0002
11/04/12 10:30:35 INFO mapred.JobClient:  map 0% reduce 0%
11/04/12 10:30:44 INFO mapred.JobClient:  map 40% reduce 0%
11/04/12 10:30:51 INFO mapred.JobClient:  map 60% reduce 0%
11/04/12 10:30:52 INFO mapred.JobClient:  map 80% reduce 0%
11/04/12 10:30:55 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:00 INFO mapred.JobClient:  map 100% reduce 33%
11/04/12 10:31:03 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0002_r_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at 
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at 
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
        at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
        at 
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
        at 
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:04 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:11 INFO mapred.JobClient:  map 100% reduce 33%
11/04/12 10:31:14 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0002_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at 
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at 
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
        at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
        at 
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
        at 
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:16 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:26 INFO mapred.JobClient: Task Id : 
attempt_201104081805_0002_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
        at 
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at 
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
        at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
        at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
        at 
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
        at 
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
        at 
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
        at 
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
        at 
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
        at 
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
        at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)

11/04/12 10:31:34 INFO mapred.JobClient:  map 100% reduce 13%
11/04/12 10:31:38 INFO mapred.JobClient:  map 100% reduce 0%
11/04/12 10:31:38 INFO mapred.JobClient: Job complete: job_201104081805_0002
11/04/12 10:31:38 INFO mapred.JobClient: Counters: 31
11/04/12 10:31:38 INFO mapred.JobClient:   
com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_EVENT=249
11/04/12 10:31:38 INFO mapred.JobClient:     REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient:     PC_REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient:   Job Counters 
11/04/12 10:31:38 INFO mapred.JobClient:     Launched reduce tasks=4
11/04/12 10:31:38 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=32438
11/04/12 10:31:38 INFO mapred.JobClient:     Total time spent by all reduces 
waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient:     Total time spent by all maps 
waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient:     Launched map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient:     Data-local map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient:     Failed reduce tasks=1
11/04/12 10:31:38 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=52425
11/04/12 10:31:38 INFO mapred.JobClient:   
com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_SERVER=222
11/04/12 10:31:38 INFO mapred.JobClient:     PLUS_CLIENT=28
11/04/12 10:31:38 INFO mapred.JobClient:   FileSystemCounters
11/04/12 10:31:38 INFO mapred.JobClient:     HDFS_BYTES_READ=472855
11/04/12 10:31:38 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1164803
11/04/12 10:31:38 INFO mapred.JobClient:   
com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_AFAM=133
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NULL_VALUE=109
11/04/12 10:31:38 INFO mapred.JobClient:     DISCARDED_EVENTS=1058
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_PUBL=112
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_NO_ASKU=225
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_EMPTY_MAP=182
11/04/12 10:31:38 INFO mapred.JobClient:     ERR_OTHER=45
11/04/12 10:31:38 INFO mapred.JobClient:   Map-Reduce Framework
11/04/12 10:31:38 INFO mapred.JobClient:     Combine output records=0
11/04/12 10:31:38 INFO mapred.JobClient:     Map input records=1281
11/04/12 10:31:38 INFO mapred.JobClient:     Spilled Records=205
11/04/12 10:31:38 INFO mapred.JobClient:     Map output bytes=41281
11/04/12 10:31:38 INFO mapred.JobClient:     Map input bytes=468793
11/04/12 10:31:38 INFO mapred.JobClient:     Combine input records=0
11/04/12 10:31:38 INFO mapred.JobClient:     Map output records=205
11/04/12 10:31:38 INFO mapred.JobClient:     SPLIT_RAW_BYTES=889
11/04/12 10:31:38 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
===========================================================================================================================
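
As an illustration only, and not a diagnosis of this bug: the trace shows the
exception raised in Symbol$Alternative.getSymbol, reached from
ResolvingDecoder.readIndex, that is, while a union branch index is being read.
Avro's binary encoding writes that index as a zigzag variable-length integer,
so a reader that starts even one byte away from the true position can decode
an index that no branch of the union has.  The sketch below uses only the JDK;
the class UnionIndexSketch, its methods, and the three-branch union are
hypothetical and are not Avro APIs.

```java
final class UnionIndexSketch {
    private UnionIndexSketch() {}

    /** Decodes one zigzag-encoded variable-length int starting at offset. */
    static int readZigZagInt(byte[] buf, int offset) {
        int raw = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            int b = buf[pos++] & 0xff;
            raw |= (b & 0x7f) << shift;      // low 7 bits carry payload
            if ((b & 0x80) == 0) {
                break;                       // high bit clear: last byte
            }
            shift += 7;
            if (shift > 28) {
                throw new IllegalArgumentException("varint too long for an int");
            }
        }
        return (raw >>> 1) ^ -(raw & 1);     // undo zigzag mapping
    }

    /** Picks a union branch, checking the decoded index before using it. */
    static String selectBranch(String[] branches, byte[] buf, int offset) {
        int index = readZigZagInt(buf, offset);
        if (index < 0 || index >= branches.length) {
            throw new IllegalStateException("union index " + index
                    + " out of range for " + branches.length + " branches");
        }
        return branches[index];
    }

    public static void main(String[] args) {
        // Hypothetical datum: union index 1 ("string"), then the string "abc"
        // written as its length (3, zigzag byte 0x06) followed by its bytes.
        byte[] data = {0x02, 0x06, 0x61, 0x62, 0x63};
        String[] branches = {"null", "string", "long"};

        System.out.println(selectBranch(branches, data, 0));
        try {
            // Reading one byte too late treats the length byte as the index.
            selectBranch(branches, data, 1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With these hypothetical bytes, reading at offset 0 yields index 1, while
reading at offset 1 decodes the string-length byte 0x06 as index 3, which a
three-branch union does not have.  Whether something similar happens in the
failing reducers is an open question; the sketch only shows how such an index
can arise.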
  

> map reduce job for avro 1.5 generates ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
>                 Key: AVRO-792
>                 URL: https://issues.apache.org/jira/browse/AVRO-792
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Mac with VMWare running Linux training-vm-Ubuntu
>            Reporter: ey-chih chow
>            Assignee: Thiruvalluvan M. G.
>            Priority: Blocker
>             Fix For: 1.5.1
>
>         Attachments: AVRO-792-2.patch, AVRO-792-3.patch, AVRO-792.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> We have an Avro map/reduce job that worked with Avro 1.4 but is broken 
> with Avro 1.5.  The M/R job with Avro 1.5 worked fine in our debugging 
> environment, but broke when we moved to a real cluster.  In one instance of 
> testing, the job had 23 reducers.  Four of them succeeded and the rest failed 
> because of an ArrayIndexOutOfBoundsException.  Here are two 
> instances of the stack traces:
> =================================================================================
> java.lang.ArrayIndexOutOfBoundsException: -1576799025
>       at 
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>       at 
> org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>       at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>       at 
> org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>       at 
> org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:232)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>       at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>       at 
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
>       at 
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
>       at 
> org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
>       at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
>       at 
> org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
>       at 
> com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:46)
>       at 
> com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
>       at 
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
>       at 
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>       at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> java.lang.ArrayIndexOutOfBoundsException: 40
>       at 
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>       at 
> org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>       at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>       at 
> org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>       at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>       at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>       at 
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
>       at 
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
>       at 
> org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
>       at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
>       at 
> org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
>       at 
> com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:74)
>       at 
> com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:1)
>       at 
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
>       at 
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>       at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> The signature of our map() is:
> public void map(Utf8 input, AvroCollector<Pair<Utf8, GenericRecord>> 
> collector, Reporter reporter) throws IOException;
> and reduce() is:
> public void reduce(Utf8 key, Iterable<GenericRecord> values, 
> AvroCollector<GenericRecord> collector, Reporter reporter) throws IOException;
> All the GenericRecords are of the same schema.
> There are many changes in the area of serialization/deserialization between 
> Avro 1.4 and 1.5, but we could not figure out why the exceptions were generated. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
