[ https://issues.apache.org/jira/browse/AVRO-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019552#comment-13019552 ]
ey-chih chow commented on AVRO-792:
-----------------------------------

Thanks. I tested the third patch in our environment. Unfortunately, it did not fix the problem. What follows is the trace from our VM.

===============================================================================================================================
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:18:14 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:18:14 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:18:15 INFO mapred.JobClient: Running job: job_201104081805_0001
11/04/12 10:18:16 INFO mapred.JobClient: map 0% reduce 0%
11/04/12 10:18:28 INFO mapred.JobClient: map 20% reduce 0%
11/04/12 10:18:29 INFO mapred.JobClient: map 40% reduce 0%
11/04/12 10:18:35 INFO mapred.JobClient: map 80% reduce 0%
11/04/12 10:18:39 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:18:43 INFO mapred.JobClient: map 100% reduce 26%
11/04/12 10:18:46 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_0, Status : FAILED
11/04/12 10:18:47 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:18:57 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
    at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
    at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
    at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:19:05 INFO mapred.JobClient: map 100% reduce 26%
11/04/12 10:19:08 INFO mapred.JobClient: Task Id : attempt_201104081805_0001_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
    at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
    at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
    at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:19:10 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:19:22 INFO mapred.JobClient: Job complete: job_201104081805_0001
11/04/12 10:19:22 INFO mapred.JobClient: Counters: 31
11/04/12 10:19:22 INFO mapred.JobClient: com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_EVENT=249
11/04/12 10:19:22 INFO mapred.JobClient: REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient: PC_REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient: Job Counters
11/04/12 10:19:22 INFO mapred.JobClient: Launched reduce tasks=4
11/04/12 10:19:22 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=34290
11/04/12 10:19:22 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient: Launched map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient: Data-local map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient: Failed reduce tasks=1
11/04/12 10:19:22 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=53407
11/04/12 10:19:22 INFO mapred.JobClient: com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_SERVER=222
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_CLIENT=28
11/04/12 10:19:22 INFO mapred.JobClient: FileSystemCounters
11/04/12 10:19:22 INFO mapred.JobClient: HDFS_BYTES_READ=472855
11/04/12 10:19:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1164803
11/04/12 10:19:22 INFO mapred.JobClient: com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_AFAM=133
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NULL_VALUE=109
11/04/12 10:19:22 INFO mapred.JobClient: DISCARDED_EVENTS=1058
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_PUBL=112
11/04/12 10:19:22 INFO mapred.JobClient: ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_ASKU=225
11/04/12 10:19:22 INFO mapred.JobClient: ERR_EMPTY_MAP=182
11/04/12 10:19:22 INFO mapred.JobClient: ERR_OTHER=45
11/04/12 10:19:22 INFO mapred.JobClient: Map-Reduce Framework
11/04/12 10:19:22 INFO mapred.JobClient: Combine output records=0
11/04/12 10:19:22 INFO mapred.JobClient: Map input records=1281
11/04/12 10:19:22 INFO mapred.JobClient: Spilled Records=205
11/04/12 10:19:22 INFO mapred.JobClient: Map output bytes=41281
11/04/12 10:19:22 INFO mapred.JobClient: Map input bytes=468793
11/04/12 10:19:22 INFO mapred.JobClient: Combine input records=0
11/04/12 10:19:22 INFO mapred.JobClient: Map output records=205
11/04/12 10:19:22 INFO mapred.JobClient: SPLIT_RAW_BYTES=889
11/04/12 10:19:22 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200 etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:30:33 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:30:34 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:30:34 INFO mapred.JobClient: Running job: job_201104081805_0002
11/04/12 10:30:35 INFO mapred.JobClient: map 0% reduce 0%
11/04/12 10:30:44 INFO mapred.JobClient: map 40% reduce 0%
11/04/12 10:30:51 INFO mapred.JobClient: map 60% reduce 0%
11/04/12 10:30:52 INFO mapred.JobClient: map 80% reduce 0%
11/04/12 10:30:55 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:00 INFO mapred.JobClient: map 100% reduce 33%
11/04/12 10:31:03 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
    at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
    at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
    at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:04 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:11 INFO mapred.JobClient: map 100% reduce 33%
11/04/12 10:31:14 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
    at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
    at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
    at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:16 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:26 INFO mapred.JobClient: Task Id : attempt_201104081805_0002_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
    at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
    at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
    at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
    at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
    at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
    at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:34 INFO mapred.JobClient: map 100% reduce 13%
11/04/12 10:31:38 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:38 INFO mapred.JobClient: Job complete: job_201104081805_0002
11/04/12 10:31:38 INFO mapred.JobClient: Counters: 31
11/04/12 10:31:38 INFO mapred.JobClient: com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_EVENT=249
11/04/12 10:31:38 INFO mapred.JobClient: REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient: PC_REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient: Job Counters
11/04/12 10:31:38 INFO mapred.JobClient: Launched reduce tasks=4
11/04/12 10:31:38 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=32438
11/04/12 10:31:38 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient: Launched map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient: Data-local map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient: Failed reduce tasks=1
11/04/12 10:31:38 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=52425
11/04/12 10:31:38 INFO mapred.JobClient: com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_SERVER=222
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_CLIENT=28
11/04/12 10:31:38 INFO mapred.JobClient: FileSystemCounters
11/04/12 10:31:38 INFO mapred.JobClient: HDFS_BYTES_READ=472855
11/04/12 10:31:38 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1164803
11/04/12 10:31:38 INFO mapred.JobClient: com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_AFAM=133
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NULL_VALUE=109
11/04/12 10:31:38 INFO mapred.JobClient: DISCARDED_EVENTS=1058
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_PUBL=112
11/04/12 10:31:38 INFO mapred.JobClient: ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_ASKU=225
11/04/12 10:31:38 INFO mapred.JobClient: ERR_EMPTY_MAP=182
11/04/12 10:31:38 INFO mapred.JobClient: ERR_OTHER=45
11/04/12 10:31:38 INFO mapred.JobClient: Map-Reduce Framework
11/04/12 10:31:38 INFO mapred.JobClient: Combine output records=0
11/04/12 10:31:38 INFO mapred.JobClient: Map input records=1281
11/04/12 10:31:38 INFO mapred.JobClient: Spilled Records=205
11/04/12 10:31:38 INFO mapred.JobClient: Map output bytes=41281
11/04/12 10:31:38 INFO mapred.JobClient: Map input bytes=468793
11/04/12 10:31:38 INFO mapred.JobClient: Combine input records=0
11/04/12 10:31:38 INFO mapred.JobClient: Map output records=205
11/04/12 10:31:38 INFO mapred.JobClient: SPLIT_RAW_BYTES=889
11/04/12 10:31:38 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
===========================================================================================================================

> map reduce job for avro 1.5 generates ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
>                 Key: AVRO-792
>                 URL: https://issues.apache.org/jira/browse/AVRO-792
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>         Environment: Mac with VMWare running Linux training-vm-Ubuntu
>            Reporter: ey-chih chow
>            Assignee: Thiruvalluvan M. G.
>            Priority: Blocker
>             Fix For: 1.5.1
>
>         Attachments: AVRO-792-2.patch, AVRO-792-3.patch, AVRO-792.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> We have an Avro map/reduce job that worked with Avro 1.4 but is broken
> with Avro 1.5. The M/R job with Avro 1.5 worked fine in our debugging
> environment, but broke when we moved to a real cluster. In one test run,
> the job had 23 reducers. Four of them succeeded and the rest failed
> because of an ArrayIndexOutOfBoundsException. Here are two
> instances of the stack traces:
> =================================================================================
> java.lang.ArrayIndexOutOfBoundsException: -1576799025
>     at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>     at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>     at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>     at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:232)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>     at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
>     at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
>     at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
>     at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:46)
>     at com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
>     at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
>     at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>     at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> java.lang.ArrayIndexOutOfBoundsException: 40
>     at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>     at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>     at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>     at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
>     at org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
>     at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
>     at org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
>     at com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:74)
>     at com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:1)
>     at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
>     at org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>     at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> The signature of our map() is:
>     public void map(Utf8 input, AvroCollector<Pair<Utf8, GenericRecord>> collector, Reporter reporter) throws IOException;
> and reduce() is:
>     public void reduce(Utf8 key, Iterable<GenericRecord> values, AvroCollector<GenericRecord> collector, Reporter reporter) throws IOException;
> All the GenericRecords are of the same schema.
> There are many changes in the area of serialization/deserialization between
> Avro 1.4 and 1.5, but we could not figure out why the exceptions were generated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
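
A note for readers interpreting the exception values in the traces (3, 40, -1576799025). Avro's binary encoding writes a union as a zigzag varint branch index followed by the value, and the top frames (ResolvingDecoder.readIndex, then Symbol$Alternative.getSymbol) suggest that this index is what falls outside the array. The sketch below is plain Java, not Avro's code: the class UnionIndexSketch, its methods, and the three-branch union are hypothetical. It only illustrates how decoding an index from the wrong byte offset can yield arbitrary, even negative, values. Whether a misaligned stream is the actual cause of this issue is an assumption; nothing in this thread establishes it.

```java
// Hypothetical illustration only; these are not Avro classes.
final class UnionIndexSketch {

    // Decode one zigzag-encoded varint int at pos[0], advancing pos[0].
    static int readInt(byte[] buf, int[] pos) {
        int raw = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = buf[pos[0]++] & 0xff;
            raw |= (b & 0x7f) << shift;   // low 7 bits carry data
            if ((b & 0x80) == 0) {        // high bit clear: last byte
                break;
            }
        }
        return (raw >>> 1) ^ -(raw & 1);  // undo zigzag
    }

    // Read a union branch index and select the branch, unchecked,
    // so an out-of-range index surfaces as ArrayIndexOutOfBoundsException.
    static String pickBranch(String[] branches, byte[] buf, int[] pos) {
        int index = readInt(buf, pos);
        return branches[index];
    }

    public static void main(String[] args) {
        String[] branches = {"null", "string", "long"};
        // Branch index 1 (zigzag 0x02), then the string "hi" (length 2 -> 0x04).
        byte[] buf = {0x02, 0x04, 'h', 'i'};

        // Aligned read: index 1.
        System.out.println(pickBranch(branches, buf, new int[] {0}));

        // Misaligned read starting at the byte 'h' (104): index 52.
        try {
            pickBranch(branches, buf, new int[] {2});
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

In this sketch, an aligned read selects "string", while starting two bytes late decodes payload data as index 52 and throws. An odd raw value would decode to a negative index in the same way.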