[ https://issues.apache.org/jira/browse/PIG-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-5056:
----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
+1. Patch committed to trunk. Thanks Nandor, Adam!
> Fix AvroStorage writing enums
> -----------------------------
>
> Key: PIG-5056
> URL: https://issues.apache.org/jira/browse/PIG-5056
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.16.0
> Reporter: Adam Szita
> Assignee: Adam Szita
> Labels: avro
> Fix For: 0.17.0
>
> Attachments: PIG-5056.patch
>
>
> The issue is observable with the latest Avro (1.8.1), which adds a check for
> enum types that the currently used 1.7.5 does not perform (see
> https://github.com/apache/avro/blob/release-1.8.1/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumWriter.java#L163).
> As a result, TestAvroStorage#testLoadRecordsWithEnums fails: Pig reads an
> Avro file whose schema contains (string,int,enum), represents it in Pig as
> (chararray,int,chararray), and then writes it back to an Avro file with the
> given schema (string,int,enum). The enum value therefore reaches the Avro
> writer as a plain string, which the new check rejects.
> {code}
> java.lang.Exception: java.io.IOException: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.AvroTypeException: Not an enum: GOOD
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.io.IOException: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.AvroTypeException: Not an enum: GOOD
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.AvroTypeException: Not an enum: GOOD
>     at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
>     at org.apache.pig.impl.util.avro.AvroRecordWriter.write(AvroRecordWriter.java:115)
>     at org.apache.pig.impl.util.avro.AvroRecordWriter.write(AvroRecordWriter.java:51)
>     at org.apache.pig.builtin.AvroStorage.putNext(AvroStorage.java:520)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
>     ... 18 more
> Caused by: org.apache.avro.AvroTypeException: Not an enum: GOOD
>     at org.apache.avro.generic.GenericDatumWriter.writeEnum(GenericDatumWriter.java:164)
>     at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:106)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>     at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
>     at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
>     at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
>     ... 22 more
> {code}
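
For readers who want to reproduce the enum check outside Pig, the following is a minimal sketch, not the content of PIG-5056.patch. It assumes an Avro 1.8.x (or later) jar on the classpath. The record schema, the field names (name, count, quality), the symbol BAD, and the class name EnumWriteSketch are all made up for illustration; only the (string,int,enum) shape and the symbol GOOD come from the issue. It writes the enum field once as a plain String, the way a chararray would arrive, and once wrapped in GenericData.EnumSymbol.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.AvroTypeException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

class EnumWriteSketch {
    // Hypothetical (string,int,enum) schema; names are illustrative only.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"count\",\"type\":\"int\"},"
        + "{\"name\":\"quality\",\"type\":{\"type\":\"enum\","
        + "\"name\":\"Quality\",\"symbols\":[\"GOOD\",\"BAD\"]}}]}");

    // Serializes one record, using enumValue as-is for the enum field.
    static byte[] write(Object enumValue) {
        GenericRecord rec = new GenericData.Record(SCHEMA);
        rec.put("name", "a");
        rec.put("count", 1);
        rec.put("quality", enumValue);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
        try {
            new GenericDatumWriter<GenericRecord>(SCHEMA).write(rec, enc);
            enc.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Wraps a symbol name in the generic enum type that the writer checks for.
    static Object asEnum(String symbol) {
        Schema enumSchema = SCHEMA.getField("quality").schema();
        return new GenericData.EnumSymbol(enumSchema, symbol);
    }

    public static void main(String[] args) {
        try {
            write("GOOD"); // plain String: rejected by Avro 1.8.1, accepted by 1.7.5
            System.out.println("plain String accepted");
        } catch (AvroTypeException e) {
            System.out.println("plain String rejected: " + e.getMessage());
        }
        System.out.println("EnumSymbol bytes: " + write(asEnum("GOOD")).length);
    }
}
```

Wrapping the value in GenericData.EnumSymbol is one way to satisfy the check; whether the committed patch takes this route is not stated in this message.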