[ https://issues.apache.org/jira/browse/PIG-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai reassigned PIG-5108:
-------------------------------

    Assignee: Daniel Dai

> AvroStorage on Tez with exception on nested records
> ---------------------------------------------------
>
>                 Key: PIG-5108
>                 URL: https://issues.apache.org/jira/browse/PIG-5108
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.16.0
>         Environment: HadoopVersion: 2.6.0-cdh5.8.0
>                      PigVersion: 0.16.0
>                      TezVersion: 0.7.0
>            Reporter: Sebastian Geller
>            Assignee: Daniel Dai
>             Fix For: 0.17.0, 0.16.1
>
>         Attachments: person-prop.avro
>
>
> Hi,
> While migrating to the latest Pig version we have seen a general issue when
> using nested Avro records on Tez:
> {code}
> Caused by: java.io.IOException: class org.apache.pig.impl.util.avro.AvroTupleWrapper.write called, but not implemented yet
>     at org.apache.pig.impl.util.avro.AvroTupleWrapper.write(AvroTupleWrapper.java:68)
>     at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:139)
>     ...
> {code}
> The setup is
> schema
> {code}
> {
>     "fields": [
>         {
>             "name": "id",
>             "type": "int"
>         },
>         {
>             "name": "property",
>             "type": {
>                 "fields": [
>                     {
>                         "name": "id",
>                         "type": "int"
>                     }
>                 ],
>                 "name": "Property",
>                 "type": "record"
>             }
>         }
>     ],
>     "name": "Person",
>     "namespace": "com.github.ouyi.avro",
>     "type": "record"
> }
> {code}
> Pig script group_person.pig
> {code}
> loaded_person =
>     LOAD '$input'
>     USING AvroStorage();
> grouped_records =
>     GROUP
>         loaded_person BY (property.id);
> STORE grouped_records
>     INTO '$output'
>     USING AvroStorage();
> {code}
> sample data
> {code}
> {"id":1,"property":{"id":1}}
> {code}
> Execution on Tez
> {code}
> pig -x tez_local -p input=file:///usr/lib/pig/pig-0.16.0/person-prop.avro -p output=file:///output group_person.pig
> ...
> Caused by: java.io.IOException: class org.apache.pig.impl.util.avro.AvroTupleWrapper.write called, but not implemented yet
>     at org.apache.pig.impl.util.avro.AvroTupleWrapper.write(AvroTupleWrapper.java:68)
>     at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:139)
>     ...
> {code}
> Execution on mapred
> {code}
> pig -x local -p input=file:///usr/lib/pig/pig-0.16.0/person-prop.avro -p output=file:///output7 group_person.pig
> ...
> Output(s):
> Successfully stored 1 records in: "file:///output7"
> ...
> {code}
> I am going to attach the complete log files of both runs.
> I assume that the Pig script should work regardless of Tez or mapreduce? Is
> there any underlying change when migrating to Tez which makes the schema
> invalid?
> Thanks,
> Sebastian

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)