I think the expected behavior of AvroStorage is to write the tuple fields in the order they appear in the tuple, without looking at their names. So to fix your problem, swap the order of b and nonsense_name in your Avro schema.
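If you would rather leave the schema alone, the same thing can probably be done on the Pig side by reordering the tuple before the STORE. This is only a minimal, untested sketch: it assumes the jars are registered exactly as in your script, and the relation name `reordered` is made up for illustration.

```pig
-- Assumption: the same REGISTER statements as in the quoted script come first.
inputs = LOAD 'input.txt' AS (a: chararray, b: chararray);

-- Put the fields in the same positions as the Avro schema fields:
-- position 0 -> "b", position 1 -> "nonsense_name".
-- The AS alias is only for readability; the position is what should matter.
reordered = FOREACH inputs GENERATE b, a AS nonsense_name;

STORE reordered INTO 'output'
USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
"schema":
{
"type" : "record",
"name" : "my_schema",
"namespace" : "com.my_namespace",
"fields" : [
{ "name" : "b", "type" : "string" },
{ "name" : "nonsense_name", "type" : "string" }
]
}
}');
```

If AvroStorage really does map by position, b should then receive the data_b values and nonsense_name the data_a values.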
Otherwise I can't see a way to map from b to nonsense_name at all. Pig can't know how to do that without referencing tuple field order.

On Sat, Nov 16, 2013 at 7:42 PM, Ruslan Al-Fakikh <metarus...@gmail.com> wrote:

> including this last message to pig user list
>
>
> On Sun, Nov 17, 2013 at 7:40 AM, Ruslan Al-Fakikh <metarus...@gmail.com> wrote:
>
>> Russel,
>>
>> Actually this problem came from the situation when I had the same names
>> in pig relation schema and avro schema. And it turned out that AvroStorage
>> switches fields if the order is different.
>> So, my impression is that it should work this way:
>> 1) names correspond - then AvroStorage uses them
>> 2) names do not correspond - then AvroStorage fails to store or does some
>> schema resolution as shown here:
>> http://avro.apache.org/docs/1.7.5/spec.html#Schema+Resolution
>>
>> Thanks
>>
>>
>> On Sun, Nov 17, 2013 at 7:17 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>
>>> How can pig map from a to nonsence_name?
>>>
>>>
>>> On Saturday, November 16, 2013, Ruslan Al-Fakikh wrote:
>>>
>>>> Thanks, Russel!
>>>>
>>>> Do you mean that this is the expected behavior? Shouldn't AvroStorage
>>>> map the pig fields by their names (not their field order) matching them to
>>>> the names in the avro schema?
>>>>
>>>> Thanks,
>>>> Ruslan Al-Fakikh
>>>>
>>>>
>>>> On Sun, Nov 17, 2013 at 6:53 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>
>>>>> Pig tuples have field order. Swap the order of the fields in your avro
>>>>> schema and try again.
>>>>>
>>>>> On Nov 16, 2013, at 6:19 PM, Ruslan Al-Fakikh <metarus...@gmail.com> wrote:
>>>>>
>>>>> Hey guys,
>>>>>
>>>>> When I store with AvroStorage, the names from Pig tuple fields are
>>>>> completely ignored. The field values are put to the result file only by
>>>>> their position.
>>>>> Here is a simplified test case:
>>>>>
>>>>> %declare WORKDIR `pwd`
>>>>> REGISTER ../../../../lib/external/avro-1.7.4.jar
>>>>> REGISTER ../../../../lib/external/json-simple-1.1.jar
>>>>> --this is build (manually with Maven) from the latest source:
>>>>> -- http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/
>>>>> REGISTER ../piggybankBuiltFromSource.jar
>>>>> REGISTER ../../../../lib/external/jackson-core-asl-1.8.8.jar
>>>>> REGISTER ../../../../lib/external/jackson-mapper-asl-1.8.8.jar
>>>>>
>>>>> --$ cat input.txt
>>>>> --data_a data_b
>>>>> --data_a data_b
>>>>> inputs = LOAD 'input.txt' AS (a: chararray, b: chararray);
>>>>>
>>>>> DESCRIBE inputs;
>>>>> DUMP inputs;
>>>>>
>>>>> --output:
>>>>> --inputs: {a: chararray,b: chararray}
>>>>> --(data_a,data_b)
>>>>> --(data_a,data_b)
>>>>>
>>>>> STORE inputs INTO 'output'
>>>>> USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
>>>>>   "schema":
>>>>>   {
>>>>>     "type" : "record",
>>>>>     "name" : "my_schema",
>>>>>     "namespace" : "com.my_namespace",
>>>>>     "fields" : [
>>>>>       {
>>>>>         "name" : "b",
>>>>>         "type" : "string"
>>>>>       },
>>>>>       {
>>>>>         "name" : "nonsense_name",
>>>>>         "type" : "string"
>>>>>       }
>>>>>     ]
>>>>>   }
>>>>> }');
>>>>>
>>>>> --output
>>>>> --$ java -jar ../../../../lib/external/avro-tools-1.7.4.jar tojson output/part*
>>>>> --{"b":"data_a","nonsense_name":"data_b"}
>>>>> --{"b":"data_a","nonsense_name":"data_b"}
>>>>>
>>>>> AvroStorage is build from the latest piggybank code.
>>>>> Using AvroStorage "debug": 5 parameter didn't help.
>>>>>
>>>>> $ pig -version
>>>>> Apache Pig version 0.11.0-cdh4.3.0 (rexported)
>>>>> compiled May 27 2013, 20:48:21
>>>>>
>>>>> Any help would be appreciated.
>>>>>
>>>>> Thanks,
>>>>> Ruslan Al-Fakikh
>>>>>
>>>
>>> --
>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
>>>

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com