How can Pig map from a to nonsense_name?

On Saturday, November 16, 2013, Ruslan Al-Fakikh wrote:
> Thanks, Russel!
>
> Do you mean that this is the expected behavior? Shouldn't AvroStorage map
> the pig fields by their names (not their field order) matching them to the
> names in the avro schema?
>
> Thanks,
> Ruslan Al-Fakikh
>
>
> On Sun, Nov 17, 2013 at 6:53 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>
>> Pig tuples have field order. Swap the order of the fields in your avro
>> schema and try again.
>>
>> On Nov 16, 2013, at 6:19 PM, Ruslan Al-Fakikh <metarus...@gmail.com> wrote:
>>
>> Hey guys,
>>
>> When I store with AvroStorage, the names from Pig tuple fields are
>> completely ignored. The field values are put to the result file only by
>> their position.
>> Here is a simplified test case:
>>
>> %declare WORKDIR `pwd`
>> REGISTER ../../../../lib/external/avro-1.7.4.jar
>> REGISTER ../../../../lib/external/json-simple-1.1.jar
>> --this is build (manually with Maven) from the latest source:
>> -- http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/
>> REGISTER ../piggybankBuiltFromSource.jar
>> REGISTER ../../../../lib/external/jackson-core-asl-1.8.8.jar
>> REGISTER ../../../../lib/external/jackson-mapper-asl-1.8.8.jar
>>
>> --$ cat input.txt
>> --data_a data_b
>> --data_a data_b
>> inputs = LOAD 'input.txt' AS (a: chararray, b: chararray);
>>
>> DESCRIBE inputs;
>> DUMP inputs;
>>
>> --output:
>> --inputs: {a: chararray,b: chararray}
>> --(data_a,data_b)
>> --(data_a,data_b)
>>
>> STORE inputs INTO 'output'
>>     USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
>>         "schema":
>>         {
>>             "type" : "record",
>>             "name" : "my_schema",
>>             "namespace" : "com.my_namespace",
>>             "fields" : [
>>                 {
>>                     "name" : "b",
>>                     "type" : "string"
>>                 },
>>                 {
>>                     "name" : "nonsense_name",
>>                     "type" : "string"
>>                 }
>>             ]
>>         }
>>     }');
>>
>> --output
>> --$ java -jar ../../../../lib/external/avro-tools-1.7.4.jar tojson output/part*
>> --{"b":"data_a","nonsense_name":"data_b"}
>> --{"b":"data_a","nonsense_name":"data_b"}
>>
>> AvroStorage is build from the latest piggybank code.
>> Using AvroStorage "debug": 5 parameter didn't help.
>>
>> $ pig -version
>> Apache Pig version 0.11.0-cdh4.3.0 (rexported)
>> compiled May 27 2013, 20:48:21
>>
>> Any help would be appreciated.
>>
>> Thanks,
>> Ruslan Al-Fakikh

-- 
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
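
To make the positional behavior discussed in this thread concrete, here is a small plain-Python model of it. This is only a sketch, not AvroStorage code: `store_by_position` and `project_by_name` are hypothetical helpers invented for illustration. The model assumes the writer pairs the i-th tuple value with the i-th field of the Avro schema and never looks at the Pig field names, which is what the output in the test case suggests.

```python
def store_by_position(row, avro_names):
    """Pair each value with the Avro field at the same index.

    Hypothetical model of the behavior reported in the thread:
    the Pig field names play no part in the mapping.
    """
    if len(row) != len(avro_names):
        raise ValueError("tuple and schema must have the same number of fields")
    return dict(zip(avro_names, row))


def project_by_name(row, pig_names, wanted):
    """Reorder a tuple so its values follow the order given in `wanted`."""
    index = {name: i for i, name in enumerate(pig_names)}
    return tuple(row[index[name]] for name in wanted)


pig_names = ["a", "b"]               # from: LOAD ... AS (a, b)
avro_names = ["b", "nonsense_name"]  # field order in the Avro schema
row = ("data_a", "data_b")

# Storing as-is: value of Pig field `a` ends up under Avro field `b`.
positional = store_by_position(row, avro_names)

# Reordering first, so the value of Pig field `b` sits in position 0.
reordered = project_by_name(row, pig_names, ["b", "a"])
fixed = store_by_position(reordered, avro_names)

print(positional)  # {'b': 'data_a', 'nonsense_name': 'data_b'}
print(fixed)       # {'b': 'data_b', 'nonsense_name': 'data_a'}
```

The first result matches the `tojson` output in the test case. In Pig itself, the reordering step would presumably be a `FOREACH inputs GENERATE b, a;` placed before the `STORE`, as an alternative to swapping the fields in the Avro schema.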