Thank you very much! I was confused because Pig appears to accept parameters passed to DEFINEd functions. If that does not work, trying to pass them should be a syntax error. Maybe the parser could throw an exception?
Thanks again!
Johannes

On 23.08.2012 21:02, Cheolsoo Park wrote:
> Actually, I found it in the Pig manual:
>
>> If you need to use different constructor parameters for different calls to
>> the function you will need to create multiple defines – one for each
>> parameter set.
>
> For example, this works:
>
>> DEFINE AvroStorageNoParam org.apache.pig.piggybank.storage.avro.AvroStorage();
>> DEFINE AvroStorageWithParam org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"type" : "map","values" : "string"}');
>> loaded_data = LOAD 'map.avro' USING AvroStorageNoParam;
>> describe loaded_data;
>> STORE loaded_data INTO 'output' USING AvroStorageWithParam;
>
> Please see the usage section:
> http://pig.apache.org/docs/r0.10.0/basic.html#define-udfs
>
> Thanks,
> Cheolsoo
>
> On Thu, Aug 23, 2012 at 11:11 AM, Cheolsoo Park <cheol...@cloudera.com> wrote:
>
>> Hi Johannes,
>>
>> I was able to reproduce your error with the following Avro schema:
>>
>>> {
>>>   "type" : "map",
>>>   "values" : "string"
>>> }
>>
>> The issue is not in AvroStorage but in the DEFINE statement.
>>
>> DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();
>>
>> AvroStorage has two constructors: one with no parameters and one with
>> parameters. To define the output Avro schema, the second one must be used.
>> But your DEFINE statement causes the first constructor to always be used,
>> so the output Avro schema is never set. If you remove the DEFINE statement
>> and use the fully qualified name of AvroStorage, everything works. For
>> example,
>>
>>> loaded_data = LOAD 'map.avro' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
>>> describe loaded_data;
>>> STORE loaded_data INTO 'output' USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"type" : "map", "values" : "string"}');
>>
>> Now the question is why DEFINE does not work here.
>>
>> Thanks,
>> Cheolsoo
>>
>> On Thu, Aug 23, 2012 at 8:49 AM, Johannes Schwenk <johannes.schw...@adition.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to execute the following Pig script with pig-0.10.0 and YARN
>>> (cdh4.0.0):
>>>
>>> --
>>> DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();
>>> loaded_data = LOAD '$input' USING AvroStorage();
>>> STORE loaded_data INTO '$output' USING AvroStorage('same', '$input');
>>> --
>>>
>>> I invoke Pig this way:
>>>
>>> pig -Dpig.additional.jars=lib/piggybank.jar:lib/json-simple-1.1.jar:lib/avro-1.5.3.jar -file script.pig -param input=input.avro -param output=output.avro
>>>
>>> The input.avro has the following schema:
>>>
>>> http://pastebin.com/ZWU6qLWx
>>>
>>> I always get
>>>
>>> <file script.pig, line 3, column 0> Output Location Validation Failed for: 'xxx/output.avro' More info to follow:
>>> Please provide schema for Map field!
>>> Details at logfile: xxx/pig_1345735999390.log
>>>
>>> Log excerpt:
>>>
>>> Please provide schema for Map field!
>>>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
>>>     at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
>>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>>     at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>>>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
>>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:293)
>>>     at org.apache.pig.PigServer.compilePp(PigServer.java:1316)
>>>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1253)
>>>     at org.apache.pig.PigServer.execute(PigServer.java:1245)
>>>     at org.apache.pig.PigServer.executeBatch(PigServer.java:362)
>>>     at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:132)
>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>>>     at org.apache.pig.Main.run(Main.java:430)
>>>     at org.apache.pig.Main.main(Main.java:111)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>>> Caused by: java.io.IOException: Please provide schema for Map field!
>>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:110)
>>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convertRecord(PigSchema2Avro.java:151)
>>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:62)
>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.checkSchema(AvroStorage.java:534)
>>>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
>>>     ... 22 more
>>>
>>> I also tried to specify
>>>
>>> AvroStorage('{"debug": 5, "schema_file": "schema.avsc", "field22", "def:pd", "field23", "def:epd"}')
>>>
>>> with the same result.
>>>
>>> Do you have any hints?
>>>
>>> Greetings,
>>> Johannes Schwenk
>>>
>>> --
>>> Software Developer (Reporting)
>>> ________________________________________________________
>>>
>>> ADITION technologies AG
>>> Schwarzwaldstraße 78b
>>> 79117 Freiburg
>>>
>>> http://www.adition.com
>>>
>>> T +49 / (0)761 / 88147 - 30
>>> F +49 / (0)761 / 88147 - 77
>>> SUPPORT +49 / (0)1805 - ADITION
>>>
>>> (Landline rate 14 ct/min; mobile rates up to 42 ct/min)
>>>
>>> Registered at the Amtsgericht Düsseldorf (district court) under HRB 54076
>>> Executive board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>>> Chairman of the supervisory board: Daniel Raimer, attorney at law
>>> VAT ID no.: DE 218 858 434