Patch posted to https://issues.apache.org/jira/browse/PIG-1748 along with the script that showed the problem.
On Tue, Dec 6, 2011 at 11:33 PM, Dmitriy Ryaboy <[email protected]> wrote:

> Yea please post to pig Jira, preferably with an example of how to
> reproduce the error (better yet a test that demonstrates the fix)
>
> On Dec 6, 2011, at 11:25 PM, Russell Jurney <[email protected]> wrote:
>
> > I fixed the bug, in AvroStorageUtils.java:
> >
> >     /** check whether it is just a wrapped tuple */
> >     public static boolean isTupleWrapper(ResourceFieldSchema pigSchema) {
> >         System.err.println("is a wrapped tuple!");
> >         Boolean status = false;
> >         if (pigSchema.getType() == DataType.TUPLE)
> >             if (pigSchema.getName() != null)
> >                 if (pigSchema.getName().equals(AvroStorageUtils.PIG_TUPLE_WRAPPER))
> >                     status = true;
> >         return status;
> >     }
> >
> > The script now works. Will make a patch. Should I make a ticket?
> >
> > On Tue, Dec 6, 2011 at 5:36 PM, Dmitriy Ryaboy <[email protected]> wrote:
> >
> >> If you send a pull to wilbur, I can merge it. But we are also still
> >> supporting piggybank as wilbur never really got off the ground...
> >>
> >> D
> >>
> >> On Tue, Dec 6, 2011 at 3:47 PM, Russell Jurney <[email protected]> wrote:
> >>
> >>> I'm debugging the AvroStorage UDF in piggybank for this blog post:
> >>>
> >>> http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby
> >>>
> >>> The script is:
> >>>
> >>> messages = LOAD '/tmp/messages.avro' USING AvroStorage();
> >>> user_groups = GROUP messages by user_id;
> >>> per_user = FOREACH user_groups {
> >>>     sorted = ORDER messages BY message_id DESC;
> >>>     GENERATE group AS user_id, sorted AS messages;
> >>> }
> >>> DESCRIBE per_user
> >>>> per_user: {user_id: int,messages: {(message_id: int,topic: chararray,user_id: int)}}
> >>> STORE per_user INTO '/tmp/per_user.avro' USING AvroStorage();
> >>>
> >>> The error is:
> >>>
> >>> Pig Stack Trace
> >>> ---------------
> >>> ERROR 1002: Unable to store alias per_user
> >>>
> >>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias per_user
> >>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1596)
> >>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
> >>>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
> >>>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
> >>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
> >>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
> >>>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
> >>>     at org.apache.pig.Main.run(Main.java:487)
> >>>     at org.apache.pig.Main.main(Main.java:108)
> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >>> Caused by: java.lang.NullPointerException
> >>>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.isTupleWrapper(AvroStorageUtils.java:327)
> >>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:82)
> >>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:105)
> >>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convertRecord(PigSchema2Avro.java:151)
> >>>     at org.apache.pig.piggybank.storage.avro.PigSchema2Avro.convert(PigSchema2Avro.java:62)
> >>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.checkSchema(AvroStorage.java:502)
> >>>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:65)
> >>>     at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
> >>>     at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
> >>>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> >>>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
> >>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:292)
> >>>     at org.apache.pig.PigServer.compilePp(PigServer.java:1360)
> >>>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1297)
> >>>     at org.apache.pig.PigServer.execute(PigServer.java:1286)
> >>>     at org.apache.pig.PigServer.access$400(PigServer.java:125)
> >>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1591)
> >>>     ... 13 more
> >>>
> >>> I need to fix this. Which means I need to commit a patch to get in the
> >>> current piggybank? I've got some time... is it worthwhile to resurrect
> >>> wilbur on github and move piggybank over?
> >>>
> >>> --
> >>> Russell Jurney
> >>> twitter.com/rjurney
> >>> [email protected]
> >>> datasyndrome.com
> >
> > --
> > Russell Jurney
> > twitter.com/rjurney
> > [email protected]
> > datasyndrome.com

--
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com
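For readers following the thread: the NullPointerException comes from calling .equals() on pigSchema.getName() when the schema has no name, which happens for the anonymous tuple produced by the nested ORDER ... GENERATE in the script above. The null guard added in the patch can be sketched in isolation. This is a minimal sketch, not the piggybank code: FieldSchema and PIG_TUPLE_WRAPPER below are hypothetical stand-ins for Pig's ResourceFieldSchema and AvroStorageUtils.PIG_TUPLE_WRAPPER.

```java
// Minimal sketch of the NPE and its guard. FieldSchema and
// PIG_TUPLE_WRAPPER are stand-ins, not the real Pig classes.
public class TupleWrapperCheck {

    // Placeholder value; the real constant lives in AvroStorageUtils.
    static final String PIG_TUPLE_WRAPPER = "PIG_WRAPPER";

    static class FieldSchema {
        final String name;     // null for anonymous tuples, e.g. the
                               // output of a nested ORDER inside FOREACH
        final boolean isTuple;

        FieldSchema(String name, boolean isTuple) {
            this.name = name;
            this.isTuple = isTuple;
        }
    }

    // Pre-patch shape: dereferences name without a null check, so an
    // anonymous tuple schema triggers a NullPointerException.
    static boolean isTupleWrapperUnsafe(FieldSchema s) {
        return s.isTuple && s.name.equals(PIG_TUPLE_WRAPPER);
    }

    // Patched shape: test name for null before calling equals().
    static boolean isTupleWrapperSafe(FieldSchema s) {
        return s.isTuple && s.name != null && s.name.equals(PIG_TUPLE_WRAPPER);
    }

    public static void main(String[] args) {
        FieldSchema anonymous = new FieldSchema(null, true);
        System.out.println(isTupleWrapperSafe(anonymous)); // false, no NPE
    }
}
```

An unnamed tuple simply isn't the wrapper, so returning false is the correct behavior rather than propagating the NPE up through the schema conversion in PigSchema2Avro.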
