Do I file this, or is it a dupe? I saw lots of existing tickets that look similar.
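For quick triage, here is my read of the minimal repro condensed from the thread below. The un-annotated FOREACH is my reconstruction of the failing script (only the working version appears verbatim downthread); relation names like froms/pairs are as quoted:

```pig
-- Fails in MongoStorage.prepareToWrite: the bag's inner tuple is unnamed, so
-- the schema serializes as pairs:{null:(subject:chararray)} and
-- Utils.getSchemaFromString rejects "null" as an identifier (reconstructed).
sent_topics = FOREACH froms GENERATE FLATTEN(group) AS (from, to),
              pairs.subject;

-- Works: naming the inner tuple ("column" here) keeps "null" out of the
-- serialized schema string.
sent_topics = FOREACH froms GENERATE FLATTEN(group) AS (from, to),
              pairs.subject AS pairs:bag {column:tuple (subject:chararray)};

STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING MongoStorage();
```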
On Sun, Feb 5, 2012 at 1:53 PM, Dmitriy Ryaboy <[email protected]> wrote:

> That tuple name has been made optional, but I guess some places still
> assume it exists.
> + jon.
>
> On Sun, Feb 5, 2012 at 1:16 AM, Russell Jurney <[email protected]> wrote:
>
>> This now seems like a bug in Utils.getSchemaFromString.
>>
>> On Sun, Feb 5, 2012 at 1:02 AM, Russell Jurney <[email protected]> wrote:
>>
>>> To answer my own question, this is because the schemas differ. The
>>> schema in the working case has a named tuple via AvroStorage. Storing
>>> to Mongo works when I name the tuple:
>>>
>>> ...
>>> sent_topics = FOREACH froms GENERATE FLATTEN(group) AS (from, to),
>>>     pairs.subject AS pairs:bag {column:tuple (subject:chararray)};
>>>
>>> STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING
>>>     MongoStorage();
>>>
>>> I will stop cross-posting to myself now.
>>>
>>> On Sun, Feb 5, 2012 at 12:47 AM, Russell Jurney <[email protected]> wrote:
>>>
>>>> sent_topics = LOAD '/tmp/pair_titles.avro' USING AvroStorage();
>>>> STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING
>>>>     MongoStorage();
>>>>
>>>> That works. Why is it the case that MongoStorage only works if the
>>>> intermediate processing doesn't happen? Strangeness.
>>>>
>>>> On Sun, Feb 5, 2012 at 12:31 AM, Russell Jurney <[email protected]> wrote:
>>>>
>>>>> MongoStorage is failing for me now, on a script that was working
>>>>> before. Is anyone else using it? The schema is [from:chararray,
>>>>> to:chararray, pairs:{null:(subject:chararray)}], which worked before.
>>>>>
>>>>> 2012-02-05 00:27:54,991 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Store Location Config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, /tmp/hadoop-rjurney/mapred/local/localRunner/job_local_0001.xml For URI: mongodb://localhost/test.pigola
>>>>> 2012-02-05 00:27:54,993 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - OutputFormat... com.mongodb.hadoop.MongoOutputFormat@4eb7cd92
>>>>> 2012-02-05 00:27:55,291 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Preparing to write to com.mongodb.hadoop.output.MongoRecordWriter@333ec758
>>>>> Failed to parse: <line 1, column 35> rule identifier failed predicate:
>>>>> {!input.LT(1).getText().equalsIgnoreCase("NULL")}?
>>>>>     at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:79)
>>>>>     at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:93)
>>>>>     at org.apache.pig.impl.util.Utils.parseSchema(Utils.java:175)
>>>>>     at org.apache.pig.impl.util.Utils.getSchemaFromString(Utils.java:166)
>>>>>     at com.mongodb.hadoop.pig.MongoStorage.prepareToWrite(MongoStorage.java:186)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:125)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:86)
>>>>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>>>>> 2012-02-05 00:27:55,320 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Stored Schema: [from:chararray, to:chararray, pairs:{null:(subject:chararray)}]
>>>>> 2012-02-05 00:27:55,323 [Thread-15] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
>>>>> java.io.IOException: java.lang.NullPointerException
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:464)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:427)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:407)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:261)
>>>>>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>>>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>>>>> Caused by: java.lang.NullPointerException
>>>>>     at com.mongodb.hadoop.pig.MongoStorage.putNext(MongoStorage.java:68)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>>>>>     at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
>>>>>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:462)
>>>>>     ... 7 more

--
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com
