Re: joins in map reduce

Jason Venner Mon, 30 Jun 2008 09:56:16 -0700

I have just started to try using the Join operators.

The join I am trying is this:

join is outer(tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,"Input1"),tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,"IndexedTry1"))


but I get an error:

08/06/30 08:55:13 INFO mapred.FileInputFormat: Total input paths to process : 10
Exception in thread "main" java.io.IOException: No input paths specified in input
   at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:115)
   at org.apache.hadoop.mapred.join.Parser$WNode.getSplits(Parser.java:304)
   at org.apache.hadoop.mapred.join.Parser$CNode.getSplits(Parser.java:375)
   at org.apache.hadoop.mapred.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:131)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:544)

I am clearly missing something basic...

       conf.setInputFormat(CompositeInputFormat.class);
       conf.setOutputPath( outputDirectory );
       conf.setOutputKeyClass(Text.class);
       conf.setOutputValueClass(Text.class);
       conf.setOutputFormat(MapFileOutputFormat.class);
       conf.setMapperClass( LeftHandJoinMapper.class );
       conf.setReducerClass( IdentityReducer.class );
       conf.setNumReduceTasks(0);

       System.err.println( "join is " + CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, allTables ) );
       conf.set("mapred.join.expr", CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, allTables ));
       JobClient client = new JobClient();
       client.setConf( conf );

       RunningJob job = JobClient.runJob( conf );
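
For anyone following along, here is a rough sketch of what I understand an "outer" join over two inputs to produce. It is plain Java with no Hadoop classes, so it is not how CompositeInputFormat is implemented; the class name OuterMergeJoinSketch and the method outerJoin are made up for illustration. It assumes each input is sorted by key and that keys are unique within an input. The join package expects its inputs to be sorted and partitioned the same way, which is why a single merge pass is enough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: hypothetical class, not part of Hadoop.
class OuterMergeJoinSketch {

    // Full outer join of two key-sorted tables in one merge pass.
    // A key missing from one side is reported with "null" for that side.
    // Assumes keys are unique within each table (TreeMap guarantees this).
    static List<String> outerJoin(TreeMap<String, String> left,
                                  TreeMap<String, String> right) {
        List<Map.Entry<String, String>> l = new ArrayList<>(left.entrySet());
        List<Map.Entry<String, String>> r = new ArrayList<>(right.entrySet());
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < l.size() || j < r.size()) {
            int cmp;
            if (i >= l.size()) {
                cmp = 1;            // left exhausted: only right rows remain
            } else if (j >= r.size()) {
                cmp = -1;           // right exhausted: only left rows remain
            } else {
                cmp = l.get(i).getKey().compareTo(r.get(j).getKey());
            }
            if (cmp == 0) {         // key present on both sides
                out.add(l.get(i).getKey() + "=[" + l.get(i).getValue()
                        + "," + r.get(j).getValue() + "]");
                i++;
                j++;
            } else if (cmp < 0) {   // key only on the left
                out.add(l.get(i).getKey() + "=[" + l.get(i).getValue() + ",null]");
                i++;
            } else {                // key only on the right
                out.add(r.get(j).getKey() + "=[null," + r.get(j).getValue() + "]");
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> left = new TreeMap<>();
        left.put("a", "1");
        left.put("b", "2");
        left.put("d", "4");
        TreeMap<String, String> right = new TreeMap<>();
        right.put("b", "20");
        right.put("c", "30");
        for (String row : outerJoin(left, right)) {
            System.out.println(row);
        }
    }
}
```

With the sample data in main, every key from either side appears exactly once, and a side that lacks the key shows up as null. An inner join would keep only the rows where both sides are present.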



Shirley Cohen wrote:

Hi,
How does one do a join operation in map reduce? Is there more than one way to do a join? Which way works better and why?
Thanks,

Shirley

--
Jason Venner
Attributor - Program the Web <http://www.attributor.com/>

Attributor is hiring Hadoop Wranglers and coding wizards, contact if interested
