Hi there-
I didn't see this option in the thread yet, and it seems pretty straightforward:
When setting up the job:
conf.setStrings("param", "param1");
Job job = new Job(conf, "my job");
...
And then in the map method:
String paramVal = context.getConfiguration().get("param");
This is using Hadoop 0.20 syntax; the previous version had a 'configure' method
you had to implement.
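For what it's worth, here is a fuller sketch of that approach against the 0.20 API (class
and parameter names are just placeholders, not anything from the thread). One detail worth
noting: the Job constructor takes a copy of the Configuration, so the values have to be set
before the Job is created (or set through job.getConfiguration()):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class ParamPassingExample {

    public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
        private String[] keys;

        @Override
        protected void setup(Context context) {
            // Read the parameters back out of the job configuration.
            keys = context.getConfiguration().getStrings("param");
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... use 'keys' to look up whatever rows you need ...
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Set the parameters *before* constructing the Job; the Job
        // constructor copies the Configuration it is given.
        conf.setStrings("param", "key1", "key2", "key3");

        Job job = new Job(conf, "my job");
        job.setJarByClass(ParamPassingExample.class);
        job.setMapperClass(MyMapper.class);
        // ... input/output format, reducer, output path, etc. ...
        job.waitForCompletion(true);
    }
}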
-----Original Message-----
From: Something Something [mailto:[email protected]]
Sent: Thursday, October 15, 2009 2:31 PM
To: [email protected]; Hadoop
Subject: Re: Question about MapReduce
1) I don't think TableInputFormat is useful in this case. Looks like it's used
for scanning columns from a single HTable.
2) TableMapReduceUtil - same problem. Seems like this works with just one
table.
3) JV recommended NLineInputFormat, but my parameters are not in a file. They
come from multiple files and are in memory.
I guess what I am looking for is something like... InMemoryInputFormat...
similar to FileInputFormat & DBInputFormat. There's no such class right now.
If worst comes to worst, I can write the parameters into a flat file and use
FileInputFormat, but that will slow this process down considerably. Is there
no other way?
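There is no stock InMemoryInputFormat, but one way to approximate it is to serialize the
in-memory parameters into the job Configuration and let a small custom InputFormat turn
each one into its own split, so nothing has to touch the filesystem. A rough sketch against
the 0.20 API follows; the class names and the "inmemory.params" conf key are made up for
illustration:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/** One split per parameter; the parameters are read from the job conf. */
public class InMemoryInputFormat extends InputFormat<NullWritable, Text> {

    /** A split that carries a single parameter string. */
    public static class ParamSplit extends InputSplit implements Writable {
        private String param;

        public ParamSplit() {}                          // needed for deserialization
        public ParamSplit(String param) { this.param = param; }

        @Override public long getLength() { return param.length(); }
        @Override public String[] getLocations() { return new String[0]; }

        public void write(DataOutput out) throws IOException { out.writeUTF(param); }
        public void readFields(DataInput in) throws IOException { param = in.readUTF(); }
    }

    @Override
    public List<InputSplit> getSplits(JobContext context) {
        // Assumes the driver has done conf.setStrings("inmemory.params", ...).
        List<InputSplit> splits = new ArrayList<InputSplit>();
        for (String p : context.getConfiguration().getStrings("inmemory.params")) {
            splits.add(new ParamSplit(p));
        }
        return splits;
    }

    @Override
    public RecordReader<NullWritable, Text> createRecordReader(InputSplit split,
            TaskAttemptContext context) {
        return new RecordReader<NullWritable, Text>() {
            private Text value;
            private boolean done = false;

            @Override public void initialize(InputSplit s, TaskAttemptContext c) {
                value = new Text(((ParamSplit) s).param);
            }
            @Override public boolean nextKeyValue() {   // emit exactly one record per split
                if (done) return false;
                done = true;
                return true;
            }
            @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
            @Override public Text getCurrentValue() { return value; }
            @Override public float getProgress() { return done ? 1.0f : 0.0f; }
            @Override public void close() {}
        };
    }
}

The driver would then set the parameters with conf.setStrings("inmemory.params", ...) before
creating the Job and call job.setInputFormatClass(InMemoryInputFormat.class); each mapper
receives exactly one of the in-memory parameters as its input value and can open whatever
HBase tables it needs from there.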
________________________________
From: Mark Vigeant <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Thu, October 15, 2009 7:21:40 AM
Subject: RE: Question about MapReduce
There is a TableInputFormat class in
org.apache.hadoop.hbase.mapreduce.TableInputFormat
Also, if you want to use TableMapReduceUtil you probably want to have your
mapper class extend TableMapper.
Check out the javadocs for more info:
http://hadoop.apache.org/hbase/docs/current/api/index.html
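For a single table, the wiring usually looks roughly like the sketch below (table, family
and qualifier names are placeholders; written against the 0.20
org.apache.hadoop.hbase.mapreduce API):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SingleTableScanExample {

    /** The mapper is handed one HBase row (a Result) per call. */
    public static class MyTableMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable rowKey, Result row, Context context)
                throws IOException, InterruptedException {
            byte[] cell = row.getValue(Bytes.toBytes("myfamily"), Bytes.toBytes("myqualifier"));
            if (cell != null) {
                context.write(new Text(rowKey.get()), new Text(cell));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new HBaseConfiguration();
        Job job = new Job(conf, "scan one table");
        job.setJarByClass(SingleTableScanExample.class);

        Scan scan = new Scan();   // full scan; narrow it with setStartRow/setStopRow etc.
        TableMapReduceUtil.initTableMapperJob(
                "mytable", scan, MyTableMapper.class, Text.class, Text.class, job);

        job.setNumReduceTasks(0);                            // map-only in this sketch
        job.setOutputFormatClass(NullOutputFormat.class);    // discard output; replace as needed
        job.waitForCompletion(true);
    }
}

As noted in the thread, though, this is tied to scanning a single table, so it doesn't
directly solve the multi-table case.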
-----Original Message-----
From: Something Something [mailto:[email protected]]
Sent: Thursday, October 15, 2009 1:37 AM
To: [email protected]; [email protected]
Subject: Re: Question about MapReduce
If the answer is...
TableMapReduceUtil.initTableMapperJob
I apologize for the spam. If this isn't the right way, please let me know.
Thanks.
--- On Wed, 10/14/09, Something Something <[email protected]> wrote:
From: Something Something <[email protected]>
Subject: Question about MapReduce
To: [email protected], [email protected]
Date: Wednesday, October 14, 2009, 10:18 PM
I would like to start a Map-Reduce job that does not read data from an input
file or from a database. I would like to pass 3 arguments to the Mapper &
Reducer to work on. Basically, these arguments are keys into 3 different
tables in HBase.
In other words, I don't want to use FileInputFormat or DBInputFormat because
everything I need is already in HBase.
How can I do this? Please let me know. Thanks.