The job still needs everything else (input path, output path, mapper class, etc.). This addresses only the question of how parameters can be passed to the mappers/reducers.
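For reference, a minimal sketch with the correction below already applied (Hadoop 0.20 API; the class names are mine, and the value is set before constructing the Job, since as far as I know the Job takes its own copy of the Configuration):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ParamDemo {

      public static class ParamMapper
          extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          // Read the parameter back inside the task.
          String paramVal = context.getConfiguration().get("param");
          context.write(new Text(paramVal), NullWritable.get());
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("param", "param1");       // set(), not setStrings(), per the correction
        Job job = new Job(conf, "my job"); // Job copies conf, so set values first
        job.setJarByClass(ParamDemo.class);
        job.setMapperClass(ParamMapper.class);
        // ... input/output formats, paths, reducer, output types, etc. ...
        job.waitForCompletion(true);
      }
    }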
Correction: there is a bug in my example... Since I'm reading the parameter with a 'get()', I should have called 'set' instead of 'setStrings'. But that's the general idea.

-----Original Message-----
From: Something Something [mailto:luckyguy2...@yahoo.com]
Sent: Monday, October 19, 2009 11:23 AM
To: general@hadoop.apache.org; hbase-u...@hadoop.apache.org
Subject: Re: Question about MapReduce

Interesting... I haven't tried this yet, but in this case what would you specify as the 'InputPath'? I was under the assumption that a Job needs some kind of InputPath, no? I don't see a NullInputPath. Is there something equivalent?

________________________________
From: Doug Meil <doug.m...@explorysmedical.com>
To: "hbase-u...@hadoop.apache.org" <hbase-u...@hadoop.apache.org>; "general@hadoop.apache.org" <general@hadoop.apache.org>
Sent: Mon, October 19, 2009 7:34:03 AM
Subject: RE: Question about MapReduce

Hi there-

I didn't see this option in the thread yet, and it seems pretty straightforward. When setting up the job:

    Job job = new Job(conf, "my job");
    ...
    conf.setStrings("param", "param1");

And then in the map method:

    String paramVal = context.getConfiguration().get("param");

This is using Hadoop 0.20 syntax; the previous version had a 'configure' method you had to implement.

-----Original Message-----
From: Something Something [mailto:luckyguy2...@yahoo.com]
Sent: Thursday, October 15, 2009 2:31 PM
To: hbase-u...@hadoop.apache.org; Hadoop
Subject: Re: Question about MapReduce

1) I don't think TableInputFormat is useful in this case. Looks like it's used for scanning columns from a single HTable.

2) TableMapReduceUtil - same problem. Seems like this works with just one table.

3) JV recommended NLineInputFormat, but my parameters are not in a file. They come from multiple files and are in memory.

I guess what I am looking for is something like... InMemoryInputFormat... similar to FileInputFormat & DbInputFormat. There's no such class right now. Worse comes to worst, I can write the parameters into a flat file and use FileInputFormat - but that will slow down this process considerably. Is there no other way?

________________________________
From: Mark Vigeant <mark.vige...@riskmetrics.com>
To: "hbase-u...@hadoop.apache.org" <hbase-u...@hadoop.apache.org>
Sent: Thu, October 15, 2009 7:21:40 AM
Subject: RE: Question about MapReduce

There is a TableInputFormat class in org.apache.hadoop.hbase.mapreduce.TableInputFormat. Also, if you want to use TableMapReduceUtil you probably want to have your mapper extend TableMapper. Check out the javadocs for more info: http://hadoop.apache.org/hbase/docs/current/api/index.html

-----Original Message-----
From: Something Something [mailto:luckyguy2...@yahoo.com]
Sent: Thursday, October 15, 2009 1:37 AM
To: general@hadoop.apache.org; hbase-u...@hadoop.apache.org
Subject: Re: Question about MapReduce

If the answer is... TableMapReduceUtil.initTableMapperJob, I apologize for the spam. If this isn't the right way, please let me know. Thanks.

--- On Wed, 10/14/09, Something Something <luckyguy2...@yahoo.com> wrote:

From: Something Something <luckyguy2...@yahoo.com>
Subject: Question about MapReduce
To: general@hadoop.apache.org, hbase-u...@hadoop.apache.org
Date: Wednesday, October 14, 2009, 10:18 PM

I would like to start a Map-Reduce job that does not read data from an input file or from a database. I would like to pass 3 arguments to the Mapper & Reducer to work on. Basically, these arguments are keys into 3 different HBase tables.
In other words, I don't want to use FileInputFormat or DbInputFormat because everything I need is already in HBase. How can I do this? Please let me know. Thanks.
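One way to get the 'InMemoryInputFormat' behavior asked about above is to write a trivial InputFormat of your own: in the Hadoop 0.20 API, getSplits() may return synthetic splits that aren't backed by any file, so the job needs no input path at all. This is only a sketch under that assumption; the class names below are illustrative, and nothing like this ships with Hadoop 0.20:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.Collections;
    import java.util.List;

    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.InputFormat;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;

    public class SingleRecordInputFormat
        extends InputFormat<NullWritable, NullWritable> {

      // One trivial split; the framework serializes splits, hence Writable.
      public static class EmptySplit extends InputSplit implements Writable {
        @Override public long getLength() { return 0; }
        @Override public String[] getLocations() { return new String[0]; }
        @Override public void write(DataOutput out) throws IOException { }
        @Override public void readFields(DataInput in) throws IOException { }
      }

      @Override
      public List<InputSplit> getSplits(JobContext context) {
        // A single synthetic split means a single map task.
        return Collections.<InputSplit>singletonList(new EmptySplit());
      }

      @Override
      public RecordReader<NullWritable, NullWritable> createRecordReader(
          InputSplit split, TaskAttemptContext context) {
        return new RecordReader<NullWritable, NullWritable>() {
          private boolean done = false;
          @Override public void initialize(InputSplit s, TaskAttemptContext c) { }
          @Override public boolean nextKeyValue() {
            if (done) return false;
            done = true;          // emit exactly one dummy record
            return true;
          }
          @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
          @Override public NullWritable getCurrentValue() { return NullWritable.get(); }
          @Override public float getProgress() { return done ? 1.0f : 0.0f; }
          @Override public void close() { }
        };
      }
    }

With this in place the driver would call job.setInputFormatClass(SingleRecordInputFormat.class) instead of setting an input path; the lone map task fires once on the dummy record and can read the three HBase row keys out of the Configuration, as in the earlier sketch.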