So how do you plan to integrate your other modules with Hadoop? Put them in the reduce phase?
Jeff Zhang

On Fri, Nov 27, 2009 at 6:37 AM, <[email protected]> wrote:

> Actually I want the output to be usable by other modules. So do they have to read
> the output from HDFS files? Or should these modules be integrated into MapReduce? Is
> there another way?
>
> --------------------------------------------------
> From: "Jeff Zhang" <[email protected]>
> Sent: Friday, November 27, 2009 10:00 PM
> To: <[email protected]>
> Subject: Re: Store mapreduce output into my own data structures
>
>> Hi Liu,
>>
>> Why do you want to store the output in memory? You cannot use the output
>> outside of the reducer.
>> Actually, at the beginning the output of the reducer is in memory, and the
>> OutputFormat writes this data to the file system or another data store.
>>
>> Jeff Zhang
>>
>> 2009/11/27 Liu Xianglong <[email protected]>
>>
>>> Hi, everyone. Is there someone who uses MapReduce to store the reduce
>>> output in memory? I mean, currently the output path of the job is set and
>>> the reduce outputs are stored into files under this path (see the comments
>>> along with the following code):
>>>
>>>     job.setOutputFormatClass(MyOutputFormat.class);
>>>     // can I implement my OutputFormat to store these output key-value
>>>     // pairs in my data structures, or are there other ways to do it?
>>>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>>>     job.setOutputValueClass(Result.class);
>>>     FileOutputFormat.setOutputPath(job, outputDir);
>>>
>>> Is there any way to store them in some variables or data structures? Then
>>> how can I implement my OutputFormat? Any suggestions and code are
>>> welcome.
>>>
>>> Another question: is there some way to set the number of map tasks? It
>>> seems there is no API to do this in Hadoop's new Job APIs. I am not sure
>>> of the way to set this number.
>>>
>>> Thanks!
>>>
>>> Best Wishes!
>>> _____________________________________________________________
>>>
>>> 刘祥龙 Liu Xianglong
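[Editor's note] For the "implement my own OutputFormat" question, a minimal sketch is below. It assumes the new `org.apache.hadoop.mapreduce` API and a hypothetical class name `InMemoryOutputFormat`. Note the caveat Jeff hints at: on a real cluster each reduce task runs in its own JVM, so a static collection is only visible to the submitting program when the job runs in local mode.

```java
// Sketch (assumptions: Hadoop 0.20+ new API, local-mode job).
// Collects reduce output into a static in-memory list instead of HDFS.
// On a distributed cluster this static list lives in the task JVM, not
// the client JVM, so this pattern is only useful for local testing.
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class InMemoryOutputFormat<K, V> extends OutputFormat<K, V> {
    // Shared collection; safe only in a single-JVM (local) run.
    public static final List<Object[]> RESULTS =
        Collections.synchronizedList(new ArrayList<Object[]>());

    @Override
    public RecordWriter<K, V> getRecordWriter(TaskAttemptContext context) {
        return new RecordWriter<K, V>() {
            public void write(K key, V value) {
                // Append each key-value pair emitted by the reducer.
                RESULTS.add(new Object[] { key, value });
            }
            public void close(TaskAttemptContext context) { }
        };
    }

    @Override
    public void checkOutputSpecs(JobContext context) {
        // Nothing to validate: no output path is required.
    }

    @Override
    public OutputCommitter getOutputCommitter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        // Borrow NullOutputFormat's no-op committer, since nothing is
        // written to the file system.
        return new NullOutputFormat<K, V>().getOutputCommitter(context);
    }
}
```

You would then register it with `job.setOutputFormatClass(InMemoryOutputFormat.class)` and read `InMemoryOutputFormat.RESULTS` after `job.waitForCompletion(true)` returns. For other modules running outside the job's JVM, writing to HDFS (or HBase, given the `ImmutableBytesWritable`/`Result` types in the question) and reading it back remains the usual route.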
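[Editor's note] On the second question, a brief sketch of how the map-task count is influenced in the new API. Unlike the old `JobConf.setNumMapTasks()`, the new `Job` class has no direct setter: the number of map tasks equals the number of input splits computed by the InputFormat, so you control it indirectly through split sizes or the (hint-only) `mapred.map.tasks` property.

```java
// Sketch (assumptions: Hadoop 0.20+ new API, a file-based input format).
// The map-task count is the number of input splits; these two knobs
// influence it rather than set it exactly.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class MapTaskCountExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("mapred.map.tasks", 10); // a hint only, not a guarantee

        Job job = new Job(conf, "map-task-count-example");
        // Raising the minimum split size produces fewer, larger splits,
        // and therefore fewer map tasks (here: at least 256 MB per split).
        FileInputFormat.setMinInputSplitSize(job, 256L * 1024 * 1024);
    }
}
```

The reduce side is different: `job.setNumReduceTasks(n)` does set the reducer count exactly.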
