On Wed, 26 Mar 2008, Aayush Garg wrote:
> Hi,
> I am developing a simple inverted index program using Hadoop. My map
> function has the output:
> <word, doc>
> and the reducer has:
> <word, list(docs)>
>
> Now I want to use one more mapreduce to remove stop and scrub words from
> this output.

Use the distributed cache, as Arun mentioned.

> Also in the next stage I would like to have a short summary
> associated with every word. How should I design my program from this stage?
> I mean how would I apply multiple mapreduce to this? What would be the
> better way to perform this?

Whether to use a separate MR job depends on what exactly you mean by
summary. If it's like a window around the current word, then you can
possibly do it in one go.

Amar

> Thanks,
>
> Regards,
> -
> Aayush Garg,
> Phone: +41 76 482 240
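
To make the "one go" idea concrete, here is a rough sketch in plain Python. It is not Hadoop API code: it only simulates the map and reduce steps in memory. The names STOP_WORDS, WINDOW, map_doc, reduce_word and run are made up for illustration, and the summary is assumed to be a fixed window of words around each occurrence. In a real job the stop-word list would be a small file shipped to each task through the distributed cache rather than a hard-coded set.

```python
from collections import defaultdict

# Hypothetical stop-word list. In a real Hadoop job this would be read once
# per task from a file distributed through the distributed cache.
STOP_WORDS = {"the", "a", "of", "and", "is"}

# Assumed summary size: number of words kept on each side of the current word.
WINDOW = 2


def map_doc(doc_id, text, stop_words=STOP_WORDS, window=WINDOW):
    """Emit (word, (doc_id, summary)) pairs, skipping stop words."""
    tokens = text.lower().split()  # naive tokenizer, punctuation is not handled
    for i, word in enumerate(tokens):
        if word in stop_words:
            continue  # stop words are dropped in the map step, no extra job
        start = max(0, i - window)
        summary = " ".join(tokens[start:i + window + 1])
        yield word, (doc_id, summary)


def reduce_word(word, values):
    """Collect all (doc_id, summary) postings for one word, in sorted order."""
    return word, sorted(values)


def run(docs):
    """Simulate the shuffle: group map output by word, then reduce each group."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for word, value in map_doc(doc_id, text):
            grouped[word].append(value)
    return dict(reduce_word(w, vs) for w, vs in grouped.items())


if __name__ == "__main__":
    index = run({"d1": "the quick fox is fast", "d2": "a fox and a hound"})
    for word, postings in sorted(index.items()):
        print(word, postings)
```

Because the filter and the window are both computed while the mapper still has the document text, the reducer only has to collect postings. If the summary needs information from several documents at once, a second job over this output is probably the simpler route.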