We cannot read the HashMap in the configure method of the reducer, because
configure is called before the reduce phase runs, so the map has not been
built yet.
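
In the *next* job, though, configure() is exactly the place to load a map
that the previous job wrote out, as Ted suggests below. Here is a minimal
sketch against the old org.apache.hadoop.mapred API; the config key
"counts.path", the FilterReducer name, and the Text/IntWritable types are
my assumptions for illustration, not anything settled in this thread:

import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class FilterReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  private final Map<String, Integer> counts = new HashMap<String, Integer>();

  // Called once per task, before any reduce() call; the side file written
  // by the previous job is already complete by the time this job starts.
  public void configure(JobConf job) {
    try {
      Path side = new Path(job.get("counts.path"));  // hypothetical key
      FileSystem fs = side.getFileSystem(job);
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, side, job);
      Text key = new Text();
      IntWritable value = new IntWritable();
      while (reader.next(key, value)) {
        counts.put(key.toString(), value.get());
      }
      reader.close();
    } catch (IOException e) {
      throw new RuntimeException("could not load side file", e);
    }
  }

  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    if (!counts.containsKey(key.toString())) {
      return;  // key was pruned from the side map, so skip it
    }
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}

Because configure() runs once per reduce task, the side file only needs to
be fully written before the second job is submitted.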
I also need to eliminate rows from the HashMap once all the keys have been
read. And my other concern: if the dataset is large, will this in-memory
HashMap approach still work?
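
For writing the map out in the first place (the format question in my
earlier mail quoted below), one simple option is a SequenceFile of
(Text, IntWritable) pairs, which the sketch above can read back directly.
Again a rough sketch under the same assumptions, with illustrative names:

import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class CountsWriter {

  // Writes each surviving (key, count) pair as a (Text, IntWritable)
  // record; prune the top-frequency keys from "counts" before calling this.
  public static void writeCounts(Map<String, Integer> counts, Path out,
                                 Configuration conf) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, out, Text.class, IntWritable.class);
    try {
      for (Map.Entry<String, Integer> e : counts.entrySet()) {
        writer.append(new Text(e.getKey()), new IntWritable(e.getValue()));
      }
    } finally {
      writer.close();
    }
  }
}

If the pruned map will not fit in memory, the alternatives would be the
MapFile format Ted mentions or a join-style second MapReduce job, rather
than an in-heap HashMap.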


On Wed, Apr 16, 2008 at 10:07 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:

>
> That design is fine.
>
> You should read your map in the configure method of the reducer.
>
> There is a MapFile format supported by Hadoop, but they tend to be pretty
> slow.  I usually find it better to just load my hash table by hand.  If you
> do this, you should use whatever format you like.
>
>
> On 4/16/08 12:41 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > The current structure of my program is:
> >
> > Upper class {
> >   class Reduce {
> >     reduce function(K1, V1, K2, V2) {
> >       // I count the frequency for each key.
> >       // Add the output to a HashMap(key, value) instead of
> >       // calling output.collect().
> >     }
> >   }
> >
> >   void run() {
> >     runJob();
> >     // Now eliminate the top-frequency keys from the HashMap built in
> >     // the reduce function here, because only now is the HashMap
> >     // complete.
> >     // Then write this HashMap to a file in a format such that the next
> >     // MapReduce job can use it, with the HashMap's keys taken as the
> >     // keys in that job's mapper function. How and which format should
> >     // I choose? Is this design and approach OK?
> >   }
> >
> >   public static void main() {}
> > }
> >
> > I hope this makes my question clear.
> >
> > Thanks,
> >
> >
> > On Wed, Apr 16, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:
> >
> >> Aayush Garg wrote:
> >>
> >>> Hi,
> >>>
> >>> Are you sure that another MR job is required for eliminating some
> >>> rows? Can't I just somehow eliminate them from main() when I know
> >>> which keys need to be removed?
> >>>
> >> Can you provide some more details on how exactly you are filtering?
> >> Amar
> >>
> >>
> >>
>
>


-- 
Aayush Garg,
Phone: +41 76 482 240
