Hi,

Are you sure that another MR job is required to eliminate some rows? Can't I
just eliminate them from main(), since I already know which keys need to be
removed?
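
To make the first question concrete, the filtering step amounts to something
like the sketch below. It is plain Java with no Hadoop classes; the names
KeyFilterSketch and dropKeys are made up for illustration, and String values
stand in for the rcc-generated class. It also assumes that all of the reduce
output is already sitting in one in-memory map, which may not hold when the
reducers run in separate JVMs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Hadoop.
public class KeyFilterSketch {

    // Returns a copy of `records` without the entries whose key is in `keysToRemove`.
    static Map<String, String> dropKeys(Map<String, String> records,
                                        Set<String> keysToRemove) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : records.entrySet()) {
            if (!keysToRemove.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumption: the reduce output has been gathered into this map.
        Map<String, String> records = new LinkedHashMap<>();
        records.put("a", "1");
        records.put("b", "2");
        records.put("c", "3");

        Set<String> keysToRemove = new HashSet<>(Arrays.asList("b"));
        System.out.println(dropKeys(records, keysToRemove));
    }
}
```

The same contains() check could presumably be applied to each key before it is
emitted, so that unwanted rows are never written in the first place.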

For the second one, can I first write the output in SequenceFile format and
then read it back using SequenceFileRecordReader? I can't figure out exactly
how to write the code snippet, though.
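
As a starting point, the value-class side of this might look like the sketch
below. It only illustrates the write()/readFields() contract that Hadoop's
Writable interface defines. WritableLike is a local stand-in declared with the
same two methods so that the example compiles without Hadoop on the classpath;
PageInfo and its fields are hypothetical, and the round trip goes through a
byte array rather than a real SequenceFile.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in for org.apache.hadoop.io.Writable (same two methods).
interface WritableLike {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value class; the real one would be generated by rcc.
class PageInfo implements WritableLike {
    String url = "";
    int count;

    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);   // fields are written in a fixed order...
        out.writeInt(count);
    }

    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();  // ...and read back in the same order
        count = in.readInt();
    }
}

public class WritableRoundTrip {

    // Serializes a value to bytes using its write() method.
    static byte[] toBytes(WritableLike w) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            w.write(out);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fills an existing value from bytes using its readFields() method.
    static void fromBytes(WritableLike w, byte[] bytes) {
        try {
            w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PageInfo original = new PageInfo();
        original.url = "http://example.org/";
        original.count = 7;

        PageInfo copy = new PageInfo();
        fromBytes(copy, toBytes(original));
        System.out.println(copy.url + " " + copy.count);
    }
}
```

If the real value class implements Writable in this way, the open question is
only which output and input format classes to configure for the two jobs.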

Thanks,


On Wed, Apr 16, 2008 at 7:18 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> Aayush Garg wrote:
>
> > Hi,
> > Could you please suggest which classes to use, or a better way to achieve
> > this:
> >
> > I am using the OutputCollector in my reduce function as:
> >
> >  void reduce(....)
> > {
> >   output.collect(key,value);
> > }
> >
> > Here the key is Text,
> > and the value is a custom class type that I generated with rcc.
> >
> > 1.  After all calls to the reduce function are complete, I need to
> > eliminate certain rows from this output based on their keys. I guess I
> > need to store this output in some static Map (declared in the Reduce
> > class) and do the required operations from the main function. Is this
> > the right approach?
> >
> >
> I think you need to run another MR job for doing this record filtering.
>
> > 2.  I want to use this stored output for another MapReduce job. What
> > classes and format should I use in the previous step so that I can
> > easily use it as input in another program invoking an MR job?
> >
> >
> The value class should implement Writable (see
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/io/Writable.html).
> You need to write your own InputFormat (see
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Input)
> that will have a custom RecordReader (see
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#RecordReader).
> Amar
>
> > Regards,
> > Garg
> >
> >
> >
>
>


-- 
Aayush Garg,
Phone: +41 76 482 240
