Hi,

Are you sure that another MR job is required to eliminate some rows? Can't I just eliminate them from main(), since I already know which keys need to be removed?
For the second one, can I first write the output in SequenceFile format and then read it using SequenceFileRecordReader? I can't figure out how exactly I would write that code snippet.

Thanks,

On Wed, Apr 16, 2008 at 7:18 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> Aayush Garg wrote:
>
> > Hi,
> >
> > Could you please suggest which classes to use, or a better way to
> > achieve this:
> >
> > I am getting the OutputCollector in my reduce function as:
> >
> > void reduce(....)
> > {
> >     output.collect(key, value);
> > }
> >
> > Here key is Text, and value is a custom class type that I generated
> > from rcc.
> >
> > 1. After all calls to the reduce function are complete, I need to
> > eliminate certain rows in this output based on keys. I guess I need to
> > store this output in some static Map (declared in the Reduce class) and
> > do the required operations from the main function. Is this the right
> > approach?
>
> I think you need to run another MR job for doing this record filtering.
>
> > 2. I want to use this stored output for another Map Reduce job. What
> > classes and format should I use in the previous step so that I can
> > easily use this as input in another program invoking an MR job?
>
> The value class should implement Writable (see
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/io/Writable.html).
> You need to write your own InputFormat (see
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Input)
> that will have a custom RecordReader (see
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#RecordReader).
>
> Amar
>
> > Regards,
> > Garg

--
Aayush Garg,
Phone: +41 76 482 240
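
For reference, here is a minimal sketch of what the quoted advice "the value class should implement Writable" amounts to. It assumes a hypothetical value type, RowValue, with two made-up fields; the real rcc-generated class is not shown in this thread. To stay runnable without Hadoop on the classpath, the class only mirrors the two Writable method signatures, write(DataOutput) and readFields(DataInput). In an actual job it would declare "implements org.apache.hadoop.io.Writable".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical value type standing in for the rcc-generated class.
// Assumption: in a real job this would implement org.apache.hadoop.io.Writable.
class RowValue {
    int count;      // hypothetical field
    String label;   // hypothetical field

    // No-argument constructor so an empty instance can be created
    // and then filled in by readFields().
    RowValue() {
        this(0, "");
    }

    RowValue(int count, String label) {
        this.count = count;
        this.label = label;
    }

    // Same signature as Writable.write: serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(count);
        out.writeUTF(label);
    }

    // Same signature as Writable.readFields: read the fields back in the same order.
    public void readFields(DataInput in) throws IOException {
        count = in.readInt();
        label = in.readUTF();
    }

    // Serialize to bytes and deserialize into a fresh instance,
    // to check that write() and readFields() agree with each other.
    static RowValue roundTrip(RowValue value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        value.write(new DataOutputStream(bytes));
        RowValue copy = new RowValue();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        RowValue copy = roundTrip(new RowValue(3, "example"));
        System.out.println(copy.count + " " + copy.label);
    }
}
```

The important point is that write() and readFields() handle the fields in the same order; the round trip in main() is only a quick way to check that before plugging the class into a job.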