Thanks for the help.

On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen <rogc...@ucdavis.edu> wrote:
> The binary file will allow you to pass the output from the first reducer to
> the second mapper. For example, if you output Text, IntWritable from the
> first one in SequenceFileOutputFormat, then you are able to retrieve Text,
> IntWritable input at the head of the second mapper. The idea of chaining is
> that you know what kind of output the first reducer is going to give
> already, and that you want to perform some secondary operation on it.
>
> One last thing on chaining jobs: it's often worth looking to see if you can
> consolidate all of your separate map and reduce tasks into a single
> map/reduce operation. There are many situations where it is more intuitive
> to write a number of map/reduce operations and chain them together, but more
> efficient to have just a single operation.
>
> On Mon, Sep 5, 2011 at 12:21 PM, ilyal levin <nipponil...@gmail.com> wrote:
>
>> Thanks for the reply.
>> I tried it, but it creates a binary file which I cannot understand (I need
>> the result of the first job).
>> The other thing is: how can I use this file in the next chained mapper? I.e.,
>> how can I retrieve the keys and the values in the map function?
>>
>> Ilyal
>>
>> On Mon, Sep 5, 2011 at 7:41 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>
>>> Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
>>>
>>> -Joey
>>>
>>> On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin <nipponil...@gmail.com> wrote:
>>> > Hi
>>> > I'm trying to write a chained MapReduce program. I'm doing so with a
>>> > simple loop where in each iteration I create a job, execute it, and every
>>> > time the current job's output is the next job's input.
>>> > How can I configure the outputFormat of the current job and the
>>> > inputFormat of the next job so that I will not use the TextInputFormat
>>> > (TextOutputFormat), because if I do use it, I need to parse the input
>>> > file in the Map function?
>>> > I.e., if possible I want the next job to "consider" the input file as
>>> > <key,value> and not plain Text.
>>> > Thanks a lot.
>>> >
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>>
>
> --
> Roger Chen
> UC Davis Genome Center
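
For anyone puzzled by the same "binary file I cannot understand" point: the sketch below is only an analogy, written with plain java.io and no Hadoop dependency. It is not the SequenceFile on-disk format. It shows why a typed binary record stream is unreadable in a text editor, yet hands the reader back a key and a value with no string parsing. Hadoop's Writable types serialize themselves through the same DataOutput/DataInput interfaces. The class name TypedRecordSketch and the helpers writeRecords/readRecords are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Hadoop.
class TypedRecordSketch {

    // Stand-in for the first job's output: each record is written as a
    // binary (String key, int value) pair rather than as a line of text.
    static byte[] writeRecords(Map<String, Integer> records) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(records.size());          // record count header
            for (Map.Entry<String, Integer> e : records.entrySet()) {
                out.writeUTF(e.getKey());          // length-prefixed key
                out.writeInt(e.getValue());        // 4-byte big-endian value
            }
        }
        return buffer.toByteArray();
    }

    // Stand-in for the second job's input: keys and values come back
    // already typed, so the consumer never splits or parses a text line.
    static Map<String, Integer> readRecords(byte[] data) throws IOException {
        Map<String, Integer> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                int value = in.readInt();
                records.put(key, value);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> firstJobOutput = new LinkedHashMap<>();
        firstJobOutput.put("apple", 3);
        firstJobOutput.put("pear", 7);

        byte[] intermediate = writeRecords(firstJobOutput);
        Map<String, Integer> secondJobInput = readRecords(intermediate);

        for (Map.Entry<String, Integer> e : secondJobInput.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In an actual Hadoop driver the equivalent step is configuration rather than hand-written I/O: the first job calls setOutputFormatClass(SequenceFileOutputFormat.class), the next job calls setInputFormatClass(SequenceFileInputFormat.class), and the second mapper is declared with the first reducer's output types as its input key and value types (for example Text and IntWritable).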