Thanks for the help.

On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen <rogc...@ucdavis.edu> wrote:

> The binary file lets you pass the output of the first reducer to the second
> mapper. For example, if the first job outputs Text, IntWritable pairs using
> SequenceFileOutputFormat, then the second mapper receives Text, IntWritable
> pairs directly as its input. The idea of chaining is that you already know
> what kind of output the first reducer is going to give, and that you want
> to perform some secondary operation on it.
>
> One last thing on chaining jobs: it's often worth looking to see if you can
> consolidate all of your separate map and reduce tasks into a single
> map/reduce operation. There are many situations where it is more intuitive
> to write a number of map/reduce operations and chain them together, but more
> efficient to have just a single operation.
>
>
>
> On Mon, Sep 5, 2011 at 12:21 PM, ilyal levin <nipponil...@gmail.com> wrote:
>
>> Thanks for the reply.
>> I tried it, but it creates a binary file which I cannot read (I need
>> the result of the first job).
>> The other thing is: how can I use this file in the next chained mapper? I.e.,
>> how can I retrieve the keys and the values in the map function?
>>
>>
>> Ilyal
>>
>>
>> On Mon, Sep 5, 2011 at 7:41 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>
>>> Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
>>>
>>> -Joey
>>>
>>> On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin <nipponil...@gmail.com>
>>> wrote:
>>> > Hi,
>>> > I'm trying to write a chained MapReduce program. I'm doing so with a
>>> > simple loop where in each iteration I create a job and execute it, and
>>> > each time the current job's output is the next job's input.
>>> > How can I configure the OutputFormat of the current job and the
>>> > InputFormat of the next job so that I do not use TextInputFormat
>>> > (TextOutputFormat)? If I do use it, I need to parse the input file in
>>> > the map function.
>>> > I.e., if possible I want the next job to treat the input file as
>>> > <key,value> pairs and not plain text.
>>> > Thanks a lot.
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>>>
>>
>>
>
>
> --
> Roger Chen
> UC Davis Genome Center
>
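For readers of the archive, the SequenceFile approach suggested in this thread might look roughly like the sketch below. It is only an illustration under several assumptions, not the original poster's code: the class name ChainedJobs, the PassThroughMapper, the "iter-N" output directory names, and the command-line arguments are all hypothetical, the key/value types are taken from the Text, IntWritable example above, and the initial input is assumed to already be a SequenceFile of that type. It uses the org.apache.hadoop.mapreduce API; on older releases without Job.getInstance, new Job(conf, name) would be the equivalent.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ChainedJobs {

    // Hypothetical mapper: because the input is a SequenceFile, the key and
    // value arrive already typed, so no string parsing is needed here.
    public static class PassThroughMapper
            extends Mapper<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void map(Text key, IntWritable value, Context context)
                throws IOException, InterruptedException {
            context.write(key, value); // replace with the real per-record logic
        }
    }

    // Builds one job of the chain, reading and writing SequenceFiles.
    // No reducer is set, so Hadoop's default identity reducer is used.
    static Job buildJob(Configuration conf, int iteration, Path in, Path out)
            throws IOException {
        Job job = Job.getInstance(conf, "chain-" + iteration);
        job.setJarByClass(ChainedJobs.class);
        job.setMapperClass(PassThroughMapper.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }

    // Assumed arguments: <input dir> <output base dir> <number of iterations>.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path(args[0]);
        int iterations = Integer.parseInt(args[2]);
        for (int i = 1; i <= iterations; i++) {
            Path out = new Path(args[1], "iter-" + i);
            Job job = buildJob(conf, i, in, out);
            if (!job.waitForCompletion(true)) {
                System.exit(1); // stop the chain if any job fails
            }
            in = out; // this job's output is the next job's input
        }
    }
}
```

If the first job has to read plain text, only that first iteration would need TextInputFormat and a parsing mapper; later iterations can stay as above. To inspect the binary output by hand, running hadoop fs -text on a part file prints its records as readable text.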
