OK, so now I'm using SequenceFileInputFormat and SequenceFileOutputFormat
and it works fine, but the output of the reducer is now a binary file
(not text), so I can't read the data. How can I solve this? I need the
data of the intermediate stages in the chain in text form.

Thanks
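
One way to inspect an intermediate sequence file as text is to read it back
with SequenceFile.Reader and print each record. The following is only a
minimal sketch: the class name SequenceFileDump and its dump method are
hypothetical, and it assumes the key and value classes are Writables on the
classpath (for example Text and IntWritable, as mentioned further down in
this thread). The `hadoop fs -text <path>` shell command should also decode
sequence files without any custom code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical helper (not part of Hadoop): dumps one sequence file as text.
public class SequenceFileDump {

    // Reads every record of the given file and renders it as "key<TAB>value",
    // using the toString() of whatever key/value classes the file declares.
    public static List<String> dump(Configuration conf, Path file) throws IOException {
        FileSystem fs = file.getFileSystem(conf);
        List<String> lines = new ArrayList<String>();
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
        try {
            // The file header records the key and value class names.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                lines.add(key + "\t" + value);
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    // Usage (assumed): pass the path of one part file, e.g. a reducer output file.
    public static void main(String[] args) throws IOException {
        for (String line : dump(new Configuration(), new Path(args[0]))) {
            System.out.println(line);
        }
    }
}
```

Run it once per part file in the job's output directory; the intermediate
files themselves stay binary, so the next job in the chain can still read
them with SequenceFileInputFormat.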

On Tue, Sep 6, 2011 at 1:33 AM, ilyal levin <nipponil...@gmail.com> wrote:

> Thanks for the help.
>
>
> On Mon, Sep 5, 2011 at 10:50 PM, Roger Chen <rogc...@ucdavis.edu> wrote:
>
>> The binary file will allow you to pass the output from the first reducer
>> to the second mapper. For example, if you output Text, IntWritable from
>> the first one in SequenceFileOutputFormat, then you are able to retrieve
>> Text, IntWritable input at the head of the second mapper. The idea of
>> chaining is that you know what kind of output the first reducer is going to
>> give already, and that you want to perform some secondary operation on it.
>>
>> One last thing on chaining jobs: it's often worth looking to see if you
>> can consolidate all of your separate map and reduce tasks into a single
>> map/reduce operation. There are many situations where it is more intuitive
>> to write a number of map/reduce operations and chain them together, but more
>> efficient to have just a single operation.
>>
>>
>>
>> On Mon, Sep 5, 2011 at 12:21 PM, ilyal levin <nipponil...@gmail.com> wrote:
>>
>>> Thanks for the reply.
>>> I tried it, but it creates a binary file which I cannot read (I
>>> need the result of the first job).
>>> The other thing is: how can I use this file in the next chained mapper?
>>> I.e., how can I retrieve the keys and the values in the map function?
>>>
>>>
>>> Ilyal
>>>
>>>
>>> On Mon, Sep 5, 2011 at 7:41 PM, Joey Echeverria <j...@cloudera.com> wrote:
>>>
>>>> Have you tried SequenceFileOutputFormat and SequenceFileInputFormat?
>>>>
>>>> -Joey
>>>>
>>>> On Mon, Sep 5, 2011 at 11:49 AM, ilyal levin <nipponil...@gmail.com>
>>>> wrote:
>>>> > Hi,
>>>> > I'm trying to write a chained MapReduce program. I'm doing so with a
>>>> > simple loop where in each iteration I create a job and execute it, and
>>>> > each time the current job's output is the next job's input.
>>>> > How can I configure the OutputFormat of the current job and the
>>>> > InputFormat of the next job so that I don't have to use TextInputFormat
>>>> > (TextOutputFormat)? If I do use it, I need to parse the input file in
>>>> > the map function.
>>>> > I.e., if possible I want the next job to "consider" the input file as
>>>> > <key,value> and not plain text.
>>>> > Thanks a lot.
>>>> >
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Joseph Echeverria
>>>> Cloudera, Inc.
>>>> 443.305.9434
>>>>
>>>
>>>
>>
>>
>> --
>> Roger Chen
>> UC Davis Genome Center
>>
>
>
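
Roger's point about receiving Text, IntWritable at the head of the second
mapper can be sketched roughly as follows. This is an illustration under
assumptions, not anyone's actual code from the thread: the class ChainStage,
its configure method, the lastStage flag, and the pass-through mapper are all
hypothetical, and it assumes a Hadoop release that ships the
org.apache.hadoop.mapreduce ("new API") versions of these format classes.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver for one stage of a chained job.
public class ChainStage {

    // With SequenceFileInputFormat the mapper receives the previous job's
    // typed keys and values directly, so there is no text line to parse.
    public static class StageMapper extends Mapper<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void map(Text key, IntWritable value, Context context)
                throws IOException, InterruptedException {
            // Placeholder for the real secondary operation: pass records through.
            context.write(key, value);
        }
    }

    // Builds a map-only stage that reads the previous stage's sequence files.
    // Intermediate stages write sequence files; the last one writes plain text.
    public static Job configure(Configuration conf, Path in, Path out, boolean lastStage)
            throws IOException {
        Job job = new Job(conf, "chain-stage"); // Job.getInstance(...) on newer releases
        job.setJarByClass(ChainStage.class);
        job.setMapperClass(StageMapper.class);
        job.setNumReduceTasks(0);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        if (lastStage) {
            job.setOutputFormatClass(TextOutputFormat.class);
        } else {
            job.setOutputFormatClass(SequenceFileOutputFormat.class);
        }
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }
}
```

In a loop-driven chain, each iteration would call configure with the previous
iteration's output directory as `in` and then `job.waitForCompletion(true)`.
A stage configured with lastStage set to true is also one way to turn a
binary intermediate directory into readable text.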
