Re: 2 Reduce method in one Job

Harsh J Sun, 24 Mar 2013 07:17:18 -0700

Yes, just use an identity mapper (in the new API the base Mapper class
itself identity-maps; in the old API use the IdentityMapper class) and set
the second job's input path to the output path of the first job.
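
To make the data flow concrete, here is a small plain-Java model of the
two-job chain. It is not Hadoop code: the class and method names
(ChainedReduceSketch, identityMap, shuffle, reduce, runBothJobs) and the
word-count sample data are made up for illustration. It only shows that
the first reduce's output, passed through an identity map, gets regrouped
by its new key before the second reduce runs:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

// Plain-Java model of two chained jobs; none of these names are Hadoop APIs.
class ChainedReduceSketch {

    // Identity mapper: emits every input record unchanged.
    static List<Map.Entry<String, Integer>> identityMap(List<Map.Entry<String, Integer>> in) {
        return new ArrayList<>(in);
    }

    // Shuffle: group values by key, in sorted key order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> records) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> r : records) {
            groups.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        return groups;
    }

    // Reduce phase: call the reducer once per key group.
    static List<Map.Entry<String, Integer>> reduce(
            Map<String, List<Integer>> groups,
            BiFunction<String, List<Integer>, Map.Entry<String, Integer>> reducer) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            out.add(reducer.apply(g.getKey(), g.getValue()));
        }
        return out;
    }

    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) total += v;
        return total;
    }

    // Job 1: sum per word, re-keyed by the word's first letter (assumes non-empty words).
    // Job 2: identity map, then sum per letter.
    static List<Map.Entry<String, Integer>> runBothJobs(List<Map.Entry<String, Integer>> input) {
        List<Map.Entry<String, Integer>> job1Output = reduce(shuffle(input),
                (word, ones) -> new SimpleEntry<>(word.substring(0, 1), sum(ones)));
        // job1Output plays the role of the first job's output path.
        return reduce(shuffle(identityMap(job1Output)),
                (letter, counts) -> new SimpleEntry<>(letter, sum(counts)));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = new ArrayList<>();
        for (String w : new String[] {"apple", "avocado", "apple", "banana"}) {
            input.add(new SimpleEntry<>(w, 1));
        }
        System.out.println(runBothJobs(input)); // prints [a=3, b=1]
    }
}
```

In a real setup the hand-off between the two jobs goes through HDFS rather
than an in-memory list, which is exactly the materialization step that
cannot be skipped today.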


If you end up doing more of this step-wise job chaining,
consider using Apache Oozie's workflow system.

On Sun, Mar 24, 2013 at 7:23 PM, Fatih Haltas <fatih.hal...@nyu.edu> wrote:
> Thank you very much.
>
> You are right, Harsh, that is exactly what I am trying to do.
>
> I want to process my result according to the keys, and I do not want to
> spend time writing this data to HDFS; I want to pass the data as input to
> another reduce.
>
> One more question, then:
> If I create 2 different jobs, where the second one has only a reduce, for
> example, is it possible to pass the first job's output as an argument to
> the second job?
>
>
> On Sun, Mar 24, 2013 at 5:44 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> You seem to want to re-sort/partition your data without materializing
>> it onto HDFS.
>>
>> Azuryy is right: There isn't a way right now, and a second job (with an
>> identity mapper) is necessary. With YARN, though, this becomes more
>> feasible to implement in the project.
>>
>> The newly inducted incubator project Tez sort of targets this. It's in
>> its nascent stages, though (for general use), and the website
>> should hopefully appear at
>> http://incubator.apache.org/projects/tez.html soon. Meanwhile, you can
>> read the proposal behind this project at
>> http://wiki.apache.org/incubator/TezProposal. Initial sources are at
>> https://svn.apache.org/repos/asf/incubator/tez/trunk/.
>>
>> On Sun, Mar 24, 2013 at 6:33 PM, Fatih Haltas <fatih.hal...@nyu.edu>
>> wrote:
>> > I want to get the reduce output as key and value, then pass them to a
>> > new reduce as input key and input value.
>> >
>> > So is there any Map-Reduce-Reduce kind of method?
>> >
>> > Thanks to all.
>>
>>
>>
>> --
>> Harsh J
>
>



--
Harsh J
