I want to get the reduce output as key and value, and then pass them to a
new reduce as the input key and input value.
So is there any Map-Reduce-Reduce kind of method?
Thanks to all.
There isn't such a method; you have to submit another MR job.
On Mar 24, 2013 9:03 PM, Fatih Haltas fatih.hal...@nyu.edu wrote:
I want to get the reduce output as key and value, and then pass them to a
new reduce as the input key and input value.
So is there any Map-Reduce-Reduce kind of method?
You seem to want to re-sort/partition your data without materializing
it onto HDFS.
Azuryy is right: there isn't a way right now, and a second job (with an
identity mapper) is necessary. With YARN, though, such a feature becomes
more feasible to implement in the project.
The newly inducted incubator project
Thank you very much.
You are right Harsh, it is exactly what I am trying to do.
I want to process my result according to the keys, and I do not want to spend
time writing this data to HDFS; I want to pass the data as input to another reduce.
One more question then:
Creating 2 different jobs, the second one has
Yes, just use an identity mapper (in the new API, the base Mapper class
itself identity-maps; in the old API, use the IdentityMapper class) and set
the input path as the output path of the first job.
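To make the idea concrete, here is a plain-Java sketch of what the chaining does conceptually (no Hadoop APIs; the class and method names below are made up for illustration, and the shuffle/reduce signatures are simplified stand-ins): the second "job" runs an identity map over the first reduce's output, then shuffles and reduces again.

```java
import java.util.*;

// Illustrative in-memory sketch of Map-Reduce -> identity-Map-Reduce chaining.
public class ChainSketch {

    // Group values by key, as the shuffle phase would.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce: sum the values for each key.
    static List<Map.Entry<String, Integer>> reduceSum(Map<String, List<Integer>> grouped) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            out.add(Map.entry(e.getKey(), sum));
        }
        return out;
    }

    // Identity map: pass each (key, value) pair through unchanged, so the
    // next shuffle/reduce sees the first reduce's output as its input.
    static List<Map.Entry<String, Integer>> identityMap(List<Map.Entry<String, Integer>> pairs) {
        return new ArrayList<>(pairs);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
            Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        // Job 1: map (omitted) -> shuffle -> reduce
        List<Map.Entry<String, Integer>> job1 = reduceSum(shuffle(input));
        // Job 2: identity map -> shuffle -> reduce
        List<Map.Entry<String, Integer>> job2 = reduceSum(shuffle(identityMap(job1)));
        System.out.println(job2); // prints [a=4, b=2]
    }
}
```

In real Hadoop terms, "identityMap" is what the base Mapper (new API) or IdentityMapper (old API) gives you, and "set the input path as the output path of the first job" wires job1's output into job2.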
If you end up doing more such step-wise job chaining,
consider using Apache Oozie's workflow
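For illustration, an Oozie workflow declares each job as an XML action and wires them together with ok/error transitions; a minimal sketch (action bodies elided, all names and transitions hypothetical):

```xml
<workflow-app name="mr-chain" xmlns="uri:oozie:workflow:0.4">
  <start to="first-mr"/>
  <action name="first-mr">
    <!-- job 1: real mapper + reducer; writes to an intermediate dir -->
    <map-reduce> ... </map-reduce>
    <ok to="second-mr"/>
    <error to="fail"/>
  </action>
  <action name="second-mr">
    <!-- job 2: identity mapper; input dir = job 1's output dir -->
    <map-reduce> ... </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>MR chain failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Oozie then handles submitting the second job only after the first succeeds, which replaces hand-written driver code for longer chains.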