Re: IdentityMapper

Arkady Borkovsky Wed, 19 Apr 2006 21:33:57 -0700

Eric has a great point.

It is pretty common to produce a set of records in the map step, group them by key in the reduce step, and store them for future use. Whenever this data is used, it is already grouped by key and essentially ready for reduce.

Special casing for this may be a useful optimization.
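
To make the idea concrete, here is a minimal sketch in plain Python, not the Hadoop API. The names reduce_sorted and stored are hypothetical, and it assumes the stored records are (key, value) pairs already grouped by key. Under that assumption the reduce can stream over the input directly, with no sort or shuffle in front of it:

```python
from itertools import groupby
from operator import itemgetter


def reduce_sorted(records, reduce_fn):
    """Apply reduce_fn to each run of consecutive records sharing a key.

    Assumes `records` is already grouped by key (for example, because it
    was written by an earlier reduce step), so no sort is performed here.
    """
    for key, group in groupby(records, key=itemgetter(0)):
        # `group` is consumed lazily; only one key's values are in flight.
        yield key, reduce_fn(value for _, value in group)


# Hypothetical stored output of an earlier job, already grouped by key.
stored = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
totals = list(reduce_sorted(stored, sum))
```

If the input were not actually grouped, the same key would show up in more than one output record, so a special case like this only pays off when the grouping is guaranteed.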


-- ab

On Apr 19, 2006, at 5:34 PM, Eric Baldeschwieler wrote:

might be cool to special case a reduce on sorted input.

On Apr 18, 2006, at 12:28 PM, Doug Cutting wrote:
Stefan Groschupf wrote:
What is the reason that each job that has no mapper defined runs the IdentityMapper? Handling different formats (as discussed) between mapping and reducing is difficult. Having one job that just maps in one format and another job that just reduces in another format would be a nice workaround for the format problem, but the IdentityMapper makes this workaround impossible.
Stefan,
I don't understand the problem here. Some map function is required for any data to make it to reduce. IdentityMapper simply copies all map input without altering it. How does this cause you problems? Would you prefer a NullMapper by default, that does nothing? That would result in no output sent to reduce.
Thanks,

Doug
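
The difference between the two defaults Doug describes can be sketched in a few lines of plain Python. This is an illustration only, not Hadoop code: identity_mapper, null_mapper and run_map are hypothetical names, and records are assumed to be simple (key, value) pairs.

```python
def identity_mapper(key, value):
    """Pass every input record through unchanged."""
    yield key, value


def null_mapper(key, value):
    """Emit nothing, whatever the input is."""
    return iter(())


def run_map(mapper, records):
    """Collect everything the mapper emits; this is what reduce would receive."""
    return [out for key, value in records for out in mapper(key, value)]


# Hypothetical input records.
records = [("url1", "page1"), ("url2", "page2")]
identity_out = run_map(identity_mapper, records)  # same records as the input
null_out = run_map(null_mapper, records)          # nothing reaches reduce
```

With the identity default, a job that defines no mapper still delivers its whole input to reduce; with a null default, reduce would have nothing to work on.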
