[
https://issues.apache.org/jira/browse/HADOOP-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tom White updated HADOOP-1231:
------------------------------
Attachment: MapReduceTypes.html
Because of the erasure problems mentioned above, I don't think we can generify
JobConf. This means that compile-time type-safety checking is lost there.
However, Map Reduce applications are still clearer: the types are explicit, so
casts aren't needed, and some runtime checking will be supported.
There are 6 type parameters: K1, V1, K2, V2, K3, V3, related in the familiar
Map Reduce way:
{noformat}
map: (K1, V1) -> list(K2, V2)
reduce: (K2, list(V2)) -> list(K3, V3)
{noformat}
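To make the relationship between the six type parameters concrete, here is a minimal sketch in Java. The interfaces and classes (SketchCollector, SketchMapper, SketchReducer, WordMapper, SumReducer, TypeSketch) are hypothetical, simplified stand-ins invented for illustration. They are not the API in the attached patch, and the sketch uses lambdas for brevity, so it assumes a newer Java than the one this issue targets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified interfaces (assumption: not the real Hadoop API).
interface SketchCollector<K, V> {
    void collect(K key, V value);
}

// map: (K1, V1) -> list(K2, V2)
interface SketchMapper<K1, V1, K2, V2> {
    void map(K1 key, V1 value, SketchCollector<K2, V2> out);
}

// reduce: (K2, list(V2)) -> list(K3, V3)
interface SketchReducer<K2, V2, K3, V3> {
    void reduce(K2 key, Iterable<V2> values, SketchCollector<K3, V3> out);
}

// K1 = Long (offset), V1 = String (line), K2 = String (word), V2 = Integer
class WordMapper implements SketchMapper<Long, String, String, Integer> {
    public void map(Long offset, String line, SketchCollector<String, Integer> out) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.collect(word, 1);
            }
        }
    }
}

// K2 = String, V2 = Integer, K3 = String, V3 = Long
class SumReducer implements SketchReducer<String, Integer, String, Long> {
    public void reduce(String word, Iterable<Integer> counts, SketchCollector<String, Long> out) {
        long total = 0;
        for (int c : counts) {
            total += c;
        }
        out.collect(word, total);
    }
}

class TypeSketch {
    // Runs map, groups the intermediate pairs by key, then runs reduce, all in memory.
    static Map<String, Long> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        WordMapper mapper = new WordMapper();
        long offset = 0;
        for (String line : lines) {
            mapper.map(offset, line,
                    (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
            offset += line.length() + 1;
        }
        Map<String, Long> result = new TreeMap<>();
        SumReducer reducer = new SumReducer();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reducer.reduce(e.getKey(), e.getValue(), result::put);
        }
        return result;
    }
}
```

Note that the mapper's output types (K2, V2) must match the reducer's input types, and no casts are needed in either map() or reduce().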
I have attached a table which shows which configuration properties are
constrained by which types.
This picture is further complicated by the fact that it is not always possible
to infer type parameters at runtime because of erasure (so, for example, we
can't infer the key type for LongSumReducer).
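The erasure problem can be illustrated with a small reflection sketch. All names here (TypedReducer, WordCountReducer, LongSumLike, ErasureSketch) are hypothetical and only mimic the shape of the real classes. LongSumLike is an assumed stand-in for a reducer that, like LongSumReducer, leaves its key type as a type variable.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical marker interface standing in for a generic Reducer.
interface TypedReducer<K2, V2, K3, V3> {}

// Binds every type parameter to a concrete class.
class WordCountReducer implements TypedReducer<String, Long, String, Long> {}

// Leaves the key type open, so it is only a type variable in the class file.
class LongSumLike<K> implements TypedReducer<K, Long, K, Long> {}

class ErasureSketch {
    // Returns the type arguments the class declares for TypedReducer,
    // or an empty array if it does not implement it directly.
    static Type[] reducerTypeArgs(Class<?> c) {
        for (Type t : c.getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) t;
                if (p.getRawType() == TypedReducer.class) {
                    return p.getActualTypeArguments();
                }
            }
        }
        return new Type[0];
    }

    // A type argument is usable at runtime only if it resolved to a Class.
    static boolean isResolved(Type t) {
        return t instanceof Class;
    }
}
```

For WordCountReducer all four arguments resolve to classes. For LongSumLike the value types resolve to Long, but the key arguments come back as the type variable K, and an instance created as LongSumLike<String> carries no record of String at runtime.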
The complex constraints on the configuration properties, combined with the
effect of erasure, make it hard to devise simple rules that users could follow
to work out how the types in their jobs would be inferred. So I don't think we
should try to infer the types for a job; rather, we should only check them for
consistency (at runtime).
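As a rough idea of what such a runtime consistency check might look like, here is a minimal sketch. CheckingCollector is a hypothetical class invented for illustration. It assumes the job's declared output key and value classes are available as Class objects (for example, read from the configuration), and it is not a description of how the check would actually be implemented.

```java
// Hypothetical collector that verifies each emitted pair against the
// key and value classes declared for the job (assumed to come from config).
class CheckingCollector {
    private final Class<?> keyClass;
    private final Class<?> valueClass;
    private int collected = 0;

    CheckingCollector(Class<?> keyClass, Class<?> valueClass) {
        this.keyClass = keyClass;
        this.valueClass = valueClass;
    }

    // Rejects any pair whose runtime classes disagree with the declaration.
    void collect(Object key, Object value) {
        check("key", keyClass, key);
        check("value", valueClass, value);
        collected++;
    }

    int collectedCount() {
        return collected;
    }

    private static void check(String role, Class<?> expected, Object actual) {
        if (!expected.isInstance(actual)) {
            String received = (actual == null) ? "null" : actual.getClass().getName();
            throw new IllegalStateException("Type mismatch in " + role
                    + ": expected " + expected.getName() + ", received " + received);
        }
    }
}
```

The point of the design is that nothing is inferred: the declared classes are taken as given, and a mismatch fails fast with a message naming both the expected and the received type.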
Furthermore, I propose doing this consistency checking in a separate Jira
issue, leaving this one to deal with generifying the Map Reduce public API
(which in itself is quite a big change).
Thoughts?
> Add generics to Mapper and Reducer interfaces
> ---------------------------------------------
>
> Key: HADOOP-1231
> URL: https://issues.apache.org/jira/browse/HADOOP-1231
> Project: Hadoop
> Issue Type: Improvement
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Tom White
> Attachments: HADOOP-1231.patch, MapReduceTypes.html
>
>
> By making the input and output types of the Mapper and Reducers generic, we
> can get the information from the classes and not require the user to set them
> in the configuration.