[
https://issues.apache.org/jira/browse/HADOOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615888#action_12615888
]
Alejandro Abdelnur commented on HADOOP-1230:
--------------------------------------------
*On 'Interfaces vs Abstract classes':*
I won't go into the pros and cons; I'll just suggest an approach that IMO
works pretty well.
* Use interfaces when the framework injects the object instance during
execution, for example the proposed context for {{Mapper.map(MapContext)}}
and {{Reducer.reduce(ReduceContext)}}. In this case, since it is always the
framework that creates these objects and they are not replaceable, nothing
in application code would break when the interfaces are extended. Take as an
example of this pattern the evolution of the Java Servlet request and
response interfaces (see the first sketch after this list).
* Provide wrapper classes for interfaces meant to be implemented by the
application, and ensure the framework only accepts application
implementations that extend such wrapper classes. By doing this, if an
interface is extended, the corresponding wrapper class is extended as well,
so the application implementation will not break. Take as an example of this
pattern the Java Servlet request and response wrappers; also note that the
wrappers were only introduced in Servlet 2.3, before that they did not
exist, so you can start with an interface and later provide a wrapper class
for it for applications wanting to override some methods (see the second
sketch after this list).
* Provide abstract classes for entities that are meant to be extended by the
application; what is being done with {{OutputFormat}} is a good example of
this.
* Provide concrete classes for entities that provide cooked functionality,
like {{Configuration}} and {{JobConf}}.
* Provide interfaces for lifecycle management, as is being done with
{{Configurable}}.
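To make the first point concrete, here is a minimal sketch; the names and
generic signatures are mine for illustration, not the actual API proposed in
the attached patches:

{code:java}
import java.io.IOException;

// Illustrative only -- not the actual proposed Hadoop API. The framework is
// the only party that ever creates a MapContext, so a method added to this
// interface in a later release only obliges the framework's implementation
// to change; application Mappers keep compiling untouched.
interface MapContext<KIN, VIN, KOUT, VOUT> {
    KIN getKey();
    VIN getValue();
    void collect(KOUT key, VOUT value) throws IOException;
    void progress();
}

// The application implements Mapper, but only ever *receives* a MapContext.
interface Mapper<KIN, VIN, KOUT, VOUT> {
    void map(MapContext<KIN, VIN, KOUT, VOUT> context) throws IOException;
}
{code}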
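And a sketch of the wrapper-class point, modeled on the Servlet 2.3
{{HttpServletRequestWrapper}}; {{RecordReader}} here is a hypothetical
application-implemented interface, not Hadoop's actual one:

{code:java}
import java.io.IOException;

// A hypothetical application-implemented interface. Implemented directly,
// it would break application code on every added method...
interface RecordReader {
    boolean next() throws IOException;
}

// ...so the framework ships a wrapper that delegates every method, and only
// accepts application classes that extend it. When RecordReader later gains
// a method, the wrapper gains a delegating implementation, and existing
// application subclasses keep compiling.
class RecordReaderWrapper implements RecordReader {
    private final RecordReader delegate;

    RecordReaderWrapper(RecordReader delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean next() throws IOException {
        return delegate.next();
    }
}
{code}

An application then extends the wrapper and overrides only the methods it
cares about.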
*On the use of Context for {{Mapper.map}} and {{Reducer.reduce}}:*
Have you considered defining (like the Java Servlet API) request and response
parameters, where the request is the INPUT (normally read-only) and the
response is the OUTPUT?
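To illustrate, a sketch by analogy with the Servlet
{{service(ServletRequest, ServletResponse)}} signature; {{MapInput}} and
{{MapOutput}} are hypothetical names, not part of any attached patch:

{code:java}
import java.io.IOException;

// The INPUT side: a read-only view of the current record.
interface MapInput<K, V> {
    K getKey();
    V getValue();
}

// The OUTPUT side: collection and status reporting.
interface MapOutput<K, V> {
    void collect(K key, V value) throws IOException;
    void progress();
}

interface Mapper<K1, V1, K2, V2> {
    void map(MapInput<K1, V1> input, MapOutput<K2, V2> output)
        throws IOException;
}
{code}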
> Replace parameters with context objects in Mapper, Reducer, Partitioner,
> InputFormat, and OutputFormat classes
> --------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1230
> URL: https://issues.apache.org/jira/browse/HADOOP-1230
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Attachments: context-objs-2.patch, context-objs-3.patch,
> context-objs.patch
>
>
> This is a big change, but it will future-proof our APIs. To maintain
> backwards compatibility, I'd suggest that we move over to a new package name
> (org.apache.hadoop.mapreduce) and deprecate the old interfaces and package.
> Basically, it will replace:
> package org.apache.hadoop.mapred;
> public interface Mapper extends JobConfigurable, Closeable {
>   void map(WritableComparable key, Writable value, OutputCollector output,
>            Reporter reporter) throws IOException;
> }
> with:
> package org.apache.hadoop.mapreduce;
> public interface Mapper extends Closeable {
>   void map(MapContext context) throws IOException;
> }
> where MapContext has methods like getKey(), getValue(), collect(Key,
> Value), progress(), etc.