[
https://issues.apache.org/jira/browse/HADOOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615888#action_12615888
]
Alejandro Abdelnur commented on HADOOP-1230:
--------------------------------------------
*On 'Interfaces vs Abstract classes':*
I won't go into the pros and cons; I'll just suggest an approach that IMO
works pretty well.
* Use interfaces when the framework injects the object instance during
execution, for example the proposed context for {{Mapper.map(MapContext)}}
and {{Reducer.reduce(ReduceContext)}}. In this case, since it is always the
framework that creates these objects and they are not replaceable, nothing
in application code would break when the interfaces are extended. Take as an
example of this pattern the evolution of the Java Servlet request and
response interfaces (see the first sketch after this list).
* Provide wrapper classes for interfaces meant to be implemented by the
application, and ensure the framework only accepts application
implementations that extend such wrapper classes. By doing this, if an
interface is extended, the corresponding wrapper class is extended as well,
so the application implementation will not break. Take as an example of this
pattern the Java Servlet request and response wrappers; also note that the
wrappers were only introduced in Servlet 2.3, before that they did not
exist, so you can start with an interface and later provide a wrapper class
for it for applications wanting to override some methods (see the second
sketch after this list).
* Provide abstract classes for entities that are meant to be extended by the
application; what is being done with {{OutputFormat}} is a good example of
this.
* Provide concrete classes for entities that provide cooked functionality,
like {{Configuration}} and {{JobConf}}.
* Provide interfaces for lifecycle management, as is being done with
{{Configurable}}.
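To make the first point concrete, here is a minimal sketch; the names and
generic signatures are mine for illustration, not the actual API proposed in
the attached patches:

{code:java}
import java.io.IOException;

// Illustrative only -- not the actual proposed Hadoop API. The framework is
// the only party that ever creates a MapContext, so a method added to this
// interface in a later release only obliges the framework's implementation
// to change; application Mappers keep compiling untouched.
interface MapContext<KIN, VIN, KOUT, VOUT> {
    KIN getKey();
    VIN getValue();
    void collect(KOUT key, VOUT value) throws IOException;
    void progress();
}

// The application implements Mapper, but only ever *receives* a MapContext.
interface Mapper<KIN, VIN, KOUT, VOUT> {
    void map(MapContext<KIN, VIN, KOUT, VOUT> context) throws IOException;
}
{code}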
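And a sketch of the wrapper-class point, modeled on the Servlet 2.3
{{HttpServletRequestWrapper}}; {{RecordReader}} here is a hypothetical
application-implemented interface, not Hadoop's actual one:

{code:java}
import java.io.IOException;

// A hypothetical application-implemented interface. Implemented directly,
// it would break application code on every added method...
interface RecordReader {
    boolean next() throws IOException;
}

// ...so the framework ships a wrapper that delegates every method, and only
// accepts application classes that extend it. When RecordReader later gains
// a method, the wrapper gains a delegating implementation, and existing
// application subclasses keep compiling.
class RecordReaderWrapper implements RecordReader {
    private final RecordReader delegate;

    RecordReaderWrapper(RecordReader delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean next() throws IOException {
        return delegate.next();
    }
}
{code}

An application then extends the wrapper and overrides only the methods it
cares about.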
*On the use of Context for {{Mapper.map}} and {{Reducer.reduce}}:*
Have you considered defining (like the Java Servlet API) request and response
parameters, where the request is the INPUT (normally read-only) and the
response is the OUTPUT?
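To illustrate, a sketch by analogy with the Servlet
{{service(ServletRequest, ServletResponse)}} signature; {{MapInput}} and
{{MapOutput}} are hypothetical names, not part of any attached patch:

{code:java}
import java.io.IOException;

// The INPUT side: a read-only view of the current record.
interface MapInput<K, V> {
    K getKey();
    V getValue();
}

// The OUTPUT side: collection and status reporting.
interface MapOutput<K, V> {
    void collect(K key, V value) throws IOException;
    void progress();
}

interface Mapper<K1, V1, K2, V2> {
    void map(MapInput<K1, V1> input, MapOutput<K2, V2> output)
        throws IOException;
}
{code}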
> Replace parameters with context objects in Mapper, Reducer, Partitioner,
> InputFormat, and OutputFormat classes
> --------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1230
> URL: https://issues.apache.org/jira/browse/HADOOP-1230
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Attachments: context-objs-2.patch, context-objs-3.patch,
> context-objs.patch
>
>
> This is a big change, but it will future-proof our APIs. To maintain
> backwards compatibility, I'd suggest that we move over to a new package name
> (org.apache.hadoop.mapreduce) and deprecate the old interfaces and package.
> Basically, it will replace:
> package org.apache.hadoop.mapred;
> public interface Mapper extends JobConfigurable, Closeable {
>   void map(WritableComparable key, Writable value, OutputCollector output,
>            Reporter reporter) throws IOException;
> }
> with:
> package org.apache.hadoop.mapreduce;
> public interface Mapper extends Closeable {
>   void map(MapContext context) throws IOException;
> }
> where MapContext has methods like getKey(), getValue(), collect(Key,
> Value), progress(), etc.