Robert Kanter created OOZIE-1954:
------------------------------------

             Summary: Add a way for the MapReduce action to be configured by 
Java code
                 Key: OOZIE-1954
                 URL: https://issues.apache.org/jira/browse/OOZIE-1954
             Project: Oozie
          Issue Type: New Feature
    Affects Versions: trunk
            Reporter: Robert Kanter
            Assignee: Robert Kanter


With certain other components (e.g. Avro, HFileOutputFormat (HBase), etc.), it 
becomes impractical to use the MapReduce action, and users must instead use the 
Java action. The problem is that these components require a lot of extra 
configuration that is often hidden from the user in Java code (e.g. 
{{HFileOutputFormat.configureIncrementalLoad(job, table);}}), which can also 
include decision logic, serialization, and other things that we can't do in an 
XML file directly.

One way to solve this problem is to allow the user to give the MR action some 
Java code that would do this configuration, similar to how we allow the 
{{<job-xml>}} field to specify an external XML file of configuration properties.
In more detail, we could have an interface; something like this:
{code}
public interface OozieActionConfigurator {
    void updateOozieActionConfiguration(Configuration conf);
}
{code}
that the user can implement, package in a jar, and include with their MR action 
(i.e. add a "{{<config-class>}}" field that lets them specify the class name). 
To protect the Oozie server from running user code (which could do anything it 
wants, really), it would have to be run in the Launcher Job. The Launcher Job 
could call this method after it loads the configuration prepared by the Oozie 
server.
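
To make the idea concrete, here is a rough sketch of what a user's 
implementation and the Launcher Job's side might look like. This is only an 
illustration, not the actual implementation: {{Configuration}} below is a tiny 
stand-in for {{org.apache.hadoop.conf.Configuration}} so that the example 
compiles without Hadoop, and {{ConfiguratorSketch}}, {{MyConfigurator}}, 
{{applyConfigurator}}, and the {{example.configured.by}} property are 
hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfiguratorSketch {

    /** Stand-in for org.apache.hadoop.conf.Configuration (assumption, for a self-contained example). */
    public static class Configuration {
        private final Map<String, String> props = new HashMap<>();

        public void set(String name, String value) {
            props.put(name, value);
        }

        public String get(String name) {
            return props.get(name);
        }
    }

    /** The interface proposed in this issue. */
    public interface OozieActionConfigurator {
        void updateOozieActionConfiguration(Configuration conf);
    }

    /** Hypothetical user class that would be named in the <config-class> field. */
    public static class MyConfigurator implements OozieActionConfigurator {
        @Override
        public void updateOozieActionConfiguration(Configuration conf) {
            // Decision logic like this cannot be expressed in a static XML file.
            if (conf.get("mapreduce.job.reduces") == null) {
                conf.set("mapreduce.job.reduces", "4");
            }
            // Made-up property, only here to show that user code ran.
            conf.set("example.configured.by", getClass().getName());
        }
    }

    /**
     * Hypothetical launcher-side helper: load the user's class by name and let it
     * modify the configuration that the Oozie server already prepared.
     */
    public static Configuration applyConfigurator(Configuration conf, String className)
            throws ReflectiveOperationException {
        Class<? extends OozieActionConfigurator> klass =
                Class.forName(className).asSubclass(OozieActionConfigurator.class);
        OozieActionConfigurator configurator = klass.getDeclaredConstructor().newInstance();
        configurator.updateOozieActionConfiguration(conf);
        return conf;
    }

    public static void main(String[] args) throws Exception {
        // Pretend this configuration was prepared by the Oozie server.
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.name", "demo");

        applyConfigurator(conf, MyConfigurator.class.getName());
        System.out.println("reduces = " + conf.get("mapreduce.job.reduces"));
    }
}
```

Loading the class by name only inside the launcher keeps the user's code off 
the Oozie server, which is the point of running it in the Launcher Job.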

Another place this will be helpful is with users who use the Java action to 
launch MR jobs and expect a bunch of things to be done for them that are not 
(e.g. delegation token propagation, config loading, returning the Hadoop job to 
Oozie, etc.). These are all done with the MR action, so the more users we can 
move from the Java action to the MR action, the less they'll run into these 
difficulties.

Some of this may change slightly as I try to actually implement it (e.g. I'll 
have to handle thrown exceptions, etc.). One thing I may do is keep this 
general enough that it would be compatible with all action types, in case we 
want to add it to any of them in the future; for now, though, the schema would 
only accept it for the MapReduce action.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
