[ https://issues.apache.org/jira/browse/OOZIE-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082479#comment-14082479 ]
Robert Kanter commented on OOZIE-1954:
--------------------------------------

The test failures are unrelated.

> Add a way for the MapReduce action to be configured by Java code
> ----------------------------------------------------------------
>
>                 Key: OOZIE-1954
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1954
>             Project: Oozie
>          Issue Type: New Feature
>    Affects Versions: trunk
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-1954.patch
>
>
> With certain other components (e.g. Avro, HFileOutputFormat (HBase)), it
> becomes impractical to use the MapReduce action, and users must instead use
> the Java action. The problem is that these components require a lot of extra
> configuration that is often hidden from the user in Java code (e.g.
> {{HFileOutputFormat.configureIncrementalLoad(job, table);}}), which can also
> include decision logic, serialization, and other things that we can't do
> directly in an XML file.
> One way to solve this problem is to allow the user to give the MR action some
> Java code that would do this configuration, similar to how we allow the
> {{<job-xml>}} field to specify an external XML file of configuration
> properties.
> In more detail, we could have an interface, something like this:
> {code}
> import org.apache.hadoop.conf.Configuration;
>
> public interface OozieActionConfigurator {
>     public void updateOozieActionConfiguration(Configuration conf);
> }
> {code}
> The user would implement this interface, package it in a jar, and include the
> jar with their MR action (i.e. we would add a "{{<config-class>}}" field that
> lets them specify the class name). To protect the Oozie server from running
> user code (which could do anything it wants), the code would have to run in
> the Launcher Job. The Launcher Job could call this method after it loads the
> configuration prepared by the Oozie server.
> This would also help users who use the Java action to launch MR jobs and
> expect a number of things to be done for them that are not (e.g. delegation
> token propagation, config loading, returning the Hadoop job to Oozie). These
> are all done by the MR action, so the more users we can move from the Java
> action to the MR action, the less often they will run into these difficulties.
> Some of this may change slightly as I actually implement it (e.g. I will have
> to handle thrown exceptions). One thing I may do is keep this general enough
> to be compatible with all action types, in case we want to add it to any of
> them in the future; for now, though, the schema would only accept it for the
> MapReduce action.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
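
For illustration only, here is a minimal sketch of how the interface proposed in the quoted description might be implemented and invoked. It is not the code from the attached patch. The Configuration class below is a tiny stand-in for org.apache.hadoop.conf.Configuration so that the sketch compiles without Hadoop on the classpath, and ExampleConfigurator, LauncherSketch, applyConfigurator, and the "example.*" property names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.conf.Configuration (assumption: only
// set/get are modelled, so the sketch runs without Hadoop).
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    void set(String name, String value) { props.put(name, value); }
    String get(String name) { return props.get(name); }
}

// The interface as proposed in the issue description.
interface OozieActionConfigurator {
    void updateOozieActionConfiguration(Configuration conf);
}

// Hypothetical user implementation: derives one property from another,
// a small piece of decision logic that a static XML file cannot express.
class ExampleConfigurator implements OozieActionConfigurator {
    @Override
    public void updateOozieActionConfiguration(Configuration conf) {
        String table = conf.get("example.table");
        String suffix = (table == null) ? "default" : table;
        conf.set("example.output.dir", "/tmp/hfiles/" + suffix);
    }
}

// Hypothetical launcher-side step: after the server-prepared configuration
// is loaded, instantiate the class named by the proposed <config-class>
// field and let it adjust the configuration.
class LauncherSketch {
    static Configuration applyConfigurator(Configuration conf, String className) {
        try {
            OozieActionConfigurator configurator = Class.forName(className)
                    .asSubclass(OozieActionConfigurator.class)
                    .getDeclaredConstructor()
                    .newInstance();
            configurator.updateOozieActionConfiguration(conf);
            return conf;
        } catch (ReflectiveOperationException e) {
            // A real launcher would need a deliberate error-handling policy.
            throw new IllegalStateException("Cannot run configurator " + className, e);
        }
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("example.table", "users");
        applyConfigurator(conf, ExampleConfigurator.class.getName());
        System.out.println(conf.get("example.output.dir"));
    }
}
```

Loading the class by name keeps the user code out of the Oozie server: only the process that calls applyConfigurator (the Launcher Job in the proposal) ever instantiates it.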