-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 24, 2017, 10:29 a.m.)


Review request for oozie, Peter Bacsko and Robert Kanter.


Repository: oozie-git


Description
-------

Before Oozie on YARN, ``JobSubmitter`` from MapReduce (more precisely ``TokenCache.obtainTokensForNamenodes``) took care of obtaining delegation tokens for the HDFS namenodes specified by ``oozie.launcher.mapreduce.job.hdfs-servers`` before submitting the Oozie launcher job. The Oozie launcher is now a YARN Application Master. It needs HDFS delegation tokens to be able to copy files between secure clusters via the Oozie DistCp action.

Changes:
- ``JavaActionExecutor`` was modified to handle the DistCp-related parameters ``oozie.launcher.mapreduce.job.hdfs-servers`` and ``oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude``.
- ``HDFSCredentials`` was changed to reuse ``TokenCache.obtainTokensForNamenodes`` to obtain HDFS delegation tokens.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java 81e28f722d9ecd0bf972bf2d0a684d207547d165 
  core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab 
  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38 


Diff: https://reviews.apache.org/r/63875/diff/6/

Changes: https://reviews.apache.org/r/63875/diff/5-6/


Testing
-------

Tested on a secure cluster that the Oozie DistCp action can copy a file from another secure cluster that uses a different Kerberos realm.

- workflow:
```
<workflow-app name="DistCp" xmlns="uri:oozie:workflow:0.5">
    <start to="distcp-3a1f"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="distcp-3a1f">
        <distcp xmlns="uri:oozie:distcp-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.launcher.mapreduce.job.dfs.namenode.kerberos.principal.pattern</name>
                    <value>*</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.job.hdfs-servers</name>
                    <value>hdfs://oozie.test1.com:8020,hdfs://remote.test2.com:8020</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude</name>
                    <value>remote.test2.com</value>
                </property>
            </configuration>
            <arg>hdfs://remote.test2.com:8020/tmp/1</arg>
            <arg>hdfs://oozie.test1.com:8020/tmp/2</arg>
        </distcp>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
```

Prior to executing the workflow, I had to set up cross-realm trust between the secure test clusters. This involved:
- changing the Kerberos configuration ``/etc/krb5.conf`` (adding the realms and setting additional properties like ``udp_preference_limit = 1``)
- regenerating service credentials
- changing HDFS settings (such as ``dfs.namenode.kerberos.principal.pattern``) and adding a Hadoop auth-to-local rule like ``RULE:[2:$1](.*)s/(.*)/$1/g``
- additional configuration to enable trust between the test Hadoop clusters


Thanks,

Attila Sasvari
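
For reviewers who want a concrete picture of the ``TokenCache``-based approach described above, below is a minimal, untested sketch of how HDFS delegation tokens can be obtained for the namenodes listed in ``mapreduce.job.hdfs-servers``. The class name ``HdfsDelegationTokenSketch`` and the helper method are illustrative only and are not the code in this patch; only ``TokenCache.obtainTokensForNamenodes`` and the property names come from the description.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.security.TokenCache;
import org.apache.hadoop.security.Credentials;

// Hypothetical sketch, not the actual Oozie patch.
public class HdfsDelegationTokenSketch {

    /**
     * Obtains HDFS delegation tokens for every namenode listed in
     * mapreduce.job.hdfs-servers and adds them to the given Credentials.
     */
    static void obtainTokensForConfiguredNamenodes(Credentials credentials, Configuration conf)
            throws IOException {
        String namenodes = conf.get("mapreduce.job.hdfs-servers");
        if (namenodes == null || namenodes.isEmpty()) {
            return;
        }
        String[] uris = namenodes.split(",");
        Path[] paths = new Path[uris.length];
        for (int i = 0; i < uris.length; i++) {
            paths[i] = new Path(uris[i].trim());
        }
        // TokenCache contacts each namenode and stores the resulting
        // delegation tokens in the Credentials object. It also reads
        // mapreduce.job.hdfs-servers.token-renewal.exclude from conf.
        TokenCache.obtainTokensForNamenodes(credentials, paths, conf);
    }
}
```

The ``...token-renewal.exclude`` property is set for the remote namenode in the workflow above, presumably so that the local cluster does not attempt to renew tokens issued by a namenode in the foreign realm, which would fail.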