-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191804
-----------------------------------------------------------




core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java
Lines 39-42 (patched)
<https://reviews.apache.org/r/63875/#comment269716>

    It would be nice to have field-level Javadoc here explaining why these
    constants are needed. Linking to the similar properties in the Hadoop repo
    would also be nice.
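
    For example (the field names below are my guesses, not necessarily what
    the patch uses):

    ```
    /**
     * Comma-separated list of NameNode URIs for which the launcher has to
     * obtain HDFS delegation tokens before a DistCp action is submitted.
     * Launcher-prefixed counterpart of the MapReduce property
     * mapreduce.job.hdfs-servers (see TokenCache in hadoop-mapreduce-client).
     */
    static final String OOZIE_LAUNCHER_HDFS_SERVERS =
            "oozie.launcher.mapreduce.job.hdfs-servers";

    /**
     * Hosts to exclude from delegation token renewal, e.g. a NameNode in a
     * remote Kerberos realm that the local ResourceManager cannot renew
     * against. Counterpart of mapreduce.job.hdfs-servers.token-renewal.exclude.
     */
    static final String OOZIE_LAUNCHER_HDFS_SERVERS_TOKEN_RENEWAL_EXCLUDE =
            "oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude";
    ```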



core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 59 (patched)
<https://reviews.apache.org/r/63875/#comment269718>

    I would move the `INFO` log inside the delegate method.
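
    Something along these lines (method and variable names are illustrative
    only):

    ```
    private void obtainTokensForNamenodes(final Credentials credentials,
            final Configuration config) throws IOException {
        // log next to the actual token fetching instead of at the call site
        LOG.info("Obtaining HDFS delegation tokens for server(s) [{0}]",
                config.get("mapreduce.job.hdfs-servers"));
        TokenCache.obtainTokensForNamenodes(credentials, new Path[0], config);
    }
    ```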



core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 98-111 (patched)
<https://reviews.apache.org/r/63875/#comment269717>

    An `INFO` level log message stating which tokens are obtained from where,
    similar to the other method, would be nice.
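
    E.g. something like this (sketch only):

    ```
    // credentials is an org.apache.hadoop.security.Credentials instance
    for (Token<?> token : credentials.getAllTokens()) {
        LOG.info("Obtained token of kind [{0}] for service [{1}]",
                token.getKind(), token.getService());
    }
    ```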



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1375-1377 (patched)
<https://reviews.apache.org/r/63875/#comment269719>

    Some `DEBUG` level logging of the propagated properties would be useful
    here, e.g.:
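
    ```
    // hdfsServers is illustrative; log whatever ends up in the launcher conf
    LOG.debug("Propagating [{0}]=[{1}] to the launcher configuration",
            "mapreduce.job.hdfs-servers", hdfsServers);
    ```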


- András Piros


On Nov. 22, 2017, 3:11 p.m., Attila Sasvari wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63875/
> -----------------------------------------------------------
> 
> (Updated Nov. 22, 2017, 3:11 p.m.)
> 
> 
> Review request for oozie, Peter Bacsko and Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> Before Oozie on YARN, ``JobSubmitter`` from MapReduce (more precisely 
> ``TokenCache.obtainTokensForNamenodes``) took care of obtaining delegation 
> tokens for HDFS nodes specified by 
> ``oozie.launcher.mapreduce.job.hdfs-servers`` before submitting the Oozie 
> launcher job.
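> 
> For reference, the relevant MapReduce call looks roughly like this
> (``submitJobDir`` is just a placeholder; sketch only):
> 
> ```
> // Besides the explicit paths, TokenCache also fetches a delegation token
> // for every NameNode listed under mapreduce.job.hdfs-servers in the conf.
> Credentials credentials = new Credentials();
> TokenCache.obtainTokensForNamenodes(credentials, new Path[] { submitJobDir }, conf);
> ```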
> 
> The Oozie launcher is now a YARN ApplicationMaster. It needs HDFS delegation 
> tokens to be able to copy files between secure clusters via the Oozie DistCp 
> action. 
> 
> Changes:
> - ``JavaActionExecutor`` was modified to handle the DistCp-related parameters 
> ``oozie.launcher.mapreduce.job.hdfs-servers`` and 
> ``oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude``
> - ``HDFSCredentials`` was changed to reuse 
> ``TokenCache.obtainTokensForNamenodes`` for obtaining HDFS delegation tokens 
> (see the sketch below)
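> 
> A minimal sketch of that reuse (variable names are illustrative, not the 
> exact patch):
> 
> ```
> // Copy the launcher-prefixed values to the property names TokenCache
> // understands...
> jobConf.set("mapreduce.job.hdfs-servers",
>         actionConf.get("oozie.launcher.mapreduce.job.hdfs-servers"));
> jobConf.set("mapreduce.job.hdfs-servers.token-renewal.exclude",
>         actionConf.get("oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude"));
> // ...and let it obtain a delegation token per NameNode.
> TokenCache.obtainTokensForNamenodes(credentials, new Path[0], jobConf);
> ```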
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java 
> 81e28f722d9ecd0bf972bf2d0a684d207547d165 
>   core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 
> 92a7ebe9a7876b6400d80356d5c826e77575e2ab 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 
> a1df304914b73d406e986409a8053c2a48e1bd38 
> 
> 
> Diff: https://reviews.apache.org/r/63875/diff/4/
> 
> 
> Testing
> -------
> 
> Tested on a secure cluster that the Oozie DistCp action can copy files from 
> another secure cluster that uses a different Kerberos realm.
> 
> - workflow:
> ```
> <workflow-app name="DistCp" xmlns="uri:oozie:workflow:0.5">
>     <start to="distcp-3a1f"/>
>     <kill name="Kill">
>         <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>     <action name="distcp-3a1f">
>         <distcp xmlns="uri:oozie:distcp-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>oozie.launcher.mapreduce.job.dfs.namenode.kerberos.principal.pattern</name>
>                     <value>*</value>
>                 </property>
>                 <property>
>                     <name>oozie.launcher.mapreduce.job.hdfs-servers</name>
>                     <value>hdfs://oozie.test1.com:8020,hdfs://remote.test2.com:8020</value>
>                 </property>
>                 <property>
>                     <name>oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude</name>
>                     <value>remote.test2.com</value>
>                 </property>
>             </configuration>
>             <arg>hdfs://remote.test2.com:8020/tmp/1</arg>
>             <arg>hdfs://oozie.test1.com:8020/tmp/2</arg>
>         </distcp>
>         <ok to="End"/>
>         <error to="Kill"/>
>     </action>
>     <end name="End"/>
> </workflow-app>
> ```
> 
> Prior to executing the workflow I had to set up cross-realm trust between the 
> two secure test clusters. It involved:
> - changing the Kerberos configuration ``/etc/krb5.conf`` (adding the realms 
> and setting additional properties like ``udp_preference_limit = 1``)
> - regenerating service credentials
> - changing HDFS settings (such as 
> ``dfs.namenode.kerberos.principal.pattern``) and setting the Hadoop 
> auth-to-local rule to something like ``RULE:[2:$1](.*)s/(.*)/$1/g`` (see the 
> snippet below)
> - additional configuration to enable trust between the test Hadoop clusters
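> 
> A minimal sketch of the auth-to-local part (the exact rules depend on the 
> realm names, so treat this as illustrative only):
> ```
> <!-- core-site.xml: map principals from the remote realm to local names -->
> <property>
>   <name>hadoop.security.auth_to_local</name>
>   <value>
>     RULE:[2:$1](.*)s/(.*)/$1/g
>     DEFAULT
>   </value>
> </property>
> ```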
> 
> 
> Thanks,
> 
> Attila Sasvari
> 
>
