[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373772#comment-16373772 ]

Greg Senia commented on YARN-5910:
----------------------------------

[~jlowe] I was able to resolve my issue with the following:

Kerberos distcp between secure clusters (without cross-realm authentication)

Two clusters with the realms SOURCE (PROD.HDP.EXAMPLE.COM) and DESTINATION (MODL.HDP.EXAMPLE.COM).

Data needs to move from SOURCE (hdfs://prod) to DESTINATION (hdfs://modl).

Trust exists between SOURCE (PROD.HDP.EXAMPLE.COM) and Active Directory (NT.EXAMPLE.COM), and between DESTINATION (MODL.HDP.EXAMPLE.COM) and Active Directory (NT.EXAMPLE.COM).

Both the SOURCE (prod) and DESTINATION (modl) clusters are running a Hadoop distro with the following patches: https://issues.apache.org/jira/browse/HDFS-7546 and https://issues.apache.org/jira/browse/YARN-3021

Set the mapreduce.job.hdfs-servers.token-renewal.exclude property to instruct the ResourceManagers on either cluster to skip or perform delegation token renewal for NameNode hosts, and set the dfs.namenode.kerberos.principal.pattern property to * to allow distcp irrespective of the principal patterns of the source and destination clusters.

Example of a command that works:

hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* -Dmapreduce.job.hdfs-servers.token-renewal.exclude=modl hdfs:///public_data hdfs://modl/public_data/gss_test2

> Support for multi-cluster delegation tokens
> -------------------------------------------
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: security
> Reporter: Clay B.
> Assignee: Jian He
> Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: YARN-5910.01.patch, YARN-5910.2.patch, YARN-5910.3.patch, YARN-5910.4.patch, YARN-5910.5.patch, YARN-5910.6.patch, YARN-5910.7.patch
>
>
> As an administrator running many secure (kerberized) clusters, some of which have peer clusters managed by other teams, I am looking for a way to run jobs which may require services running on other clusters. Particular cases where this rears itself are running something as core as a distcp between two kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp hdfs://LOCALCLUSTER/user/user292/test.out hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while but if the delegation token for the remote cluster needs renewal the job will fail[1]. One can pre-configure their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes available but that requires coordination that is not always feasible, especially as a cluster's peers grow into the tens of clusters or across management teams. Ideally, one could have core systems configured this way but jobs could also specify their own handling of tokens and management when needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
>
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
>     at
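The exclude property in the comment above can be pictured as a host match against each filesystem URI in the job. The following is a minimal sketch of that idea, assuming the property value is a comma-separated list of NameNode hosts or nameservice names and that matching is done on the host part of the URI. The class and method names are hypothetical and this is a simplified model, not Hadoop source code.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/** Simplified, hypothetical model of host-based token-renewal exclusion; not Hadoop source code. */
final class TokenRenewalExcludeSketch {

    /** Splits a comma-separated property value such as "modl" or "modl,unit" into a set of names. */
    static Set<String> parseExcludeList(String propertyValue) {
        Set<String> names = new HashSet<>();
        if (propertyValue == null) {
            return names;
        }
        for (String part : propertyValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** True when the host part of the URI appears in the exclude list (assumed matching rule). */
    static boolean isRenewalExcluded(URI fileSystemUri, String propertyValue) {
        // getHost() is null for URIs without an authority, e.g. hdfs:///public_data
        String host = fileSystemUri.getHost();
        return host != null && parseExcludeList(propertyValue).contains(host);
    }

    public static void main(String[] args) {
        String exclude = "modl"; // the value passed with -D in the command above
        System.out.println(isRenewalExcluded(URI.create("hdfs:///public_data"), exclude));
        System.out.println(isRenewalExcluded(URI.create("hdfs://modl/public_data/gss_test2"), exclude));
    }
}
```

Under this model only the destination path on hdfs://modl is excluded from renewal, while the source path on the default filesystem is still renewed by the local ResourceManager.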
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373672#comment-16373672 ]

Jason Lowe commented on YARN-5910:
----------------------------------

I think the problem here is that the token renewer, which is the ResourceManager, has no way to authenticate with the remote cluster from which the token was obtained. So that leaves us stuck with a token the user can get with their credentials but YARN cannot keep active because the RM's credentials are worthless to the other cluster.

The RM always attempts to renew the delegation tokens of an application that was submitted to both ensure validity and guarantee that it will be able to keep the tokens renewed while the app either waits to be scheduled or while it is running. This is to fail fast rather than have apps run for hours then have their credentials expire because the RM could not renew them.

The easiest fix is to establish valid credentials for the RM to the other cluster. If you don't want to expose the full trust between the two clusters, one workaround is to create yet another realm where the RM principals exist and there's a one-way trust from that new realm to each Hadoop cluster's realm that wishes to communicate cross-realm like this.
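The fail-fast behaviour described in this comment can be sketched as a submission-time check that tries every token once before the application is accepted. This is a minimal illustration under stated assumptions: the Renewer interface, the method names, and the service strings are hypothetical stand-ins and are not the ResourceManager's actual classes.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of "renew at submission so failures surface early"; no Hadoop types are used. */
final class FailFastRenewalSketch {

    /** Stand-in for whatever performs the renewal call against the service that issued a token. */
    interface Renewer {
        /** Returns the new expiry time, or throws if the service cannot be reached or rejects the caller. */
        long renew(String tokenService) throws IOException;
    }

    /**
     * Tries to renew each token once before the application is accepted.
     * Returns an empty Optional on success, or a rejection reason for the first token that fails.
     */
    static Optional<String> checkAtSubmission(List<String> tokenServices, Renewer renewer) {
        for (String service : tokenServices) {
            try {
                renewer.renew(service);
            } catch (IOException e) {
                return Optional.of("Failed to renew token for " + service + ": " + e.getMessage());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A renewer whose credentials are only valid on the local cluster (assumed scenario).
        Renewer localOnly = service -> {
            if (!service.equals("ha-hdfs:LOCALCLUSTER")) {
                throw new IOException("no valid credentials for " + service);
            }
            return System.currentTimeMillis() + 86_400_000L;
        };
        System.out.println(checkAtSubmission(Arrays.asList("ha-hdfs:LOCALCLUSTER"), localOnly));
        System.out.println(checkAtSubmission(
                Arrays.asList("ha-hdfs:LOCALCLUSTER", "ha-hdfs:REMOTECLUSTER"), localOnly));
    }
}
```

In this model an application carrying a token for a service the renewer cannot authenticate to is rejected immediately, instead of failing hours later when the token expires.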
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373013#comment-16373013 ]

Greg Senia commented on YARN-5910:
----------------------------------

@jlowe, @jianhe or @clayb, do you know if this Jira solves the problem I have below, where I attempt to run distcp as my user from Active Directory? In this scenario each Hadoop cluster has its own KDC that has a one-way trust to Active Directory. In the case below I attempt to run distcp as my Active Directory user and it will not run, whereas hadoop fs -cp hdfs://tech/public_data hdfs://unit/public_data/gss_test works with no problem. So I am wondering if this Jira solves this problem of a cluster attempting to get a delegation token?

Looking at @clayb's points from above, I guess my question is: does this solve #2?

I see there are two issues here I was hoping to solve:
1. A remote cluster's services are needed (e.g. as a data source to this job)
2. A remote cluster does not trust this cluster's YARN principal

Distcp between clusters that are NOT allowed to trust each other's Kerberos realms:

hadoop distcp hdfs://tech/public_data hdfs://unit/public_data/gss_test

RM log from distcp:
2018-02-22 11:11:45,677 INFO resourcemanager.ClientRMService (ClientRMService.java:submitApplication(588)) - Application with id 112 submitted by user gss2002
2018-02-22 11:11:45,677 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(170)) - USER=gss2002 IP=10.70.40.255 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1519238638021_0112 CALLERCONTEXT=CLI
2018-02-22 11:11:45,679 INFO security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleAppSubmitEvent(449)) - application_1519238638021_0112 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN token 102956 for gss2002)
2018-02-22 11:11:54,526 WARN ipc.Client (Client.java:run(717)) - Couldn't setup connection for rm/ha21t52mn.tech.hdp.example@tech.hdp.example.com to ha21d51nn.unit.hdp.example.com/10.69.81.6:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:597)
    at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:399)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:768)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:764)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1740)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:764)
    at org.apache.hadoop.ipc.Client$Connection.access$3300(Client.java:399)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1626)
    at org.apache.hadoop.ipc.Client.call(Client.java:1457)
    at org.apache.hadoop.ipc.Client.call(Client.java:1404)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy93.renewDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewDelegationToken(ClientNamenodeProtocolTranslatorPB.java:993)
    at sun.reflect.GeneratedMethodAccessor214.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy94.renewDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1141)
    at org.apache.hadoop.security.token.Token.renew(Token.java:414)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:597)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:594)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1740)
    at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:592)
    at
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832738#comment-15832738 ]

Jian He commented on YARN-5910:
-------------------------------

testFinishedAppRemovalAfterRMRestart passed locally for me.

> Support for multi-cluster delegation tokens
> -------------------------------------------
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: security
> Reporter: Clay B.
> Assignee: Jian He
> Priority: Minor
> Attachments: YARN-5910.01.patch, YARN-5910.2.patch, YARN-5910.3.patch, YARN-5910.4.patch, YARN-5910.5.patch, YARN-5910.6.patch, YARN-5910.7.patch
>
>
> As an administrator running many secure (kerberized) clusters, some of which have peer clusters managed by other teams, I am looking for a way to run jobs which may require services running on other clusters. Particular cases where this rears itself are running something as core as a distcp between two kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp hdfs://LOCALCLUSTER/user/user292/test.out hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while but if the delegation token for the remote cluster needs renewal the job will fail[1]. One can pre-configure their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes available but that requires coordination that is not always feasible, especially as a cluster's peers grow into the tens of clusters or across management teams. Ideally, one could have core systems configured this way but jobs could also specify their own handling of tokens and management when needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
>
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
>     at org.apache.hadoop.security.token.Token.renew(Token.java:377)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
>     ... 6 more
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832713#comment-15832713 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 0m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 13m 52s | trunk passed |
| +1 | compile | 10m 51s | trunk passed |
| +1 | checkstyle | 1m 56s | trunk passed |
| +1 | mvnsite | 3m 32s | trunk passed |
| +1 | mvneclipse | 2m 3s | trunk passed |
| +1 | findbugs | 6m 8s | trunk passed |
| +1 | javadoc | 2m 52s | trunk passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 55s | the patch passed |
| +1 | compile | 10m 28s | the patch passed |
| +1 | cc | 10m 28s | the patch passed |
| +1 | javac | 10m 28s | the patch passed |
| -0 | checkstyle | 2m 2s | root: The patch generated 29 new + 1445 unchanged - 10 fixed = 1474 total (was 1455) |
| +1 | mvnsite | 3m 52s | the patch passed |
| +1 | mvneclipse | 2m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | findbugs | 7m 25s | the patch passed |
| +1 | javadoc | 0m 30s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 40s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 33s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 908 unchanged - 5 fixed = 908 total (was 913) |
| +1 | javadoc | 0m 35s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 24s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | unit | 0m 40s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 41s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 41s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832277#comment-15832277 ]

Jian He commented on YARN-5910:
-------------------------------

bq. It's confusing that the max size check is using capacity() but the error message uses position().
I missed changing that.

bq. I'm curious on the reasoning for removing the assert for NEW state?
Because I feel that's obvious and not needed.

bq. TestAppManager fails consistently for me with the patch applied and passes consistently without. Please investigate.
It's because the AM ContainerLaunchContext is null in the UT, which failed with an NPE in the new code "submissionContext.getAMContainerSpec().getTokensConf()". I think it's OK to assume the AM ContainerLaunchContext is not null? As I see it, other code does the same in this call path, like "submissionContext.getAMContainerSpec().getApplicationACLs()" in RMAppManager.
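The NPE discussed in this reply comes from a chained getter call on a null intermediate object. The sketch below contrasts the chained form with a guarded form, using tiny hypothetical stand-in classes (SubmissionContext and ContainerSpec here are not the YARN types, and only the getter names are borrowed from the comment). It illustrates the trade-off being debated and is not what the patch does.

```java
import java.nio.ByteBuffer;
import java.util.Optional;

/** Hypothetical stand-ins for the submission context types named in the comment above. */
final class TokensConfAccessSketch {

    static final class ContainerSpec {
        private final ByteBuffer tokensConf;
        ContainerSpec(ByteBuffer tokensConf) { this.tokensConf = tokensConf; }
        ByteBuffer getTokensConf() { return tokensConf; }
    }

    static final class SubmissionContext {
        private final ContainerSpec amContainerSpec; // may be null, as in the failing unit test
        SubmissionContext(ContainerSpec amContainerSpec) { this.amContainerSpec = amContainerSpec; }
        ContainerSpec getAMContainerSpec() { return amContainerSpec; }
    }

    /** Chained access: throws NullPointerException when the AM container spec is null. */
    static ByteBuffer chained(SubmissionContext ctx) {
        return ctx.getAMContainerSpec().getTokensConf();
    }

    /** Guarded access: yields an empty Optional when the spec or its tokens conf is null. */
    static Optional<ByteBuffer> guarded(SubmissionContext ctx) {
        return Optional.ofNullable(ctx.getAMContainerSpec()).map(ContainerSpec::getTokensConf);
    }

    public static void main(String[] args) {
        SubmissionContext noSpec = new SubmissionContext(null);
        System.out.println(guarded(noSpec).isPresent());
        try {
            chained(noSpec);
        } catch (NullPointerException e) {
            System.out.println("chained access threw NullPointerException");
        }
    }
}
```

Assuming a non-null spec, as the reply proposes, keeps the code consistent with the existing call path; the guarded variant would only matter if callers are allowed to omit the AM container spec.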
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832014#comment-15832014 ]

Jason Lowe commented on YARN-5910:
----------------------------------

Thanks for updating the patch!

Nit: I think it should be clearer that the regex in the documentation is just an example and not the default, e.g.: "This regex" should be "For example the following regex".

DEFAULT_RM_DELEGATION_TOKEN_MAX_SIZE doesn't match yarn-default.xml.

It's confusing that the max size check uses capacity() but the error message uses position().

I'm curious about the reasoning for removing the assert for the NEW state.

I was unable to reproduce the TestRMRestart and TestMRIntermediateDataEncryption failures with the patch, but TestAppManager fails consistently for me with the patch applied and passes consistently without it. Please investigate.
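The capacity() versus position() remark above can be illustrated with a plain java.nio.ByteBuffer. This is only a sketch: the class name, the helper methods, and the 16-byte limit are hypothetical and are not taken from the patch, and the state the real buffer is in when the check runs is not known from this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BufferSizeCheckDemo {
    // Hypothetical limit for illustration; not the real default.
    static final int MAX_SIZE_BYTES = 16;

    // Check based on the size of the backing storage.
    static boolean exceedsByCapacity(ByteBuffer buf, int maxBytes) {
        return buf.capacity() > maxBytes;
    }

    // Check based on how many bytes have been written so far
    // (meaningful only while the buffer is in write mode, before flip()).
    static boolean exceedsByPosition(ByteBuffer buf, int maxBytes) {
        return buf.position() > maxBytes;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);         // capacity 64, position 0
        buf.put("k=v".getBytes(StandardCharsets.UTF_8));  // position advances to 3

        System.out.println("capacity=" + buf.capacity() + " position=" + buf.position());
        System.out.println("exceeds by capacity: " + exceedsByCapacity(buf, MAX_SIZE_BYTES));
        System.out.println("exceeds by position: " + exceedsByPosition(buf, MAX_SIZE_BYTES));
    }
}
```

In this sketch the capacity-based check trips although only 3 bytes were written, so an error message that reports position() would show a number well under the limit. Using the same accessor for both the check and the message avoids that mismatch.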
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831364#comment-15831364 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 0m 23s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 17s | trunk passed |
| +1 | compile | 12m 12s | trunk passed |
| +1 | checkstyle | 1m 54s | trunk passed |
| +1 | mvnsite | 3m 36s | trunk passed |
| +1 | mvneclipse | 1m 58s | trunk passed |
| +1 | findbugs | 6m 16s | trunk passed |
| +1 | javadoc | 2m 36s | trunk passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 54s | the patch passed |
| +1 | compile | 11m 19s | the patch passed |
| +1 | cc | 11m 19s | the patch passed |
| +1 | javac | 11m 19s | the patch passed |
| -0 | checkstyle | 1m 57s | root: The patch generated 29 new + 1445 unchanged - 10 fixed = 1474 total (was 1455) |
| +1 | mvnsite | 3m 42s | the patch passed |
| +1 | mvneclipse | 2m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | findbugs | 7m 30s | the patch passed |
| +1 | javadoc | 0m 29s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 40s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 30s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 30s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 908 unchanged - 5 fixed = 908 total (was 913) |
| +1 | javadoc | 0m 35s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 24s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | unit | 0m 37s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 42s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 39s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831365#comment-15831365 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 0m 18s | Maven dependency ordering for branch |
| +1 | mvninstall | 13m 57s | trunk passed |
| +1 | compile | 11m 55s | trunk passed |
| +1 | checkstyle | 1m 48s | trunk passed |
| +1 | mvnsite | 3m 39s | trunk passed |
| +1 | mvneclipse | 1m 57s | trunk passed |
| +1 | findbugs | 6m 23s | trunk passed |
| +1 | javadoc | 2m 36s | trunk passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 54s | the patch passed |
| +1 | compile | 11m 2s | the patch passed |
| +1 | cc | 11m 2s | the patch passed |
| +1 | javac | 11m 2s | the patch passed |
| -0 | checkstyle | 1m 57s | root: The patch generated 29 new + 1445 unchanged - 10 fixed = 1474 total (was 1455) |
| +1 | mvnsite | 3m 38s | the patch passed |
| +1 | mvneclipse | 2m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | findbugs | 7m 17s | the patch passed |
| +1 | javadoc | 0m 30s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 42s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 24s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 35s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 908 unchanged - 5 fixed = 908 total (was 913) |
| +1 | javadoc | 0m 32s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 23s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | unit | 0m 43s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 33s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 42s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831123#comment-15831123 ]

Jian He commented on YARN-5910:
-------------------------------

Thanks again for the reviews!

bq. I'd either move the regex example into the description itself
Done.

bq. I could just specify one property with a gigantic payload
Good point. I thought the number of configs indirectly reflected the size and was lazy about calculating the numbers, so I missed this scenario. I changed the check to be based on bytes.

bq. I am wondering how users/admins are going to debug their settings for the new property
Good point. It was there when I was debugging this feature. I added debug-level logging in both YarnRunner and DelegationTokenRenewer.

Uploaded a patch that addresses all comments.
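The difference between limiting the number of properties and limiting their size in bytes, as discussed in this reply, can be sketched as follows. The class, method names, and limits are hypothetical, and the byte count here is a simple sum of UTF-8 key and value lengths, which is an assumption rather than the serialization the patch actually measures.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

class ConfSizeLimitDemo {
    // Count-based check: only looks at how many properties there are.
    static boolean withinCountLimit(Map<String, String> props, int maxProps) {
        return props.size() <= maxProps;
    }

    // Sum of the UTF-8 encoded lengths of all keys and values.
    static long sizeInBytes(Map<String, String> props) {
        long total = 0;
        for (Map.Entry<String, String> e : props.entrySet()) {
            total += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // Byte-based check: bounds the memory the properties can occupy.
    static boolean withinByteLimit(Map<String, String> props, long maxBytes) {
        return sizeInBytes(props) <= maxBytes;
    }

    public static void main(String[] args) {
        // A single property with a very large value.
        char[] big = new char[10_000];
        Arrays.fill(big, 'x');
        Map<String, String> props = Collections.singletonMap("some.key", new String(big));

        System.out.println("count check passes: " + withinCountLimit(props, 10));
        System.out.println("byte check passes:  " + withinByteLimit(props, 1024));
    }
}
```

A single oversized property passes the count-based check but fails the byte-based one, which is the scenario the review comment describes.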
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830535#comment-15830535 ]

Jason Lowe commented on YARN-5910:
----------------------------------

Thanks for updating the patch!

Last I knew, the descriptions for properties in mapred-site.xml were programmatically scraped to generate documentation. Therefore the additional comment will be a bit confusing when taken out of context with the commented value. I'd either move the regex example into the description itself or move this added comment into a separate XML comment just above the commented value.

I'm surprised the max conf size is a property count rather than an overall size limit on the conf buffer being passed/persisted. After all, I could just specify one property with a gigantic payload and pass this safety check, and I thought this check was more about preventing excessive memory usage than excessive property counts.

I am wondering how users/admins are going to debug their settings for the new property. I don't see any way for them to know which properties are really getting picked up. For example, if they pick up too many properties and exceed the size limit, how can they know which extra ones they are hitting? Similarly, when token renewal fails, how can they tell what the conf used for renewal looked like? I'm wondering if we need at least a debug- or trace-level log somewhere that dumps the app-specific conf.
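The debugging concern raised above (knowing which properties a regex actually picks up) can be sketched as a small filter plus a dump of the result. Everything here is illustrative: the class and method names, the example regex, and the use of whole-name matching are assumptions, not the patch's implementation.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class AppConfFilterDemo {
    // Keep only the properties whose full name matches the regex.
    static Map<String, String> select(Map<String, String> conf, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, String> selected = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (pattern.matcher(e.getKey()).matches()) {
                selected.put(e.getKey(), e.getValue());
            }
        }
        return selected;
    }

    // Render the given properties one per line, as a debug-level log might.
    static String dump(Map<String, String> selected) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : selected.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("dfs.nameservices", "REMOTECLUSTER");
        conf.put("dfs.ha.namenodes.REMOTECLUSTER", "nn1,nn2");
        conf.put("mapreduce.job.name", "example");

        // Hypothetical example regex; not the documented default.
        String regex = "dfs\\.nameservices|dfs\\.ha\\.namenodes\\..*";
        System.out.print(dump(select(conf, regex)));
    }
}
```

Dumping the selected subset shows at a glance which properties the regex matched and makes it easier to spot an overly broad pattern. A real log statement would also need to avoid printing sensitive values.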
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829306#comment-15829306 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 1m 46s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 2s | trunk passed |
| +1 | compile | 12m 6s | trunk passed |
| +1 | checkstyle | 2m 12s | trunk passed |
| +1 | mvnsite | 4m 5s | trunk passed |
| +1 | mvneclipse | 2m 9s | trunk passed |
| +1 | findbugs | 7m 6s | trunk passed |
| +1 | javadoc | 3m 9s | trunk passed |
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 3s | the patch passed |
| +1 | compile | 12m 38s | the patch passed |
| +1 | cc | 12m 38s | the patch passed |
| +1 | javac | 12m 38s | the patch passed |
| -0 | checkstyle | 2m 13s | root: The patch generated 21 new + 1413 unchanged - 10 fixed = 1434 total (was 1423) |
| +1 | mvnsite | 4m 26s | the patch passed |
| +1 | mvneclipse | 2m 36s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | findbugs | 8m 19s | the patch passed |
| +1 | javadoc | 0m 30s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 40s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 34s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 908 unchanged - 5 fixed = 908 total (was 913) |
| +1 | javadoc | 0m 36s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 24s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | unit | 0m 40s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 45s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 50s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828844#comment-15828844 ]

Jian He commented on YARN-5910:
-------------------------------

bq. whether we may need some RM-specific configs to be able to successfully connect with kerberos. There may be some remappings that the admins only bothered to configure on the RM or are RM specific?
Sorry, I didn't follow. 'dfs.namenode.kerberos.principal' is actually an HDFS config, not an RM config. If two clusters have different DFS principal names configured, then when the MR client asks for delegation tokens from both clusters, I guess this check will fail, because it cannot differentiate between the clusters.
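The concern about 'dfs.namenode.kerberos.principal' not being differentiated per cluster can be sketched with plain maps. This is an illustration only: the class name, host names, and principal values are hypothetical, and the merge below stands in for any mechanism that holds settings for two clusters in one flat key space.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrincipalKeyCollisionDemo {
    static final String PRINCIPAL_KEY = "dfs.namenode.kerberos.principal";

    // Overlay one cluster's settings on another's in a single flat map.
    static Map<String, String> merge(Map<String, String> first, Map<String, String> second) {
        Map<String, String> merged = new LinkedHashMap<>(first);
        merged.putAll(second); // a repeated key keeps only the second value
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> local = new LinkedHashMap<>();
        // Keys that carry the nameservice name stay distinct per cluster.
        local.put("dfs.namenode.rpc-address.LOCALCLUSTER.nn1", "local-nn1.example.com:8020");
        local.put(PRINCIPAL_KEY, "nn/_HOST@LOCAL.EXAMPLE.COM");

        Map<String, String> remote = new LinkedHashMap<>();
        remote.put("dfs.namenode.rpc-address.REMOTECLUSTER.nn1", "remote-nn1.example.com:8020");
        remote.put(PRINCIPAL_KEY, "nn/_HOST@REMOTE.EXAMPLE.COM");

        Map<String, String> merged = merge(local, remote);
        System.out.println(merged.size() + " keys; principal=" + merged.get(PRINCIPAL_KEY));
    }
}
```

The two rpc-address entries coexist because their names include the nameservice, while the principal key has no such suffix and so ends up with a single value for both clusters.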
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828831#comment-15828831 ]

Jason Lowe commented on YARN-5910:
----------------------------------

bq. Regarding the if security enabled check in ClientRMService, do you also prefer removing it ?
Yes, I'd rather not fail a job that would otherwise work without this check.

bq. I have one question about this design: the dfs.namenode.kerberos.principal is not differentiated by clusterId. So it assumes all clusters will have the same value for 'dfs.namenode.kerberos.principal' ?
This applies to all other services, including the RM, as well. I'll have to defer to [~daryn]'s expertise on whether we may need some RM-specific configs to be able to successfully connect with Kerberos. There may be some remappings that the admins only bothered to configure on the RM or that are RM-specific? Not sure. It'd be nice if we didn't need the RM configs, but now I'm thinking there may be cases where we need them.

> Support for multi-cluster delegation tokens
> -------------------------------------------
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: security
> Reporter: Clay B.
> Assignee: Jian He
> Priority: Minor
> Attachments: YARN-5910.01.patch, YARN-5910.2.patch, YARN-5910.3.patch, YARN-5910.4.patch
>
> As an administrator running many secure (kerberized) clusters, some of which have peer clusters managed by other teams, I am looking for a way to run jobs which may require services running on other clusters. Particular cases where this rears itself are running something as core as a distcp between two kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp hdfs://LOCALCLUSTER/user/user292/test.out hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while but if the delegation token for the remote cluster needs renewal the job will fail[1]. One can pre-configure their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes available but that requires coordination that is not always feasible, especially as a cluster's peers grow into the tens of clusters or across management teams. Ideally, one could have core systems configured this way but jobs could also specify their own handling of tokens and management when needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
>
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
>     at org.apache.hadoop.security.token.Token.renew(Token.java:377)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828810#comment-15828810 ]

Jian He commented on YARN-5910:
-------------------------------

bq. Yeah, I'm thinking it's unnecessary to check both.
Sounds good, I'll remove the security-enabled check in YARNRunner. Regarding the security-enabled check in ClientRMService, do you also prefer removing it?

bq. Configuration.addResource will add a resource object to the list of resources for the config and never get rid of them. This will cause every app-specific conf to be tracked by renewerConf forever, resulting in a memory leak.
Ah, I see. Good point. I didn't understand your previous comment about this.

So I've done the experiment. Actually, we don't need the RM's own config for renewal. Additionally, we need to pass in dfs.namenode.kerberos.principal from the client to pass the check in SaslRpcClient#getServerPrincipal, which checks whether the remote principal equals the one in the local config. I have one question about this design: dfs.namenode.kerberos.principal is not differentiated by clusterId. So when the MR client asks for delegation tokens from both clusters, does it assume all clusters have the same value for 'dfs.namenode.kerberos.principal'?

So I can just use appConfig in DelegationTokenRenewer. I'll also add the config limit in the RM.
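The RM-side config limit mentioned at the end of this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `TokenConfLimit`, the method names, and the byte-count heuristic are hypothetical and are not taken from the patch, and plain maps stand in for Hadoop's `Configuration`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper: rejects an app-supplied token conf that is too large. */
final class TokenConfLimit {

    /** Approximates the payload size as the UTF-8 length of all keys and values. */
    static long approximateSize(Map<String, String> conf) {
        long total = 0;
        for (Map.Entry<String, String> e : conf.entrySet()) {
            total += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Throws if the conf exceeds maxBytes, so the caller can discard it before persisting. */
    static void checkWithinLimit(Map<String, String> conf, long maxBytes) {
        long size = approximateSize(conf);
        if (size > maxBytes) {
            throw new IllegalArgumentException(
                "Token conf is " + size + " bytes, limit is " + maxBytes);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of("dfs.nameservices", "REMOTECLUSTER");
        checkWithinLimit(conf, 1024); // small conf, accepted
        System.out.println("approximate size: " + approximateSize(conf));
    }
}
```

A check like this only bounds what the RM keeps; as noted elsewhere in the thread, it does not prevent the oversized RPC from arriving in the first place.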
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828772#comment-15828772 ]

Jason Lowe commented on YARN-5910:
----------------------------------

bq. Currently, the RM DelegationTokenRenewer will only add the tokens if security is enabled (code in RMAppManager#submitApplication), so I think with this existing implementation, we can assume this feature is for security enabled only ?
Yeah, I'm thinking it's unnecessary to check both. This new config has no value by default. A user or admin would have to go out of their way to set it. If they did, then they expect the confs to be in the submission context.

bq. On the other hand, Varun chatted offline that we can add a limit config in RM to limit the size of configs, your opinion ?
An RM-side limit for configs may not be a bad idea to avoid a problematic client or user that sets ".*" as the conf filter. ;-) It won't solve the problem of the gigantic RPC trying to come in, but at least the RM can quickly discard it before trying to persist it.

bq. So, in the latest patch I changed it to let all apps share the same renewerConf - this is based on the assumption that "dfs.nameservices" must have distinct keys for each distinct cluster, so we won't have situation where two apps use different configs for the same cluster - it is true that unnecessary configs used by 1st app will be shared by subsequent apps.
This is bad for two reasons. One is the polluting problem -- there could be cases where the presence of a config key from a previous app is problematic for renewal in a subsequent app. The other is a memory leak. Configuration.addResource will add a resource object to the list of resources for the config and never get rid of them. This will cause every app-specific conf to be tracked by renewerConf forever, resulting in a memory leak.

One solution to this is to track the (partial) app configurations separately and then make a copy of the RM's conf and merge in the partial app conf "on-demand" when it's time to renew the token for the app. Then we're not storing a full copy of the RM's configs for every app, just the parts that need to be per-app.

If doing the repetitive copy and merge of the conf is too expensive then we can derive a Configuration subclass that takes the app conf and RM conf in the constructor. When we try to do property lookups it tries to find it in the app conf and falls back to the RM conf if necessary. Then we don't have to make copies and merge each time.
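The lookup-with-fallback idea from the last paragraph of this comment can be sketched as follows. This is a minimal illustration, not the patch: `LayeredConf` is a hypothetical name, and plain `Map<String, String>` objects stand in for `org.apache.hadoop.conf.Configuration`, so the real subclass would have to override the relevant Configuration lookup methods instead.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for the proposed Configuration subclass: a read-only
 * view that consults the per-app settings first and the RM-wide settings second.
 */
final class LayeredConf {
    private final Map<String, String> appConf; // small, per-application
    private final Map<String, String> rmConf;  // shared, never modified here

    LayeredConf(Map<String, String> appConf, Map<String, String> rmConf) {
        this.appConf = appConf;
        this.rmConf = rmConf;
    }

    /** Returns the app-level value if present, else the RM value, else null. */
    String get(String key) {
        String value = appConf.get(key);
        return value != null ? value : rmConf.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> rm = new HashMap<>();
        rm.put("dfs.nameservices", "LOCALCLUSTER");

        Map<String, String> appA = new HashMap<>();
        appA.put("dfs.nameservices", "LOCALCLUSTER,REMOTECLUSTER");

        LayeredConf confA = new LayeredConf(appA, rm);
        LayeredConf confB = new LayeredConf(new HashMap<>(), rm);

        System.out.println(confA.get("dfs.nameservices")); // app A's override
        System.out.println(confB.get("dfs.nameservices")); // falls back to the RM value
    }
}
```

Because nothing is ever written into the shared RM map, one app's keys cannot leak into another app's renewal, and each view can be dropped along with its app.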
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828654#comment-15828654 ]

Jian He commented on YARN-5910:
-------------------------------

Hi Jason, thank you very much for the review!

bq. It's confusing to see a MR_JOB_SEND_TOKEN_CONF_DEFAULT in MRJobConfig yet it clearly is not the default value.
Removed it.

bq. Should this feature be tied to UserGroupInformation.isSecurityEnabled? I'm wondering if this can cause issues where the current cluster isn't secure but the RM needs to renew the job's tokens for a remote secure cluster or some other secure service. Seems like if this conf is set then that's all we need to know.
Currently, the RM DelegationTokenRenewer will only add the tokens if security is enabled (code in RMAppManager#submitApplication), so I think with this existing implementation, we can assume this feature is for security enabled only?

bq. Similarly the code explicitly fails in ClientRMService if the conf is there when security is disabled which seems like we're taking a case that isn't optimal but should work benignly and explicitly making sure it fails. Not sure that's user friendly behavior.
My intention was to prevent users from sending the conf in non-secure mode (which is not needed anyway if my reply above holds), in case the conf is huge and increases the load on the RM. On the other hand, Varun chatted offline that we can add a limit config in the RM to limit the size of configs. Your opinion?

bq. Nit: For the ByteBuffer usage in parseCredentials and parseTokensConf, the rewind method calls seem unnecessary since we're throwing the buffers away immediately afterwards.
Actually, the ByteBuffer is a direct reference from the containerLaunchContext, not a copy. I think the rewind is also required because it was added specifically to solve the issues in YARN-2893.

bq. Should the Configuration constructor call in parseTokensConf be using the version that does not load defaults? If not then I recommend we at least allow a conf to be passed in to use as a copy constructor. Loading a new Configuration from scratch is really expensive and we should avoid it if possible. See the discussion on HADOOP-11223 for details.
Good point. I actually did the same in the YARNRunner#setAppConf method, but missed this place.

bq. In DelegationTokenRenewer, why aren't we using the appConf as-is when renewing the tokens?
I wasn't sure whether the appConf alone is enough for the connection (are any Kerberos-related configs of the RM itself required for authentication?). Let me do some experiments; if this works, I'll just use appConf.

bq. Also it looks like we're polluting subsequent app-conf renewals with prior app configurations, as well as simply leaking appConf objects as renewerConf resources ad infinitum. I don't see where renewerConf gets reset in-between.
My previous patch made a copy of each appConf, merged it with the RM's conf (because I wasn't sure whether the RM's own conf is required) and used that for the renewer. But then I thought this may be bad because every app will have its own copy of the configs, which may greatly increase memory usage if the number of apps is very large. So, in the latest patch I changed it to let all apps share the same renewerConf - this is based on the assumption that "dfs.nameservices" must have distinct keys for each distinct cluster, so we won't have a situation where two apps use different configs for the same cluster - it is true that unnecessary configs used by the 1st app will be shared by subsequent apps.

bq. Arguably there should be a unit test that verifies a first app with token conf key A and a second app with token conf key B doesn't leave a situation where the renewals of the second app are polluted with conf key A.
If the appConf alone works, we should be fine.

bq. Speaking of unit tests, I see where we fixed up the YARN unit tests to pass the new conf but not a new test that verifies the specified conf is used appropriately when renewing for that app and not for other apps that didn't specify a conf.
Yep, I'll add the UT.
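The ByteBuffer point in this reply comes down to how `java.nio.ByteBuffer` tracks its read position: reading from a buffer that is shared by reference leaves it drained for the next reader unless it is rewound. A small sketch using only the JDK (the class and method names are illustrative and not from the patch):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrates why a shared ByteBuffer needs rewind() after being read. */
final class SharedBufferRead {

    /** Reads all remaining bytes, advancing the buffer's position to its limit. */
    static String drain(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    /** Reads all remaining bytes, then resets the position to 0 for the next reader. */
    static String drainAndRewind(ByteBuffer buf) {
        String s = drain(buf);
        buf.rewind();
        return s;
    }

    public static void main(String[] args) {
        // Stand-in for the token bytes held by a launch context; not real token data.
        ByteBuffer tokens = ByteBuffer.wrap("token-bytes".getBytes(StandardCharsets.UTF_8));

        drain(tokens);
        System.out.println("after drain: " + tokens.remaining() + " bytes left");

        tokens.rewind();
        drainAndRewind(tokens);
        System.out.println("after drainAndRewind: " + tokens.remaining() + " bytes left");
    }
}
```

An alternative that avoids touching the shared position at all is to read from `buf.duplicate()`, which shares the content but has its own position and limit.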
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828243#comment-15828243 ]

Jason Lowe commented on YARN-5910:
----------------------------------

Thanks for updating the patch!

It's confusing to see a MR_JOB_SEND_TOKEN_CONF_DEFAULT in MRJobConfig yet it clearly is not the default value.

Should this feature be tied to UserGroupInformation.isSecurityEnabled? I'm wondering if this can cause issues where the current cluster isn't secure but the RM needs to renew the job's tokens for a remote secure cluster or some other secure service. Seems like if this conf is set then that's all we need to know.

Similarly the code explicitly fails in ClientRMService if the conf is there when security is disabled, which seems like we're taking a case that isn't optimal but should work benignly and explicitly making sure it fails. Not sure that's user-friendly behavior.

Nit: For the ByteBuffer usage in parseCredentials and parseTokensConf, the rewind method calls seem unnecessary since we're throwing the buffers away immediately afterwards.

Should the Configuration constructor call in parseTokensConf be using the version that does not load defaults? If not then I recommend we at least allow a conf to be passed in to use as a copy constructor. Loading a new Configuration from scratch is really expensive and we should avoid it if possible. See the discussion on HADOOP-11223 for details.

In DelegationTokenRenewer, why aren't we using the appConf as-is when renewing the tokens? Also it looks like we're polluting subsequent app-conf renewals with prior app configurations, as well as simply leaking appConf objects as renewerConf resources ad infinitum. I don't see where renewerConf gets reset in-between.

Arguably there should be a unit test that verifies a first app with token conf key A and a second app with token conf key B doesn't leave a situation where the renewals of the second app are polluted with conf key A.

Speaking of unit tests, I see where we fixed up the YARN unit tests to pass the new conf but not a new test that verifies the specified conf is used appropriately when renewing for that app and not for other apps that didn't specify a conf.
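The pollution scenario that the suggested unit test would guard against can be modelled without any Hadoop classes. The sketch below is a toy model only: `RenewerConfPollution` and its methods are hypothetical, and a `HashMap` stands in for the shared renewerConf, so it shows the shape of the check rather than the real DelegationTokenRenewer test.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a renewer that either shares one conf across apps or isolates them. */
final class RenewerConfPollution {
    // Shared across all apps: each submission merges its keys in and nothing is removed.
    private final Map<String, String> sharedRenewerConf = new HashMap<>();

    /** Shared approach: returns the conf that a renewal for this app would see. */
    Map<String, String> submitShared(Map<String, String> appConf) {
        sharedRenewerConf.putAll(appConf);
        return sharedRenewerConf;
    }

    /** Isolated approach: the renewal sees only a private copy of the app's own conf. */
    static Map<String, String> submitIsolated(Map<String, String> appConf) {
        return new HashMap<>(appConf);
    }

    public static void main(String[] args) {
        Map<String, String> appA = Map.of("key.A", "a");
        Map<String, String> appB = Map.of("key.B", "b");

        RenewerConfPollution renewer = new RenewerConfPollution();
        renewer.submitShared(appA);
        Map<String, String> seenByB = renewer.submitShared(appB);
        System.out.println("shared conf leaks key.A into app B: " + seenByB.containsKey("key.A"));

        Map<String, String> isolatedB = submitIsolated(appB);
        System.out.println("isolated conf leaks key.A into app B: " + isolatedB.containsKey("key.A"));
    }
}
```

In the shared model the map also grows with every submission and never shrinks, which mirrors the leak concern raised in the review.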
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827358#comment-15827358 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 13m 32s | trunk passed |
| +1 | compile | 10m 5s | trunk passed |
| +1 | checkstyle | 1m 45s | trunk passed |
| +1 | mvnsite | 3m 20s | trunk passed |
| +1 | mvneclipse | 1m 59s | trunk passed |
| +1 | findbugs | 5m 38s | trunk passed |
| +1 | javadoc | 2m 40s | trunk passed |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 42s | the patch passed |
| +1 | compile | 9m 44s | the patch passed |
| +1 | cc | 9m 44s | the patch passed |
| +1 | javac | 9m 44s | the patch passed |
| -0 | checkstyle | 2m 19s | root: The patch generated 18 new + 1022 unchanged - 8 fixed = 1040 total (was 1030) |
| +1 | mvnsite | 3m 44s | the patch passed |
| +1 | mvneclipse | 2m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 6m 46s | the patch passed |
| +1 | javadoc | 0m 27s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 40s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 26s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 32s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 908 unchanged - 5 fixed = 908 total (was 913) |
| +1 | javadoc | 0m 32s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 23s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | unit | 0m 36s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 31s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 38s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825210#comment-15825210 ]

Hadoop QA commented on YARN-5910:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 1m 53s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 6s | trunk passed |
| +1 | compile | 12m 26s | trunk passed |
| +1 | checkstyle | 2m 0s | trunk passed |
| +1 | mvnsite | 4m 20s | trunk passed |
| +1 | mvneclipse | 2m 20s | trunk passed |
| +1 | findbugs | 6m 45s | trunk passed |
| +1 | javadoc | 3m 6s | trunk passed |
| 0 | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 36s | the patch passed |
| +1 | compile | 12m 54s | the patch passed |
| +1 | cc | 12m 54s | the patch passed |
| +1 | javac | 12m 54s | the patch passed |
| -0 | checkstyle | 2m 23s | root: The patch generated 16 new + 1022 unchanged - 8 fixed = 1038 total (was 1030) |
| +1 | mvnsite | 4m 36s | the patch passed |
| +1 | mvneclipse | 2m 42s | the patch passed |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | findbugs | 8m 55s | the patch passed |
| -1 | javadoc | 0m 32s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 4 new + 913 unchanged - 0 fixed = 917 total (was 913) |
| +1 | unit | 0m 43s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 54s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 42s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit | 44m 45s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 3m 33s | hadoop-mapreduce-client-core in the patch passed. |
| -1 | unit | 108m 23s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 274m 17s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAppManager |
|
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765586#comment-15765586 ]

Hadoop QA commented on YARN-5910:
-

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 20s | trunk passed |
| +1 | compile | 10m 7s | trunk passed |
| +1 | checkstyle | 1m 53s | trunk passed |
| +1 | mvnsite | 3m 31s | trunk passed |
| +1 | mvneclipse | 2m 7s | trunk passed |
| +1 | findbugs | 5m 26s | trunk passed |
| +1 | javadoc | 2m 52s | trunk passed |
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 50s | the patch passed |
| +1 | compile | 9m 47s | the patch passed |
| +1 | cc | 9m 47s | the patch passed |
| +1 | javac | 9m 47s | the patch passed |
| -0 | checkstyle | 1m 50s | root: The patch generated 14 new + 949 unchanged - 8 fixed = 963 total (was 957) |
| +1 | mvnsite | 3m 50s | the patch passed |
| +1 | mvneclipse | 2m 25s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 7m 6s | the patch passed |
| -1 | javadoc | 0m 32s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 4 new + 913 unchanged - 0 fixed = 917 total (was 913) |
| +1 | unit | 0m 36s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 29s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 0m 44s | hadoop-yarn-server-common in the patch passed. |
| -1 | unit | 40m 33s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 2m 58s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | unit | 106m 19s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | asflicense | 0m 43s | The patch does not generate ASF License warnings. |
| | | 252m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAppManager |
| | hadoop.yarn.server.resourcemanager.logaggregationstatus.TestRMAppLogAggregationStatus |
| |
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764997#comment-15764997 ]

Jian He commented on YARN-5910:
---

Hi Clay, thanks for the feedback.

bq. we could also perhaps extend the various delegation token types to only optionally include this payload? Then we the RM would only pay the price when needed for an off-cluster request?

We realized that changing existing token structure might have issues regarding compatibility.

> Support for multi-cluster delegation tokens
> ---
>
> Key: YARN-5910
> URL: https://issues.apache.org/jira/browse/YARN-5910
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: security
> Reporter: Clay B.
> Priority: Minor
>
> As an administrator running many secure (kerberized) clusters, some which
> have peer clusters managed by other teams, I am looking for a way to run jobs
> which may require services running on other clusters. Particular cases where
> this rears itself are running something as core as a distcp between two
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp
> hdfs://LOCALCLUSTER/user/user292/test.out
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, once can run for a while but if the delegation token for
> the remote cluster needs renewal the job will fail[1]. One can pre-configure
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes
> available but that requires coordination that is not always feasible,
> especially as a cluster's peers grow into the tens of clusters or across
> management teams. Ideally, one could have core systems configured this way
> but jobs could also specify their own handling of tokens and management when
> needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
>
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
>     at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
>     at org.apache.hadoop.security.token.Token.renew(Token.java:377)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
>     at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
>     ... 6 more
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736764#comment-15736764 ]

Clay B. commented on YARN-5910:
---

This conversation has been very educational for me; thank you! I am still concerned that if we do not use Kerberos, the requesting user will have no way to renew tokens as themselves. If we cannot authenticate as the user, won't we be unable to work when the administrators of the two clusters are different (and thus do not have the same {{yarn}} user set up, e.g. two different principals in Kerberos)? Can we find a solution to that issue here as well (or ensure that this issue doesn't preclude solving that one)?

I really like the idea that the client (the human client) is responsible for specifying the resources needed. Again, in a highly federated Hadoop environment, one administration group may not even know of all clusters, and this allows for more agile cross-cluster usage.

I see two issues here that I was hoping to solve:
1. A remote cluster's services are needed (e.g. as a data source for this job)
2. A remote cluster does not trust this cluster's YARN principal

[~jlowe] brings up some good questions and points which address this well:

{quote}I'm not sure distributing the keytab is going to be considered a reasonable thing to do in some setups. Part of the point of getting a token is to avoid needing to ship a keytab everywhere. Once we have a keytab, is there a need to have a token?{quote}

If the YARN principals of each cluster are different but the user is entitled to services on both clusters, is there another way around this issue? Further, while I think many shops may have the Kerberos tooling to avoid shipping keytabs, some shops are heavily HBase dependent (e.g. long-running query services) or streaming centric (jobs last longer than the maximal token refresh period) and thus have to use keytabs today.

{quote}There's also the problem of needing to renew the token while the AM is waiting to get scheduled if the cluster is really busy. If the AM isn't running it can't renew the token.{quote}

I would expect the remote-cluster resources to not be central to operating the job. E.g. we would use the local cluster for HDFS and YARN but might want to access a remote cluster's YARN. If the AM can request tokens (i.e. with a keytab or a proxy Kerberos credential which was refreshed by the RM), then we can request new tokens when the job is scheduled if it was hung up longer than the renewal time. Further, we would not need to worry about exploits of custom configuration running in a privileged process, only about something running as a user. Regardless, are there many clusters folks see today where the scheduling time is longer than the renewal time of a delegation token? (I.e. that would be, by default, one seventh of the total job's maximal runtime, so longer than a day?)

{quote}My preference is to have the token be as self-descriptive as we can possibly get. Doing the ApplicationSubmissionContext thing could work for the HA case, but I could see this being a potentially non-trivial payload the RM has to bear for each app (configs can get quite large). I'd rather avoid adding that to the context for this purpose if we can do so, but if the token cannot be self-descriptive in all cases then we may not have much other choice that I can see.{quote}

I agree this seems to be the sanest idea for how to get the configuration in. Could we also perhaps extend the various delegation token types to only optionally include this payload? Then the RM would only pay the price when needed for an off-cluster request.
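The timing question Clay raises (renewal time being one seventh of the maximal lifetime, i.e. a day) can be checked with a little arithmetic. The sketch below assumes the stock HDFS defaults for `dfs.namenode.delegation.token.renew-interval` (86400000 ms) and `dfs.namenode.delegation.token.max-lifetime` (604800000 ms); a cluster may override either. The class and method names are hypothetical, and `lapsesWhileQueued` assumes nothing renews the token while the application waits.

```java
import java.time.Duration;

/** Illustrative arithmetic only; not part of Hadoop. */
final class TokenTimingSketch {
    // Assumed stock HDFS defaults (a cluster may override them):
    // dfs.namenode.delegation.token.renew-interval = 86400000 ms
    // dfs.namenode.delegation.token.max-lifetime   = 604800000 ms
    static final Duration RENEW_INTERVAL = Duration.ofMillis(86_400_000L);
    static final Duration MAX_LIFETIME = Duration.ofMillis(604_800_000L);

    /** Number of renew intervals that fit into the maximum token lifetime. */
    static long renewalsPerLifetime() {
        return MAX_LIFETIME.toMillis() / RENEW_INTERVAL.toMillis();
    }

    /**
     * True if a token obtained at submission time would pass its renew
     * interval before the AM starts, assuming nobody renews it meanwhile.
     */
    static boolean lapsesWhileQueued(Duration schedulingDelay) {
        return schedulingDelay.compareTo(RENEW_INTERVAL) > 0;
    }

    public static void main(String[] args) {
        System.out.println("renew interval (hours): " + RENEW_INTERVAL.toHours());
        System.out.println("intervals per lifetime: " + renewalsPerLifetime());
        System.out.println("lapses after 2h queued:  " + lapsesWhileQueued(Duration.ofHours(2)));
        System.out.println("lapses after 30h queued: " + lapsesWhileQueued(Duration.ofHours(30)));
    }
}
```

Under these assumed defaults, only an application that waits more than 24 hours to be scheduled would hit the case Jason describes.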
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726204#comment-15726204 ]

Allen Wittenauer commented on YARN-5910:

bq. How to maintain such an unknown list is a non-trivial task in the first place.

Yup... and you haven't even gotten to the part where you try to use the service for your application. This is why DNS support would be extremely useful here: ask it where uri://haservice is located, then query the host responding for that service for the details.

In any case, this isn't a YARN problem. This is a HADOOP problem.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723829#comment-15723829 ]

Jian He commented on YARN-5910:
---

bq. ideally, the token should be self-sufficient to discover the renewer address.

After digging into the code more for this approach: even in non-HA mode, conf is required for things like retry settings, and the principal name is required for a secure setup. Basically, the token has to selectively carry all the necessary conf for connecting to the renewer in HA, non-HA, and secure scenarios. How to maintain such an unknown list is a non-trivial task in the first place. I now prefer the approach of passing the conf via appSubmissionContext.
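To make the "necessary conf" concrete, here is a rough sketch of the client-side HDFS properties an RM would need before it could resolve a logical URI such as hdfs://REMOTECLUSTER, which is what the "failover proxy provider" error in the issue's stack trace complains about. The property names are standard HDFS HA client settings; the helper class, the host names, and the principal are hypothetical, and retry settings are deliberately left out. This is not the mechanism the patch implements, only an illustration of the payload being discussed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only; builds plain key/value pairs, no Hadoop dependency. */
final class RemoteNameserviceConfSketch {
    /**
     * Collects the properties a client needs to reach one HA nameservice.
     * nnHosts maps a NameNode id (e.g. "nn1") to its RPC "host:port".
     * Note: in a real hdfs-site.xml, dfs.nameservices is a comma-separated
     * list that would also include the local nameservice.
     */
    static Map<String, String> haClientConf(String nameservice,
                                            Map<String, String> nnHosts,
                                            String nnPrincipal) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("dfs.nameservices", nameservice);
        conf.put("dfs.ha.namenodes." + nameservice,
                String.join(",", nnHosts.keySet()));
        for (Map.Entry<String, String> nn : nnHosts.entrySet()) {
            conf.put("dfs.namenode.rpc-address." + nameservice + "." + nn.getKey(),
                    nn.getValue());
        }
        conf.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        // Needed in secure mode so the client knows which principal to expect.
        conf.put("dfs.namenode.kerberos.principal", nnPrincipal);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> hosts = new LinkedHashMap<>();
        hosts.put("nn1", "nn1.remote.example.com:8020"); // hypothetical hosts
        hosts.put("nn2", "nn2.remote.example.com:8020");
        haClientConf("REMOTECLUSTER", hosts, "nn/_HOST@REMOTE.EXAMPLE.COM")
                .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Even this minimal set is six properties for a two-NameNode nameservice, which gives a feel for why a fully self-descriptive token is hard to maintain and why the per-application payload size came up in the discussion.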
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713444#comment-15713444 ]

Jian He commented on YARN-5910:
---

Thanks for your input, Jason.

bq. Once we have a keytab, is there a need to have a token?

The map/reduce tasks can continue to use the token.

bq. My preference is to have the token be as self-descriptive as we can possibly get.

I agree this sounds like a better approach, but it requires a lot of work in HDFS.

bq. but I could see this being a potentially non-trivial payload the RM has to bear for each app

In this case, we can set the conf object to null once the RM gets what it wants.

I'll talk with some HDFS folks to see whether this is doable on their side. Otherwise, I think passing a conf object and then voiding it might be a straightforward approach at this point. Waiting for [~daryn]'s input also.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713359#comment-15713359 ]

Jason Lowe commented on YARN-5910:
----------------------------------

Pinging [~daryn] since I'm sure he has an opinion on this.

I'm not sure distributing the keytab is going to be considered a reasonable thing to do in some setups. Part of the point of getting a token is to avoid needing to ship a keytab everywhere. Once we have a keytab, is there a need to have a token? There's also the problem of needing to renew the token while the AM is waiting to get scheduled if the cluster is really busy. If the AM isn't running, it can't renew the token.

My preference is to have the token be as self-descriptive as we can possibly get. Doing the ApplicationSubmissionContext thing could work for the HA case, but I could see this being a potentially non-trivial payload the RM has to bear for each app (configs can get quite large). I'd rather avoid adding that to the context for this purpose if we can do so, but if the token cannot be self-descriptive in all cases then we may not have much other choice that I can see.

> Support for multi-cluster delegation tokens
> -------------------------------------------
>
>                 Key: YARN-5910
>                 URL: https://issues.apache.org/jira/browse/YARN-5910
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: security
>            Reporter: Clay B.
>            Priority: Minor
>
> As an administrator running many secure (kerberized) clusters, some of which
> have peer clusters managed by other teams, I am looking for a way to run jobs
> which may require services running on other clusters. Particular cases where
> this rears itself are running something as core as a distcp between two
> kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp
> hdfs://LOCALCLUSTER/user/user292/test.out
> hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while, but if the delegation token for
> the remote cluster needs renewal the job will fail[1]. One can pre-configure
> their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes
> available, but that requires coordination that is not always feasible,
> especially as a cluster's peers grow into the tens of clusters or across
> management teams. Ideally, one could have core systems configured this way
> but jobs could also specify their own handling of tokens and management when
> needed?
> [1]: Example stack trace when the RM is unaware of a remote service:
>
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
>   at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
>   at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
>   at org.apache.hadoop.security.token.Token.renew(Token.java:377)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
>   ... 6 more
> {code}
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703305#comment-15703305 ]

Jian He commented on YARN-5910:
-------------------------------

To summarize: ideally, the token should be self-sufficient for discovering the renewer address. That is not the case when HDFS is in HA mode, which uses a logical URI as the token service name, so the RM has to rely on its local HDFS config to discover the renewer address. To let the RM not depend on the local HDFS config, below are the possible approaches I can think of:
- 1) Change the way the HDFS token is constructed in HA mode so that it is self-sufficient: instead of a logical URI, probably use a comma-separated list of real addresses, and change the DFS client HA implementation all the way down so that it does not rely on configuration. I guess this is too big a change for HDFS to accept.
- 2) Push the token renewal responsibility to the AM itself. That is, we distribute the Kerberos keytab along with the AM and let the AM itself renew the token periodically, instead of the RM doing the renewal. We would probably write a library for this so that each AM does not have to write its own.
- 3) Have ApplicationSubmissionContext carry an app configuration object; the RM uses this configuration object for token renewal instead of its local config.

[~jlowe], would you mind sharing some thoughts on this?
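
Approach 3 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the YARN API or the eventual patch: the class RenewalConfigSketch and its methods are hypothetical, plain maps stand in for Hadoop Configuration objects, and letting app-supplied values override the RM's local values is an assumption made for the example. The only real names used are the HDFS client key prefix dfs.client.failover.proxy.provider.<nameservice> and the ConfiguredFailoverProxyProvider class.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of approach 3; plain maps stand in for Hadoop Configuration. */
class RenewalConfigSketch {

    /** Returns a copy of the RM's local config with the app-supplied entries overlaid. */
    static Map<String, String> effectiveConf(Map<String, String> rmLocal,
                                             Map<String, String> appSupplied) {
        Map<String, String> merged = new HashMap<>(rmLocal);
        // Assumption for this sketch: app-supplied values take precedence.
        merged.putAll(appSupplied);
        return merged;
    }

    /** True if the config names a failover proxy provider for the given nameservice. */
    static boolean canResolve(String nameservice, Map<String, String> conf) {
        return conf.containsKey("dfs.client.failover.proxy.provider." + nameservice);
    }

    public static void main(String[] args) {
        String provider =
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider";

        // The RM only knows about its own cluster.
        Map<String, String> rmLocal = new HashMap<>();
        rmLocal.put("dfs.client.failover.proxy.provider.LOCALCLUSTER", provider);

        // The job ships the settings for the remote cluster with its submission.
        Map<String, String> appConf = new HashMap<>();
        appConf.put("dfs.client.failover.proxy.provider.REMOTECLUSTER", provider);

        System.out.println("RM local config only: "
            + canResolve("REMOTECLUSTER", rmLocal));
        System.out.println("With app-supplied config: "
            + canResolve("REMOTECLUSTER", effectiveConf(rmLocal, appConf)));
    }
}
```

The merge works on a copy, so the RM's own config is never modified by a submission; each application would get its own effective view, which is also where the per-app payload cost of this approach would come from.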
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700365#comment-15700365 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

bq. Not fully getting your point.

Yup. I'm aware of that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700221#comment-15700221 ]

Jian He commented on YARN-5910:
-------------------------------

bq. dtutil alias might even fix at least token renewal for the HA case. However..

Not fully getting your point. After all, you are saying the "dtutil util/alias" functionality cannot solve the HA case, right? If so, we need a different approach for this problem.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700161#comment-15700161 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

dtutil solves it quite well for the non-HA case. dtutil alias might even fix at least token renewal for the HA case.

However: putting the renewer info in the token would also only get you so far, since that configuration information would need to get propagated into other configs. It also makes the assumption that the renewer is the same as the service provider, which isn't necessarily true (with the reverse case demonstrated by the HA situation).

But really, without putting in DNS resolution for the service name, Hadoop's HA implementation is flawed and this is just a symptom.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700081#comment-15700081 ]

Jian He commented on YARN-5910:
-------------------------------

Hence, "hadoop dtutil" cannot solve this problem, then? We need a different solution here.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15697274#comment-15697274 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

Well, that's just an extension of the already known design flaws in Hadoop's default HA implementations. It's only HA if you are "inside the bubble". Lots of other things are going to break too. Tokens are just one of them.
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688768#comment-15688768 ] Jian He commented on YARN-5910: --- bq. The service field is effectively the URL to use to renew and kind specifically tells what to ask that URL. This is not true if HDFS is configured in HA mode. In case of HDFS HA, the token service is only the name service ID, RM has to rely on local hdfs config to map the name service ID to the real address, which I think is what this jira is talking about. > Support for multi-cluster delegation tokens > --- > > Key: YARN-5910 > URL: https://issues.apache.org/jira/browse/YARN-5910 > Project: Hadoop YARN > Issue Type: New Feature > Components: security >Reporter: Clay B. >Priority: Minor > > As an administrator running many secure (kerberized) clusters, some which > have peer clusters managed by other teams, I am looking for a way to run jobs > which may require services running on other clusters. Particular cases where > this rears itself are running something as core as a distcp between two > kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp > hdfs://LOCALCLUSTER/user/user292/test.out > hdfs://REMOTECLUSTER/user/user292/test.out.result}}). > Thanks to YARN-3021, once can run for a while but if the delegation token for > the remote cluster needs renewal the job will fail[1]. One can pre-configure > their {{hdfs-site.xml}} loaded by the YARN RM to know of all possible HDFSes > available but that requires coordination that is not always feasible, > especially as a cluster's peers grow into the tens of clusters or across > management teams. Ideally, one could have core systems configured this way > but jobs could also specify their own handling of tokens and management when > needed? 
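The HA point above can be illustrated with a small model. This is only a sketch of the lookup a renewer has to perform, not Hadoop's actual code: the function name `renewal_targets` and the host names are hypothetical, and a plain dict stands in for the configuration the RM has loaded. The property keys are the usual HDFS HA client keys.

```python
HA_PREFIX = "ha-hdfs:"


def renewal_targets(service, conf):
    """Return the NameNode RPC addresses a renewer could try for a token.

    `service` is the token's service field; `conf` is a dict standing in
    for the configuration visible to the renewer (the RM in this thread).
    """
    if not service.startswith(HA_PREFIX):
        # Non-HA token: the service field already names a host:port.
        return [service]

    nameservice = service[len(HA_PREFIX):]
    provider_key = f"dfs.client.failover.proxy.provider.{nameservice}"
    if provider_key not in conf:
        # Same situation as the stack trace quoted in the issue.
        raise LookupError(
            f"Unable to map logical nameservice URI 'hdfs://{nameservice}' "
            "to a NameNode: no failover proxy provider configured locally"
        )

    namenode_ids = conf[f"dfs.ha.namenodes.{nameservice}"].split(",")
    return [
        conf[f"dfs.namenode.rpc-address.{nameservice}.{nn.strip()}"]
        for nn in namenode_ids
    ]


# Hypothetical host names, used for illustration only.
remote_conf = {
    "dfs.nameservices": "REMOTECLUSTER",
    "dfs.ha.namenodes.REMOTECLUSTER": "nn1,nn2",
    "dfs.namenode.rpc-address.REMOTECLUSTER.nn1": "nn1.remote.example.com:8020",
    "dfs.namenode.rpc-address.REMOTECLUSTER.nn2": "nn2.remote.example.com:8020",
    "dfs.client.failover.proxy.provider.REMOTECLUSTER":
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
}
```

With an empty dict in place of `remote_conf`, the `ha-hdfs:REMOTECLUSTER` lookup fails in the same way as the RM in the reported stack trace, while a plain host:port service needs no configuration at all.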
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688642#comment-15688642 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

bq. RM cannot renew this token because it does not have the necessary hdfs config.

The RM should be able to effectively rebuild the necessary config for any cluster service that it knows about in order to attempt to renew it. The service field is effectively the URL to use to renew, and kind specifically tells what to ask that URL. Anything extra would need to be provided the same way that dtutil gets it (via a class definition).

bq. So, I think the "hadoop dtutil" functionality is orthogonal to this problem ?

No, it's not. Given a submission with a file that contains multiple tokens, it eliminates the need to configure the RM with multiple HDFS configurations in the site.xml files. It allows jobs to provide tokens for unconfigured services along with the information necessary to renew them.
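The idea that a job could supply "necessary info to renew" alongside its tokens can be sketched as a per-application configuration overlay. This is a hypothetical illustration, not the design adopted in this issue: the function name `conf_for_renewal`, the prefix whitelist, and the rule that the RM's own settings win are all assumptions made for the sketch, and plain dicts stand in for Hadoop configurations.

```python
# Property prefixes a job may supply. This whitelist is an assumption made
# for the sketch; nothing in the thread specifies it.
RENEWAL_KEY_PREFIXES = (
    "dfs.ha.namenodes.",
    "dfs.namenode.rpc-address.",
    "dfs.client.failover.proxy.provider.",
)


def conf_for_renewal(rm_conf, job_conf):
    """Build the configuration used to renew one application's tokens.

    Only whitelisted keys are taken from the job, and the RM's own values
    take precedence, so a job can describe services the RM does not know
    about but cannot redirect renewal for services it already knows.
    """
    supplied = {
        key: value
        for key, value in job_conf.items()
        if key.startswith(RENEWAL_KEY_PREFIXES)
    }
    # A new dict is returned; rm_conf itself is never mutated.
    return {**supplied, **rm_conf}


# Hypothetical values, used for illustration only.
rm_conf = {
    "dfs.namenode.rpc-address.LOCALCLUSTER.nn1": "nn1.local.example.com:8020",
    "yarn.resourcemanager.hostname": "rm.local.example.com",
}
job_conf = {
    "dfs.namenode.rpc-address.REMOTECLUSTER.nn1": "nn1.remote.example.com:8020",
    "dfs.namenode.rpc-address.LOCALCLUSTER.nn1": "somewhere.else.example.com:8020",
    "yarn.resourcemanager.hostname": "ignored.example.com",
}
merged = conf_for_renewal(rm_conf, job_conf)
```

Letting the RM's values win is the conservative choice here. Whether a real implementation should allow overrides, and how it should validate job-supplied addresses, is a security question the sketch does not try to answer.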
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688526#comment-15688526 ]

Jian He commented on YARN-5910:
-------------------------------

I see. But I think the problem here is not about gathering the tokens at job submission. The problem is whether the RM is able to renew them. In this case, the RM cannot renew the token because it does not have the necessary HDFS config. So I think the "hadoop dtutil" functionality is orthogonal to this problem?
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688211#comment-15688211 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

dtutil allows you to fetch, bundle and alias multiple tokens for multiple services into a single file. This eliminates the need for job submission to gather all required tokens itself. (Job setup will fail to do so under specific circumstances, such as side input from a third cluster's service. Now humans or automated processes can do the work that YARN itself would be unable to do.)
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688123#comment-15688123 ]

Jian He commented on YARN-5910:
-------------------------------

[~aw], I think the problem here is that the RM cannot renew the delegation token because it lacks the configuration that maps the remote HDFS HA cluster's nameservice to NameNode addresses. Could you elaborate on how "hadoop dtutil" can solve this problem?
[jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
[ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1569#comment-1569 ]

Allen Wittenauer commented on YARN-5910:
----------------------------------------

Related: 3.0.0-alpha1 added 'hadoop dtutil' and the hadoop.token.files property. Between the two of them, end users can provide multiple DTs for multiple (and unrelated) clusters at job submission time.