[ https://issues.apache.org/jira/browse/SPARK-25689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780609#comment-16780609 ]

Yuming Wang commented on SPARK-25689:
-------------------------------------

Created PR [https://github.com/apache/spark/pull/23922] to fix the issue below:
{noformat}
19/02/28 06:13:35 ERROR YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details.
19/02/28 06:13:35 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Application application_1551247994025_11764 failed 2 times due to AM Container for appattempt_1551247994025_11764_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:https://resource-manager:50030/cluster/app/application_1551247994025_11764Then, click on links to logs of each attempt.
Diagnostics: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node1/10.210.18.72"; destination host is: "namenode":8020;
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node1/10.210.18.72"; destination host is: "namenode":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:785)
        at org.apache.hadoop.ipc.Client.call(Client.java:1430)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:690)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:653)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:740)
        at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
        at org.apache.hadoop.ipc.Client.call(Client.java:1402)
        ... 31 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
        at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
        ... 34 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:690)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:653)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:740)
        at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
        at org.apache.hadoop.ipc.Client.call(Client.java:1402)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
        at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
        ... 34 more
Caused by: Client cannot authenticate via:[TOKEN, KERBEROS]
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
        at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
        at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
        at org.apache.hadoop.ipc.Client.call(Client.java:1402)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
        at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:95)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:185)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2467)
        at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$5(SparkSession.scala:948)
        at scala.Option.getOrElse(Option.scala:138)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:48)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:315)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:168)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:196)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:87)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:932)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:941)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{noformat}
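
For context on the failure mode: the AM container dies while the NodeManager localizes the application's resources from HDFS (the {{FSDownload}} frames above) because no usable HDFS delegation tokens were available to authenticate with. The sketch below only illustrates the standard Hadoop/YARN pattern for shipping such tokens with the AM container; it is not the change in the PR, and the object name, method name, and {{renewer}} parameter are placeholders. The APIs shown ({{FileSystem.addDelegationTokens}}, {{Credentials.writeTokenStorageToStream}}, {{ContainerLaunchContext.setTokens}}) are real Hadoop/YARN APIs.

{code:scala}
import java.nio.ByteBuffer

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.io.DataOutputBuffer
import org.apache.hadoop.security.Credentials
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext

object TokenShippingSketch {
  // Illustrative only: obtain HDFS delegation tokens on the (Kerberos-authenticated)
  // client and attach them to the AM's launch context so that localization
  // (FSDownload) can talk to the NameNode.
  def attachHdfsTokens(conf: Configuration,
                       renewer: String, // typically the YARN RM principal
                       amContainer: ContainerLaunchContext): Unit = {
    val credentials = new Credentials()
    FileSystem.get(conf).addDelegationTokens(renewer, credentials)

    // Serialize the tokens; YARN hands this buffer to the NodeManager that
    // launches the AM container.
    val dob = new DataOutputBuffer()
    credentials.writeTokenStorageToStream(dob)
    amContainer.setTokens(ByteBuffer.wrap(dob.getData, 0, dob.getLength))
  }
}
{code}

When that token buffer is missing or unusable, the localization path fails exactly as in the trace above, with "Client cannot authenticate via:[TOKEN, KERBEROS]".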

> Move token renewal logic to driver in yarn-client mode
> ------------------------------------------------------
>
>                 Key: SPARK-25689
>                 URL: https://issues.apache.org/jira/browse/SPARK-25689
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 2.4.0
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> Currently, both in yarn-cluster and yarn-client mode, the YARN AM is 
> responsible for renewing delegation tokens. That differs from other RMs 
> (Mesos, and later k8s once it supports this functionality), and is one of the 
> roadblocks towards fully sharing the same delegation token-related code.
> We should look at keeping the renewal logic within the driver in yarn-client 
> mode. That would also remove the need to distribute the user's keytab to the 
> AM when running in that particular mode.
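
As a rough illustration of the driver-side renewal the description above proposes: the driver logs in from the keytab, periodically re-obtains delegation tokens as that user, and makes them available to its own Hadoop clients, so the keytab never has to be shipped to the AM. The sketch below is not Spark's implementation; the object, method, and parameter names are invented for the example, while the Hadoop APIs ({{UserGroupInformation.loginUserFromKeytabAndReturnUGI}}, {{FileSystem.addDelegationTokens}}, {{addCredentials}}) are real.

{code:scala}
import java.security.PrivilegedExceptionAction
import java.util.concurrent.{Executors, TimeUnit}

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

object DriverSideRenewalSketch {
  // Illustrative only: keep keytab-based token renewal in the driver process.
  def start(principal: String, keytab: String, renewer: String,
            conf: Configuration, obtainIntervalSec: Long): Unit = {
    // The driver owns the keytab; the AM never needs a copy of it.
    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
    val scheduler = Executors.newSingleThreadScheduledExecutor()

    val obtainTokens = new Runnable {
      override def run(): Unit = {
        val creds = new Credentials()
        // Fetch fresh HDFS delegation tokens as the keytab user...
        ugi.doAs(new PrivilegedExceptionAction[Unit] {
          override def run(): Unit = {
            FileSystem.get(conf).addDelegationTokens(renewer, creds)
          }
        })
        // ...and make them visible to the driver's own Hadoop clients. A real
        // implementation would also propagate them to executors (and the AM).
        UserGroupInformation.getCurrentUser.addCredentials(creds)
      }
    }

    // Re-obtain tokens well before the previous ones expire.
    scheduler.scheduleAtFixedRate(obtainTokens, 0, obtainIntervalSec, TimeUnit.SECONDS)
  }
}
{code}

On the YARN side, only the serialized tokens (as in the earlier sketch) would then need to reach the AM and executors, not the keytab itself.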



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
