Re: Creating new Spark context when running in Secure YARN fails
Feel free to correct me if I am wrong, but I believe this isn't a feature yet: "create a new Spark context within a single JVM process (driver)".

A few questions for you:

1) Is Kerberos set up correctly for you (the user)?
2) Could you please add the command/code you are executing? I am checking to see whether you provide a keytab and principal in your invocation.

- Neelesh S. Salian
Cloudera

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Creating-new-Spark-context-when-running-in-Secure-YARN-fails-tp25361p26873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
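For reference, the keytab and principal Neelesh asks about in (2) are supplied at launch time via the --principal and --keytab flags. A minimal sketch of assembling such an invocation (the principal and keytab path are placeholders, not values from this thread):

```python
# Sketch: build a yarn-client launch command that passes an explicit
# Kerberos principal and keytab so YARN can renew delegation tokens.
principal = "user@EXAMPLE.COM"    # placeholder principal
keytab = "/path/to/user.keytab"   # placeholder keytab path

cmd = [
    "./bin/pyspark",
    "--master", "yarn-client",
    "--principal", principal,
    "--keytab", keytab,
]
print(" ".join(cmd))
```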
Re: Creating new Spark context when running in Secure YARN fails
Can you try this: https://github.com/apache/spark/pull/9875. I believe this patch should fix the issue here.

Thanks,
Hari Shreedharan

> On Nov 11, 2015, at 1:59 PM, Ted Yu wrote:
>
> Please take a look at
> yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
> where this config is described
>
> Cheers
> [...]
Re: Creating new Spark context when running in Secure YARN fails
Please take a look at
yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala,
where this config is described.

Cheers

On Wed, Nov 11, 2015 at 1:45 PM, Michael V Le wrote:
> It looks like my config does not have "spark.yarn.credentials.file".
>
> I executed:
> sc._conf.getAll()
> [...]
>
> I am not really familiar with "spark.yarn.credentials.file" and had
> thought it was created automatically after communicating with YARN to get
> tokens.
>
> Thanks,
> Mike
> [...]
Re: Creating new Spark context when running in Secure YARN fails
It looks like my config does not have "spark.yarn.credentials.file".

I executed:

sc._conf.getAll()

[(u'spark.ssl.keyStore', u'xxx.keystore'), (u'spark.eventLog.enabled', u'true'),
(u'spark.ssl.keyStorePassword', u'XXX'), (u'spark.yarn.principal', u'XXX'),
(u'spark.master', u'yarn-client'), (u'spark.ssl.keyPassword', u'XXX'),
(u'spark.authenticate.sasl.serverAlwaysEncrypt', u'true'),
(u'spark.ssl.trustStorePassword', u'XXX'), (u'spark.ssl.protocol', u'TLSv1.2'),
(u'spark.authenticate.enableSaslEncryption', u'true'),
(u'spark.app.name', u'PySparkShell'), (u'spark.yarn.keytab', u'XXX.keytab'),
(u'spark.yarn.historyServer.address', u'xxx-001:18080'),
(u'spark.rdd.compress', u'True'),
(u'spark.eventLog.dir', u'hdfs://xxx-001:9000/user/hadoop/sparklogs'),
(u'spark.ssl.enabledAlgorithms', u'TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA'),
(u'spark.serializer.objectStreamReset', u'100'),
(u'spark.history.fs.logDirectory', u'hdfs://xxx-001:9000/user/hadoop/sparklogs'),
(u'spark.yarn.isPython', u'true'), (u'spark.submit.deployMode', u'client'),
(u'spark.ssl.enabled', u'true'), (u'spark.authenticate', u'true'),
(u'spark.ssl.trustStore', u'xxx.truststore')]

I am not really familiar with "spark.yarn.credentials.file" and had thought it was created automatically after communicating with YARN to get tokens.

Thanks,
Mike

From: Ted Yu
To: Michael V Le/Watson/IBM@IBMUS
Cc: user
Date: 11/11/2015 03:35 PM
Subject: Re: Creating new Spark context when running in Secure YARN fails

> I assume your config contains "spark.yarn.credentials.file" - otherwise the
> startExecutorDelegationTokenRenewer(conf) call would be skipped.
> [...]
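The presence check Mike performs by eye can also be done programmatically. A minimal sketch in plain Python (the helper name is made up; conf_pairs reproduces only a few of the (key, value) pairs that sc._conf.getAll() returned above):

```python
# Sketch: detect whether the renewer-enabling key is present in a conf dump.
# `conf_pairs` mimics the (key, value) list returned by sc._conf.getAll();
# only a few pairs from the dump above are reproduced here.
conf_pairs = [
    (u'spark.yarn.principal', u'XXX'),
    (u'spark.yarn.keytab', u'XXX.keytab'),
    (u'spark.master', u'yarn-client'),
    (u'spark.authenticate', u'true'),
]

def has_credentials_file(pairs):
    # Mirrors the conf.contains("spark.yarn.credentials.file") guard in the patch.
    return any(key == 'spark.yarn.credentials.file' for key, _ in pairs)

print(has_credentials_file(conf_pairs))  # False for the dump above
```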
Re: Creating new Spark context when running in Secure YARN fails
I assume your config contains "spark.yarn.credentials.file" - otherwise the startExecutorDelegationTokenRenewer(conf) call would be skipped.

On Wed, Nov 11, 2015 at 12:16 PM, Michael V Le wrote:
> Hi Ted,
>
> Thanks for the reply.
>
> I tried your patch but am having the same problem.
>
> I ran:
>
> ./bin/pyspark --master yarn-client
>
> >>> sc.stop()
> >>> sc = SparkContext()
>
> Same error dump as below.
>
> Do I need to pass something to the new SparkContext?
>
> Thanks,
> Mike
> [...]
Re: Creating new Spark context when running in Secure YARN fails
Hi Ted,

Thanks for the reply.

I tried your patch but am having the same problem.

I ran:

./bin/pyspark --master yarn-client

>>> sc.stop()
>>> sc = SparkContext()

Same error dump as below.

Do I need to pass something to the new SparkContext?

Thanks,
Mike

From: Ted Yu
To: Michael V Le/Watson/IBM@IBMUS
Cc: user
Date: 11/11/2015 01:55 PM
Subject: Re: Creating new Spark context when running in Secure YARN fails

> Looks like the delegation token should be renewed.
>
> Mind trying the following ?
> [...]
Re: Creating new Spark context when running in Secure YARN fails
Looks like the delegation token should be renewed.

Mind trying the following?

Thanks

diff --git a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
index 20771f6..e3c4a5a 100644
--- a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
+++ b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
@@ -53,6 +53,12 @@ private[spark] class YarnClientSchedulerBackend(
     logDebug("ClientArguments called with: " + argsArrayBuf.mkString(" "))
     val args = new ClientArguments(argsArrayBuf.toArray, conf)
     totalExpectedExecutors = args.numExecutors
+    // SPARK-8851: In yarn-client mode, the AM still does the credentials refresh. The driver
+    // reads the credentials from HDFS, just like the executors and updates its own credentials
+    // cache.
+    if (conf.contains("spark.yarn.credentials.file")) {
+      YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf)
+    }
     client = new Client(args, conf)
     appId = client.submitApplication()

@@ -63,12 +69,6 @@ private[spark] class YarnClientSchedulerBackend(

     waitForApplication()

-    // SPARK-8851: In yarn-client mode, the AM still does the credentials refresh. The driver
-    // reads the credentials from HDFS, just like the executors and updates its own credentials
-    // cache.
-    if (conf.contains("spark.yarn.credentials.file")) {
-      YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf)
-    }
     monitorThread = asyncMonitorApplication()
     monitorThread.start()
   }

On Wed, Nov 11, 2015 at 10:23 AM, mvle wrote:
> Hi,
>
> I've deployed a Secure YARN 2.7.1 cluster with HDFS encryption and am trying
> to run the pyspark shell using Spark 1.5.1.
>
> The pyspark shell works and I can run a sample code to calculate PI just fine.
> However, when I try to stop the current context (e.g., sc.stop()) and then
> create a new context (sc = SparkContext()), I get the error below.
>
> I have also seen errors such as: "token (HDFS_DELEGATION_TOKEN token 42 for
> hadoop) can't be found in cache".
>
> Does anyone know if it is possible to stop and create a new Spark context
> within a single JVM process (driver) and have that work when dealing with
> delegation tokens from Secure YARN/HDFS?
>
> Thanks.
>
> 15/11/11 10:19:53 INFO yarn.Client: Setting up container launch context for our AM
> 15/11/11 10:19:53 INFO yarn.Client: Setting up the launch environment for our AM container
> 15/11/11 10:19:53 INFO yarn.Client: Credentials file set to: credentials-37915c3e-1e90-44b9-add1-521598cea846
> 15/11/11 10:19:53 INFO yarn.YarnSparkHadoopUtil: getting token for namenode:
> hdfs://test6-allwkrbsec-001:9000/user/hadoop/.sparkStaging/application_1446695132208_0042
> 15/11/11 10:19:53 ERROR spark.SparkContext: Error initializing SparkContext.
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token
> can be issued only with kerberos or web authentication
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:6638)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:563)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:987)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
>     at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>     at com.sun.proxy.$Proxy12.getDelegationToken(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:933)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(
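For anyone skimming the dump above, the decisive line is the RemoteException. A small sketch (plain Python; the helper is hypothetical, and the log string is a short excerpt of the quoted dump) that pulls that root cause out of a captured driver log:

```python
# Sketch: extract the remote-exception summary from a captured driver log,
# so the root cause stands out from the surrounding INFO lines.
log = (
    "15/11/11 10:19:53 INFO yarn.Client: Setting up container launch context for our AM\n"
    "15/11/11 10:19:53 ERROR spark.SparkContext: Error initializing SparkContext.\n"
    "org.apache.hadoop.ipc.RemoteException(java.io.IOException): "
    "Delegation Token can be issued only with kerberos or web authentication\n"
)

def root_cause(log_text):
    # Return the message after the exception class on the RemoteException line.
    for line in log_text.splitlines():
        if "RemoteException" in line and "): " in line:
            return line.split("): ", 1)[1]
    return None

print(root_cause(log))  # Delegation Token can be issued only with kerberos or web authentication
```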