Re: Creating new Spark context when running in Secure YARN fails

2016-05-03 Thread nsalian
Feel free to correct me if I am wrong, but I believe this isn't a supported feature yet:
 "create a new Spark context within a single JVM process (driver)"

A few questions for you:

1) Is Kerberos set up correctly for you (the user)?
2) Could you please add the command/code you are executing?
I'd like to check whether you provide a keytab and principal in your invocation.
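For question 2, the check is mechanical: look for the keytab/principal keys in the driver's configuration. A minimal sketch, assuming a plain list of pairs standing in for the output of sc._conf.getAll() (the values below are made-up placeholders, not real credentials), so it runs without a cluster:

```python
# Stand-in for sc._conf.getAll(); values are placeholders. The two
# spark.yarn.* keys are what a keytab-based invocation would populate.
conf_pairs = [
    ("spark.master", "yarn-client"),
    ("spark.yarn.principal", "someuser@EXAMPLE.COM"),
    ("spark.yarn.keytab", "someuser.keytab"),
]

def has_kerberos_login(pairs):
    """True when both a principal and a keytab are present in the conf."""
    conf = dict(pairs)
    return "spark.yarn.principal" in conf and "spark.yarn.keytab" in conf

print(has_kerberos_login(conf_pairs))
```

With both keys present, as above, the check passes; dropping either key makes it fail.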



-
Neelesh S. Salian
Cloudera
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Creating-new-Spark-context-when-running-in-Secure-YARN-fails-tp25361p26873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Creating new Spark context when running in Secure YARN fails

2015-11-20 Thread Hari Shreedharan
Can you try https://github.com/apache/spark/pull/9875? I believe that patch
should fix the issue here.

Thanks,
Hari Shreedharan





Re: Creating new Spark context when running in Secure YARN fails

2015-11-11 Thread Ted Yu
Please take a look at
yarn/src/main/scala/org/apache/spark/deploy/yarn/AMDelegationTokenRenewer.scala
where this config is described

Cheers


Re: Creating new Spark context when running in Secure YARN fails

2015-11-11 Thread Michael V Le

It looks like my config does not have "spark.yarn.credentials.file".

I executed:
sc._conf.getAll()

[(u'spark.ssl.keyStore', u'xxx.keystore'), (u'spark.eventLog.enabled',
u'true'), (u'spark.ssl.keyStorePassword', u'XXX'),
(u'spark.yarn.principal', u'XXX'), (u'spark.master', u'yarn-client'),
(u'spark.ssl.keyPassword', u'XXX'),
(u'spark.authenticate.sasl.serverAlwaysEncrypt', u'true'),
(u'spark.ssl.trustStorePassword', u'XXX'), (u'spark.ssl.protocol',
u'TLSv1.2'), (u'spark.authenticate.enableSaslEncryption', u'true'),
(u'spark.app.name', u'PySparkShell'), (u'spark.yarn.keytab',
u'XXX.keytab'), (u'spark.yarn.historyServer.address', u'xxx-001:18080'),
(u'spark.rdd.compress', u'True'), (u'spark.eventLog.dir',
u'hdfs://xxx-001:9000/user/hadoop/sparklogs'),
(u'spark.ssl.enabledAlgorithms',
u'TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA'),
(u'spark.serializer.objectStreamReset', u'100'),
(u'spark.history.fs.logDirectory',
u'hdfs://xxx-001:9000/user/hadoop/sparklogs'), (u'spark.yarn.isPython',
u'true'), (u'spark.submit.deployMode', u'client'), (u'spark.ssl.enabled',
u'true'), (u'spark.authenticate', u'true'), (u'spark.ssl.trustStore',
u'xxx.truststore')]

I am not really familiar with "spark.yarn.credentials.file" and had thought
it was created automatically after communicating with YARN to get tokens.
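The absence of that key can be confirmed mechanically. Below, a trimmed list of pairs mimics the shape of the sc._conf.getAll() output above (values redacted), and plain Python shows that "spark.yarn.credentials.file" is not among the keys, which is why the renewer-start call discussed in this thread would be skipped:

```python
# A trimmed stand-in for the sc._conf.getAll() output above (values redacted).
conf_pairs = [
    ("spark.master", "yarn-client"),
    ("spark.yarn.principal", "XXX"),
    ("spark.yarn.keytab", "XXX.keytab"),
    ("spark.authenticate", "true"),
]

keys = {k for k, _ in conf_pairs}
# The key whose absence means startExecutorDelegationTokenRenewer(conf)
# is never called in yarn-client mode.
print("spark.yarn.credentials.file" in keys)
```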

Thanks,
Mike





Re: Creating new Spark context when running in Secure YARN fails

2015-11-11 Thread Ted Yu
I assume your config contains "spark.yarn.credentials.file" -
otherwise the startExecutorDelegationTokenRenewer(conf) call would be skipped.


Re: Creating new Spark context when running in Secure YARN fails

2015-11-11 Thread Michael V Le

Hi Ted,

Thanks for the reply.

I tried your patch but am having the same problem.

I ran:

./bin/pyspark --master yarn-client

>>> sc.stop()
>>> sc = SparkContext()

Same error dump as below.

Do I need to pass something to the new SparkContext?

Thanks,
Mike




Re: Creating new Spark context when running in Secure YARN fails

2015-11-11 Thread Ted Yu
Looks like the delegation token should be renewed.

Mind trying the following?

Thanks

diff --git a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
index 20771f6..e3c4a5a 100644
--- a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
+++ b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
@@ -53,6 +53,12 @@ private[spark] class YarnClientSchedulerBackend(
     logDebug("ClientArguments called with: " + argsArrayBuf.mkString(" "))
     val args = new ClientArguments(argsArrayBuf.toArray, conf)
     totalExpectedExecutors = args.numExecutors
+    // SPARK-8851: In yarn-client mode, the AM still does the credentials refresh. The driver
+    // reads the credentials from HDFS, just like the executors and updates its own credentials
+    // cache.
+    if (conf.contains("spark.yarn.credentials.file")) {
+      YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf)
+    }
     client = new Client(args, conf)
     appId = client.submitApplication()

@@ -63,12 +69,6 @@ private[spark] class YarnClientSchedulerBackend(

     waitForApplication()

-    // SPARK-8851: In yarn-client mode, the AM still does the credentials refresh. The driver
-    // reads the credentials from HDFS, just like the executors and updates its own credentials
-    // cache.
-    if (conf.contains("spark.yarn.credentials.file")) {
-      YarnSparkHadoopUtil.get.startExecutorDelegationTokenRenewer(conf)
-    }
     monitorThread = asyncMonitorApplication()
     monitorThread.start()
   }
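In effect, the diff above moves the conditional renewer start from after waitForApplication() to before client.submitApplication(). A minimal Python sketch of that reordering follows; plain functions stand in for the Spark/YARN calls, so this illustrates only the control flow, not Spark's actual code:

```python
def start_backend(conf, patched):
    """Record the order of calls made while starting the backend."""
    events = []

    def maybe_start_renewer():
        # Mirrors: if (conf.contains("spark.yarn.credentials.file")) { ... }
        if "spark.yarn.credentials.file" in conf:
            events.append("startExecutorDelegationTokenRenewer")

    if patched:
        maybe_start_renewer()       # patched: renewer starts before submit
    events.append("submitApplication")
    events.append("waitForApplication")
    if not patched:
        maybe_start_renewer()       # original: renewer starts after waiting
    events.append("startMonitorThread")
    return events

conf = {"spark.yarn.credentials.file": "hdfs://..."}
print(start_backend(conf, patched=True))
```

With the patch, the renewer is running before the application is submitted; without it, the renewer only starts after the application is up, and not at all when the config key is absent.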

On Wed, Nov 11, 2015 at 10:23 AM, mvle  wrote:

> Hi,
>
> I've deployed a Secure YARN 2.7.1 cluster with HDFS encryption and am trying
> to run the pyspark shell using Spark 1.5.1.
>
> The pyspark shell works, and I can run sample code to calculate Pi just fine.
> However, when I try to stop the current context (e.g., sc.stop()) and then
> create a new context (sc = SparkContext()), I get the error below.
>
> I have also seen errors such as: "token (HDFS_DELEGATION_TOKEN token 42 for
> hadoop) can't be found in cache".
>
> Does anyone know if it is possible to stop and create a new Spark context
> within a single JVM process (driver) and have that work when dealing with
> delegation tokens from Secure YARN/HDFS?
>
> Thanks.
>
> 15/11/11 10:19:53 INFO yarn.Client: Setting up container launch context for
> our AM
> 15/11/11 10:19:53 INFO yarn.Client: Setting up the launch environment for
> our AM container
> 15/11/11 10:19:53 INFO yarn.Client: Credentials file set to:
> credentials-37915c3e-1e90-44b9-add1-521598cea846
> 15/11/11 10:19:53 INFO yarn.YarnSparkHadoopUtil: getting token for
> namenode:
>
> hdfs://test6-allwkrbsec-001:9000/user/hadoop/.sparkStaging/application_1446695132208_0042
> 15/11/11 10:19:53 ERROR spark.SparkContext: Error initializing
> SparkContext.
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token
> can be issued only with kerberos or web authentication
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:6638)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:563)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:987)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy12.getDelegationToken(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:933)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(