[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-08-10 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680109#comment-14680109
 ] 

Thomas Graves commented on SPARK-9019:
--

We forgot to close out the JIRA. This was fixed by SPARK-8988; see the comments 
in the PR if people are interested.

 spark-submit fails on yarn with kerberos enabled
 

 Key: SPARK-9019
 URL: https://issues.apache.org/jira/browse/SPARK-9019
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.5.0
 Environment: Hadoop 2.6 with YARN and kerberos enabled
Reporter: Bolke de Bruin
  Labels: kerberos, spark-submit, yarn
 Attachments: debug-log-spark-1.5-fail, spark-submit-log-1.5.0-fail


 It is not possible to run jobs using spark-submit on yarn with a kerberized 
 cluster. 
 Commandline:
 /usr/hdp/2.2.0.0-2041/spark-1.5.0/bin/spark-submit --principal sparkjob 
 --keytab sparkjob.keytab --num-executors 3 --executor-cores 5 
 --executor-memory 5G --master yarn-cluster /tmp/get_peers.py 
 Fails with:
 15/07/13 22:48:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
 15/07/13 22:48:31 INFO server.AbstractConnector: Started 
 SelectChannelConnector@0.0.0.0:58380
 15/07/13 22:48:31 INFO util.Utils: Successfully started service 'SparkUI' on 
 port 58380.
 15/07/13 22:48:31 INFO ui.SparkUI: Started SparkUI at 
 http://10.111.114.9:58380
 15/07/13 22:48:31 INFO cluster.YarnClusterScheduler: Created 
 YarnClusterScheduler
 15/07/13 22:48:31 WARN metrics.MetricsSystem: Using default name DAGScheduler 
 for source because spark.app.id is not set.
 15/07/13 22:48:32 INFO util.Utils: Successfully started service 
 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43470.
 15/07/13 22:48:32 INFO netty.NettyBlockTransferService: Server created on 
 43470
 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Trying to register 
 BlockManager
 15/07/13 22:48:32 INFO storage.BlockManagerMasterEndpoint: Registering block 
 manager 10.111.114.9:43470 with 265.1 MB RAM, BlockManagerId(driver, 
 10.111.114.9, 43470)
 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Registered BlockManager
 15/07/13 22:48:32 INFO impl.TimelineClientImpl: Timeline service address: 
 http://lxhnl002.ad.ing.net:8188/ws/v1/timeline/
 15/07/13 22:48:33 WARN ipc.Client: Exception encountered while connecting to 
 the server : org.apache.hadoop.security.AccessControlException: Client cannot 
 authenticate via:[TOKEN, KERBEROS]
 15/07/13 22:48:33 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
 to rm2
 15/07/13 22:48:33 INFO retry.RetryInvocationHandler: Exception while invoking 
 getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm2 after 
 1 fail over attempts. Trying to fail over after sleeping for 32582ms.
 java.net.ConnectException: Call From lxhnl006.ad.ing.net/10.111.114.9 to 
 lxhnl013.ad.ing.net:8032 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
   at com.sun.proxy.$Proxy24.getClusterNodes(Unknown Source)
   at 
 org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
   at com.sun.proxy.$Proxy25.getClusterNodes(Unknown Source)
   at 
 org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
   at 
 

[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-08-07 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662211#comment-14662211
 ] 

Steve Loughran commented on SPARK-9019:
---

# If this problem exists (I don't have a test setup right now for various 
reasons) then it is a regression from 1.3.
# Like Thomas says, RM client tokens should get down to the AM automatically. 
If, however, these tokens are needed in the containers, then a delegation token 
is going to be needed; presumably that is what this patch does. However, that 
token will expire and then a new one is needed; SPARK-5342 was meant to address 
that. It should be creating the tokens and providing them on demand. Something 
is playing up there.

Regarding the patch, I don't know how well it would work in an RM-HA 
environment. Someone who understands the details of HA YARN would need to look 
at it.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-20 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633623#comment-14633623
 ] 

Thomas Graves commented on SPARK-9019:
--

The RM delegation token is only needed if the application is doing things like 
submitting other applications, killing applications, etc. Oozie uses this to 
launch jobs. We do not need to acquire it just to run the Spark application on 
YARN.

Are you doing something special to try to launch another job?


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-20 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633807#comment-14633807
 ] 

Bolke de Bruin commented on SPARK-9019:
---

As mentioned in the PR, the traces on the different clusters in the PR (not 
here) are from running the Pi example. My analysis shows that there has been a 
behavior change between Spark 1.3.0 and Spark 1.5.

It could be helpful if someone else did the same with debug logging turned on 
for the application and compared it with mine.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-18 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632341#comment-14632341
 ] 

Bolke de Bruin commented on SPARK-9019:
---

I have created PR #7489 for this issue.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-18 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632340#comment-14632340
 ] 

Apache Spark commented on SPARK-9019:
-

User 'bolkedebruin' has created a pull request for this issue:
https://github.com/apache/spark/pull/7489


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-17 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631367#comment-14631367
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Ok. I tracked the log entry down to a missing ResourceManager delegation 
token. To get rid of this, the following needs to be added to Client.scala in 
prepareLocalStorage, together with the relevant imports (restored below, since 
the archive stripped the string quotes and interpolation from the snippet). I 
am currently testing whether this solves the final issues as well; if so, I 
will prepare a patch.

import org.apache.hadoop.io.Text
import org.apache.hadoop.security.SecurityUtil
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.hadoop.yarn.util.ConverterUtils

obtainTokenForHBase(hadoopConf, credentials)

logInfo("Requesting RM delegation token")
val rmAddress = hadoopConf.getSocketAddr(YarnConfiguration.RM_ADDRESS,
  YarnConfiguration.DEFAULT_RM_ADDRESS, YarnConfiguration.DEFAULT_RM_PORT)
val renewer = SecurityUtil.getServerPrincipal(
  hadoopConf.get(YarnConfiguration.RM_PRINCIPAL), rmAddress.getHostName)
val protoToken = yarnClient.getRMDelegationToken(new Text(renewer))
val token = ConverterUtils.convertFromYarn(protoToken, rmAddress)
credentials.addToken(new Text(token.getService), token)
logInfo(s"RM delegation token added: service ${token.getService} with " +
  s"renewer ${renewer}, host was ${rmAddress.getHostName} and principal " +
  s"${hadoopConf.get(YarnConfiguration.RM_PRINCIPAL)}")



[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-16 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14629432#comment-14629432
 ] 

Bolke de Bruin commented on SPARK-9019:
---

I tried running this on an updated environment; however, it still fails, although the behavior is a bit different now. The task is now accepted but stays in the running state forever without executing anything. Please note that the trace below is without keytab usage, but with an authenticated user (kinit admin/admin).

15/07/16 04:27:34 DEBUG Client: getting client out of cache: 
org.apache.hadoop.ipc.Client@53abb73
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: 
[actor] received message AkkaMessage(ReviveOffers,false) from 
Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: 
Received RPC message: AkkaMessage(ReviveOffers,false)
15/07/16 04:27:34 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: 
[actor] handled message (1.632126 ms) AkkaMessage(ReviveOffers,false) from 
Actor[akka://sparkDriver/deadLetters]
15/07/16 04:27:34 DEBUG AbstractService: Service 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl is started
15/07/16 04:27:34 DEBUG AbstractService: Service 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
15/07/16 04:27:34 DEBUG Client: The ping interval is 6 ms.
15/07/16 04:27:34 DEBUG Client: Connecting to node6.local/10.79.10.6:8050
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin 
(auth:SIMPLE) 
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/16 04:27:34 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE

15/07/16 04:27:34 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE
auths {
  method: TOKEN
  mechanism: DIGEST-MD5
  protocol: 
  serverId: default
  challenge: 
realm="default",nonce="wjgFp9L22uDJt41FNtY9M8CP/T+dswfBoF48r9+s",qop="auth",charset=utf-8,algorithm=md5-sess
}
auths {
  method: KERBEROS
  mechanism: GSSAPI
  protocol: rm
  serverId: node6.local
}

15/07/16 04:27:34 DEBUG SaslRpcClient: Get token info proto:interface 
org.apache.hadoop.yarn.api.ApplicationClientProtocolPB 
info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@69990fa7
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Looking for a token with 
service 10.79.10.6:8050
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is 
YARN_AM_RM_TOKEN and the token's service name is 
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is 
HIVE_DELEGATION_TOKEN and the token's service name is 
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is 
TIMELINE_DELEGATION_TOKEN and the token's service name is 10.79.10.6:8188
15/07/16 04:27:34 DEBUG RMDelegationTokenSelector: Token kind is 
HDFS_DELEGATION_TOKEN and the token's service name is 10.79.10.4:8020
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException 
as:admin (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: 
Client cannot authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedAction as:admin 
(auth:SIMPLE) 
from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/16 04:27:34 WARN Client: Exception encountered while connecting to the 
server : org.apache.hadoop.security.AccessControlException: Client cannot 
authenticate via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG UserGroupInformation: PrivilegedActionException 
as:admin (auth:SIMPLE) cause:java.io.IOException: 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]
15/07/16 04:27:34 DEBUG Client: closing ipc connection to 
node6.local/10.79.10.6:8050: org.apache.hadoop.security.AccessControlException: 
Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client 
cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at 

[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625980#comment-14625980
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Tracing this down, it seems that the tokens are not being set on the container in yarn.Client, which is required according to
http://aajisaka.github.io/hadoop-project/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html.

Something like this:

  ByteBuffer fsTokens = ByteBuffer.wrap(dob.getData(), 0, dob.getLength());
  amContainer.setTokens(fsTokens);

in createContainerLaunchContext of 
yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
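The suggested fix can be sketched as a self-contained snippet. Note this is an illustration, not Spark's actual code: the Hadoop `Credentials.writeTokenStorageToStream(dob)` call and `DataOutputBuffer` are simulated with plain `java.io` (and a fake token string) so the wrapping logic runs without Hadoop on the classpath; in the real `createContainerLaunchContext` the resulting buffer would be passed to `amContainer.setTokens(...)`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class TokenBufferSketch {
    // Mirrors ByteBuffer.wrap(dob.getData(), 0, dob.getLength()) from the
    // suggestion above. `fakeToken` stands in for the serialized credentials
    // that credentials.writeTokenStorageToStream(dob) would produce.
    static ByteBuffer wrapTokens(String fakeToken) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(baos);
        out.writeUTF(fakeToken); // real code: credentials.writeTokenStorageToStream(dob)
        out.flush();
        byte[] data = baos.toByteArray();
        // Wrap only the bytes actually written, as dob.getLength() would report.
        return ByteBuffer.wrap(data, 0, data.length);
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer fsTokens = wrapTokens("HDFS_DELEGATION_TOKEN");
        // In yarn.Client this buffer would be attached to the AM's
        // ContainerLaunchContext via amContainer.setTokens(fsTokens).
        System.out.println(fsTokens.remaining());
    }
}
```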


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626005#comment-14626005
 ] 

Sean Owen commented on SPARK-9019:
--

Same as SPARK-8851?


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626043#comment-14626043
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Will try in a few minutes. However, it did not happen only when using keytabs; it also occurred when using the user's own credentials.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626183#comment-14626183
 ] 

Bolke de Bruin commented on SPARK-9019:
---

[~srowen] Unfortunately, the patch from SPARK-8851 did not solve the issue; the trace remains the same.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626113#comment-14626113
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Now with debug info (not yet with patch):

15/07/14 11:03:49 DEBUG UserGroupInformation: PrivilegedAction as:yx66jx 
(auth:SIMPLE) 
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
15/07/14 11:03:49 DEBUG SaslRpcClient: Sending sasl message state: NEGOTIATE

15/07/14 11:03:49 DEBUG SaslRpcClient: Received SASL message state: NEGOTIATE
auths {
  method: TOKEN
  mechanism: DIGEST-MD5
  protocol: 
  serverId: default
  challenge: 
realm="default",nonce="XXX",qop="auth",charset=utf-8,algorithm=md5-sess
}
auths {
  method: KERBEROS
  mechanism: GSSAPI
  protocol: rm
  serverId: lxhnl002.ad.ing.net
}

15/07/14 11:03:49 DEBUG SaslRpcClient: Get token info proto:interface 
org.apache.hadoop.yarn.api.ApplicationClientProtocolPB 
info:org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo$2@5c53714b
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Looking for a token with 
service 10.111.114.16:8032
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
YARN_AM_RM_TOKEN and the token's service name is 
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
HIVE_DELEGATION_TOKEN and the token's service name is 
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
TIMELINE_DELEGATION_TOKEN and the token's service name is 10.111.114.16:8188
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
HDFS_DELEGATION_TOKEN and the token's service name is 10.111.114.16:8020
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
HDFS_DELEGATION_TOKEN and the token's service name is 10.111.114.17:8020
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
HDFS_DELEGATION_TOKEN and the token's service name is ha-hdfs:hdpnlcb
15/07/14 11:03:49 DEBUG UserGroupInformation: PrivilegedActionException 
as:yx66jx (auth:SIMPLE) 
cause:org.apache.hadoop.security.AccessControlException: Client cannot 
authenticate via:[TOKEN, KERBEROS]
15/07/14 11:03:49 DEBUG UserGroupInformation: PrivilegedAction as:yx66jx 
(auth:SIMPLE) 
from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
15/07/14 11:03:49 WARN Client: Exception encountered while connecting to the 
server : org.apache.hadoop.security.AccessControlException: Client cannot 
authenticate via:[TOKEN, KERBEROS]
15/07/14 11:03:49 DEBUG UserGroupInformation: PrivilegedActionException 
as:yx66jx (auth:SIMPLE) cause:java.io.IOException: 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]



auth:SIMPLE is what worries me.


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626256#comment-14626256
 ] 

Bolke de Bruin commented on SPARK-9019:
---

And some more debugging information. Please note the selected auth:SIMPLE 
method.



15/07/14 11:03:45 INFO ApplicationMaster: Registered signal handlers for [TERM, 
HUP, INT]
15/07/14 11:03:45 DEBUG Shell: setsid exited with exit code 0
15/07/14 11:03:45 DEBUG MutableMetricsFactory: field 
org.apache.hadoop.metrics2.lib.MutableRate 
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of 
successful kerberos logins and latency (milliseconds)], about=, valueName=Time, 
type=DEFAULT, always=false, sampleName=Ops)
15/07/14 11:03:45 DEBUG MutableMetricsFactory: field 
org.apache.hadoop.metrics2.lib.MutableRate 
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[Rate of failed 
kerberos logins and latency (milliseconds)], about=, valueName=Time, 
type=DEFAULT, always=false, sampleName=Ops)
15/07/14 11:03:45 DEBUG MutableMetricsFactory: field 
org.apache.hadoop.metrics2.lib.MutableRate 
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(value=[GetGroups], 
about=, valueName=Time, type=DEFAULT, always=false, sampleName=Ops)
15/07/14 11:03:45 DEBUG MetricsSystemImpl: UgiMetrics, User and group related 
metrics
15/07/14 11:03:45 DEBUG Groups:  Creating new Groups object
15/07/14 11:03:45 DEBUG NativeCodeLoader: Trying to load the custom-built 
native-hadoop library...
15/07/14 11:03:45 DEBUG NativeCodeLoader: Failed to load native-hadoop with 
error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
15/07/14 11:03:45 DEBUG NativeCodeLoader: 
java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15/07/14 11:03:45 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/07/14 11:03:45 DEBUG PerformanceAdvisory: Falling back to shell based
15/07/14 11:03:45 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping 
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
15/07/14 11:03:45 DEBUG Groups: Group mapping 
impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; 
cacheTimeout=30; warningDeltaMs=5000
15/07/14 11:03:45 DEBUG YarnSparkHadoopUtil: running as user: yx66jx
15/07/14 11:03:45 DEBUG UserGroupInformation: hadoop login
15/07/14 11:03:45 DEBUG UserGroupInformation: hadoop login commit
15/07/14 11:03:45 DEBUG UserGroupInformation: using kerberos user:null
15/07/14 11:03:45 DEBUG UserGroupInformation: using local user:UnixPrincipal: 
yx66jx
15/07/14 11:03:45 DEBUG UserGroupInformation: Using user: UnixPrincipal: 
yx66jx with name yx66jx
15/07/14 11:03:45 DEBUG UserGroupInformation: User entry: yx66jx
15/07/14 11:03:45 DEBUG UserGroupInformation: UGI loginUser:yx66jx 
(auth:KERBEROS)
15/07/14 11:03:45 DEBUG UserGroupInformation: PrivilegedAction as:yx66jx 
(auth:SIMPLE) 
from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
15/07/14 11:03:46 INFO ApplicationMaster: ApplicationAttemptId: 
appattempt_1436783220608_0085_01
15/07/14 11:03:46 DEBUG BlockReaderLocal: 
dfs.client.use.legacy.blockreader.local = false
15/07/14 11:03:46 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = true
15/07/14 11:03:46 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic 
= false
15/07/14 11:03:46 DEBUG BlockReaderLocal: dfs.domain.socket.path = 
/var/lib/hadoop-hdfs/dn_socket
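One way to read the auth mismatch in the log above (a toy Python model with hypothetical names, not Hadoop's actual UserGroupInformation API): a keytab login produces a UGI tagged KERBEROS, while a freshly created remote-user UGI defaults to SIMPLE and carries no Kerberos credentials, which is consistent with the PrivilegedAction running as auth:SIMPLE.

```python
from dataclasses import dataclass

@dataclass
class Ugi:
    """Toy stand-in for a Hadoop UserGroupInformation instance."""
    user: str
    auth: str  # "KERBEROS" or "SIMPLE"

def login_from_keytab(principal: str) -> Ugi:
    # A keytab login yields a KERBEROS-tagged UGI
    # (cf. "UGI loginUser:yx66jx (auth:KERBEROS)" above).
    return Ugi(principal, "KERBEROS")

def create_remote_user(user: str) -> Ugi:
    # A remote-user UGI defaults to SIMPLE and holds no Kerberos credentials
    # (cf. "PrivilegedAction as:yx66jx (auth:SIMPLE)" above).
    return Ugi(user, "SIMPLE")

print(login_from_keytab("yx66jx").auth)   # prints KERBEROS
print(create_remote_user("yx66jx").auth)  # prints SIMPLE
```

If the user code is run under the second kind of UGI, its RPC calls cannot authenticate via Kerberos even though the process logged in from a keytab.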


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626300#comment-14626300
 ] 

Bolke de Bruin commented on SPARK-9019:
---

It might be that we have a configuration issue (but I'm not sure):

15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Looking for a token with 
service 10.111.114.16:8032
15/07/14 11:03:49 DEBUG RMDelegationTokenSelector: Token kind is 
YARN_AM_RM_TOKEN and the token's service name is 

I think the token's service name should match 10.111.114.16:8032, but it appears to be empty.
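The matching referred to here can be sketched as follows (a toy Python version with hypothetical names, not Hadoop's actual RMDelegationTokenSelector): the selector only returns a token whose kind and service string both match exactly, so a token with an empty service name can never satisfy a lookup for 10.111.114.16:8032.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Token:
    """Toy stand-in for a Hadoop delegation token."""
    kind: str
    service: str

def select_token(service: str, tokens: List[Token],
                 kind: str = "RM_DELEGATION_TOKEN") -> Optional[Token]:
    # Return the first token whose kind and service both match exactly.
    for t in tokens:
        if t.kind == kind and t.service == service:
            return t
    return None

# The log above shows a token of kind YARN_AM_RM_TOKEN with an empty
# service name, so a lookup for service 10.111.114.16:8032 finds nothing.
tokens = [Token(kind="YARN_AM_RM_TOKEN", service="")]
print(select_token("10.111.114.16:8032", tokens))  # prints None
```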


[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-14 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626574#comment-14626574
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Can this be related to YARN-3103?

 spark-submit fails on yarn with kerberos enabled
 

 Key: SPARK-9019
 URL: https://issues.apache.org/jira/browse/SPARK-9019
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.5.0
 Environment: Hadoop 2.6 with YARN and kerberos enabled
Reporter: Bolke de Bruin
  Labels: kerberos, spark-submit, yarn

 It is not possible to run jobs using spark-submit on yarn with a kerberized 
 cluster. 
 Commandline:
 /usr/hdp/2.2.0.0-2041/spark-1.5.0/bin/spark-submit --principal sparkjob 
 --keytab sparkjob.keytab --num-executors 3 --executor-cores 5 
 --executor-memory 5G --master yarn-cluster /tmp/get_peers.py 
 Fails with:
 15/07/13 22:48:31 INFO server.Server: jetty-8.y.z-SNAPSHOT
 15/07/13 22:48:31 INFO server.AbstractConnector: Started 
 SelectChannelConnector@0.0.0.0:58380
 15/07/13 22:48:31 INFO util.Utils: Successfully started service 'SparkUI' on 
 port 58380.
 15/07/13 22:48:31 INFO ui.SparkUI: Started SparkUI at 
 http://10.111.114.9:58380
 15/07/13 22:48:31 INFO cluster.YarnClusterScheduler: Created 
 YarnClusterScheduler
 15/07/13 22:48:31 WARN metrics.MetricsSystem: Using default name DAGScheduler 
 for source because spark.app.id is not set.
 15/07/13 22:48:32 INFO util.Utils: Successfully started service 
 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43470.
 15/07/13 22:48:32 INFO netty.NettyBlockTransferService: Server created on 
 43470
 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Trying to register 
 BlockManager
 15/07/13 22:48:32 INFO storage.BlockManagerMasterEndpoint: Registering block 
 manager 10.111.114.9:43470 with 265.1 MB RAM, BlockManagerId(driver, 
 10.111.114.9, 43470)
 15/07/13 22:48:32 INFO storage.BlockManagerMaster: Registered BlockManager
 15/07/13 22:48:32 INFO impl.TimelineClientImpl: Timeline service address: 
 http://lxhnl002.ad.ing.net:8188/ws/v1/timeline/
 15/07/13 22:48:33 WARN ipc.Client: Exception encountered while connecting to 
 the server : org.apache.hadoop.security.AccessControlException: Client cannot 
 authenticate via:[TOKEN, KERBEROS]
 15/07/13 22:48:33 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
 to rm2
 15/07/13 22:48:33 INFO retry.RetryInvocationHandler: Exception while invoking 
 getClusterNodes of class ApplicationClientProtocolPBClientImpl over rm2 after 
 1 fail over attempts. Trying to fail over after sleeping for 32582ms.
 java.net.ConnectException: Call From lxhnl006.ad.ing.net/10.111.114.9 to 
 lxhnl013.ad.ing.net:8032 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
   at org.apache.hadoop.ipc.Client.call(Client.java:1472)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
   at com.sun.proxy.$Proxy24.getClusterNodes(Unknown Source)
   at 
 org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:262)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
   at com.sun.proxy.$Proxy25.getClusterNodes(Unknown Source)
   at 
 org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:475)
   at 
 org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend$$anonfun$getDriverLogUrls$1.apply(YarnClusterSchedulerBackend.scala:92)
   at 
 

[jira] [Commented] (SPARK-9019) spark-submit fails on yarn with kerberos enabled

2015-07-13 Thread Bolke de Bruin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625349#comment-14625349
 ] 

Bolke de Bruin commented on SPARK-9019:
---

Please note that the keytab was picked up and the Kerberos login succeeded:

15/07/13 22:48:17 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
15/07/13 22:48:18 INFO yarn.Client: Attempting to login to the Kerberos using 
principal: sparkjob and keytab: /etc/security/keytabs/sparkjob.keytab
15/07/13 22:48:18 INFO security.UserGroupInformation: Login successful for user 
sparkjob using keytab file /etc/security/keytabs/sparkjob.keytab
15/07/13 22:48:18 INFO yarn.Client: Successfully logged into the KDC.

