Re: Spark 2.3 error on Kubernetes

2018-05-29 Thread Anirudh Ramanathan
Interesting.
Perhaps you could try resolving service addresses from within a pod and
seeing if there's some other issue causing intermittent failures in
resolution.
The steps here

may
be helpful.

On Tue, May 29, 2018 at 4:02 PM, purna pradeep 
wrote:

> Abirudh,
>
> Thanks for your response
>
> I’m running k8s cluster on AWS and kub-dns pods are running fine and also
> as I mentioned only 1 executor pod is running though I requested for 5 and
> rest 4 were killed with below error and I do have enough resources
> available.
>
> On Tue, May 29, 2018 at 6:28 PM Anirudh Ramanathan 
> wrote:
>
>> This looks to me like a kube-dns error that's causing the driver DNS
>> address to not resolve.
>> It would be worth double checking that kube-dns is indeed running (in the
>> kube-system namespace).
>> Often, with environments like minikube, kube-dns may exit/crashloop due
>> to lack of resource.
>>
>> On Tue, May 29, 2018 at 3:18 PM, purna pradeep 
>> wrote:
>>
>>> Hello,
>>>
>>> I’m getting below  error when I spark-submit a Spark 2.3 app on
>>> Kubernetes *v1.8.3* , some of the executor pods  were killed with below
>>> error as soon as they come up
>>>
>>> Exception in thread "main" java.lang.reflect.
>>> UndeclaredThrowableException
>>>
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(
>>> UserGroupInformation.java:1713)
>>>
>>> at org.apache.spark.deploy.SparkHadoopUtil.
>>> runAsSparkUser(SparkHadoopUtil.scala:64)
>>>
>>> at org.apache.spark.executor.
>>> CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.
>>> scala:188)
>>>
>>> at org.apache.spark.executor.
>>> CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.
>>> scala:293)
>>>
>>> at org.apache.spark.executor.
>>> CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
>>>
>>> Caused by: org.apache.spark.SparkException: Exception thrown in
>>> awaitResult:
>>>
>>> at org.apache.spark.util.ThreadUtils$.awaitResult(
>>> ThreadUtils.scala:205)
>>>
>>> at org.apache.spark.rpc.RpcTimeout.awaitResult(
>>> RpcTimeout.scala:75)
>>>
>>> at org.apache.spark.rpc.RpcEnv.
>>> setupEndpointRefByURI(RpcEnv.scala:101)
>>>
>>> at org.apache.spark.executor.
>>> CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(
>>> CoarseGrainedExecutorBackend.scala:201)
>>>
>>> at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(
>>> SparkHadoopUtil.scala:65)
>>>
>>> at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(
>>> SparkHadoopUtil.scala:64)
>>>
>>> at java.security.AccessController.doPrivileged(Native
>>> Method)
>>>
>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(
>>> UserGroupInformation.java:1698)
>>>
>>> ... 4 more
>>>
>>> Caused by: java.io.IOException: Failed to connect to
>>> spark-1527629824987-driver-svc.spark.svc:7078
>>>
>>> at org.apache.spark.network.
>>> client.TransportClientFactory.createClient(TransportClientFactory.java:
>>> 245)
>>>
>>> at org.apache.spark.network.
>>> client.TransportClientFactory.createClient(TransportClientFactory.java:
>>> 187)
>>>
>>> at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(
>>> NettyRpcEnv.scala:198)
>>>
>>> at org.apache.spark.rpc.netty.
>>> Outbox$$anon$1.call(Outbox.scala:194)
>>>
>>> at org.apache.spark.rpc.netty.
>>> Outbox$$anon$1.call(Outbox.scala:190)
>>>
>>> at java.util.concurrent.FutureTask.run(FutureTask.
>>> java:266)
>>>
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1149)
>>>
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:624)
>>>
>>> at java.lang.Thread.run(Thread.java:748)
>>>
>>> Caused by: java.net.UnknownHostException: spark-1527629824987-driver-
>>> svc.spark.svc
>>>
>>> at java.net.InetAddress.getAllByName0(InetAddress.
>>> java:1280)
>>>
>>> at java.net.InetAddress.getAllByName(InetAddress.java:
>>> 1192)
>>>
>>> at java.net.InetAddress.getAllByName(InetAddress.java:
>>> 1126)
>>>
>>> at java.net.InetAddress.getByName(InetAddress.java:1076)
>>>
>>> at io.netty.util.internal.SocketUtils$8.run(SocketUtils.
>>> java:146)
>>>
>>> at io.netty.util.internal.SocketUtils$8.run(SocketUtils.
>>> java:143)
>>>
>>> at java.security.AccessController.doPrivileged(Native
>>> Method)
>>>
>>> at io.netty.util.internal.SocketUtils.addressByName(
>>> SocketUtils.java:143)
>>>
>>>  

Re: Spark 2.3 error on Kubernetes

2018-05-29 Thread purna pradeep
Abirudh,

Thanks for your response

I’m running k8s cluster on AWS and kub-dns pods are running fine and also
as I mentioned only 1 executor pod is running though I requested for 5 and
rest 4 were killed with below error and I do have enough resources
available.

On Tue, May 29, 2018 at 6:28 PM Anirudh Ramanathan 
wrote:

> This looks to me like a kube-dns error that's causing the driver DNS
> address to not resolve.
> It would be worth double checking that kube-dns is indeed running (in the
> kube-system namespace).
> Often, with environments like minikube, kube-dns may exit/crashloop due to
> lack of resource.
>
> On Tue, May 29, 2018 at 3:18 PM, purna pradeep 
> wrote:
>
>> Hello,
>>
>> I’m getting below  error when I spark-submit a Spark 2.3 app on
>> Kubernetes *v1.8.3* , some of the executor pods  were killed with below
>> error as soon as they come up
>>
>> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>>
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
>>
>> at
>> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
>>
>> at
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
>>
>> at
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
>>
>> at
>> org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
>>
>> Caused by: org.apache.spark.SparkException: Exception thrown in
>> awaitResult:
>>
>> at
>> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>>
>> at
>> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>>
>> at
>> org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
>>
>> at
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
>>
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
>>
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
>>
>> at java.security.AccessController.doPrivileged(Native
>> Method)
>>
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>>
>> ... 4 more
>>
>> Caused by: java.io.IOException: Failed to connect to
>> spark-1527629824987-driver-svc.spark.svc:7078
>>
>> at
>> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
>>
>> at
>> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
>>
>> at
>> org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
>>
>> at
>> org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
>>
>> at
>> org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
>>
>> at
>> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>
>> at java.lang.Thread.run(Thread.java:748)
>>
>> Caused by: java.net.UnknownHostException:
>> spark-1527629824987-driver-svc.spark.svc
>>
>> at
>> java.net.InetAddress.getAllByName0(InetAddress.java:1280)
>>
>> at
>> java.net.InetAddress.getAllByName(InetAddress.java:1192)
>>
>> at
>> java.net.InetAddress.getAllByName(InetAddress.java:1126)
>>
>> at java.net.InetAddress.getByName(InetAddress.java:1076)
>>
>> at
>> io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
>>
>> at
>> io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
>>
>> at java.security.AccessController.doPrivileged(Native
>> Method)
>>
>> at
>> io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
>>
>> at
>> io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
>>
>> at
>> io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
>>
>> at
>> io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
>>
>> at
>> io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
>>
>> at
>> io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
>>
>> at
>> 

Re: Spark 2.3 error on Kubernetes

2018-05-29 Thread Anirudh Ramanathan
This looks to me like a kube-dns error that's causing the driver DNS
address to not resolve.
It would be worth double checking that kube-dns is indeed running (in the
kube-system namespace).
Often, with environments like minikube, kube-dns may exit/crashloop due to
lack of resource.

On Tue, May 29, 2018 at 3:18 PM, purna pradeep 
wrote:

> Hello,
>
> I’m getting below  error when I spark-submit a Spark 2.3 app on Kubernetes
> *v1.8.3* , some of the executor pods  were killed with below error as
> soon as they come up
>
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1713)
>
> at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(
> SparkHadoopUtil.scala:64)
>
> at org.apache.spark.executor.
> CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
>
> at org.apache.spark.executor.
> CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
>
> at org.apache.spark.executor.CoarseGrainedExecutorBackend.
> main(CoarseGrainedExecutorBackend.scala)
>
> Caused by: org.apache.spark.SparkException: Exception thrown in
> awaitResult:
>
> at org.apache.spark.util.ThreadUtils$.awaitResult(
> ThreadUtils.scala:205)
>
> at org.apache.spark.rpc.RpcTimeout.awaitResult(
> RpcTimeout.scala:75)
>
> at org.apache.spark.rpc.RpcEnv.
> setupEndpointRefByURI(RpcEnv.scala:101)
>
> at org.apache.spark.executor.
> CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(
> CoarseGrainedExecutorBackend.scala:201)
>
> at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(
> SparkHadoopUtil.scala:65)
>
> at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(
> SparkHadoopUtil.scala:64)
>
> at java.security.AccessController.doPrivileged(Native
> Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1698)
>
> ... 4 more
>
> Caused by: java.io.IOException: Failed to connect to
> spark-1527629824987-driver-svc.spark.svc:7078
>
> at org.apache.spark.network.client.TransportClientFactory.
> createClient(TransportClientFactory.java:245)
>
> at org.apache.spark.network.client.TransportClientFactory.
> createClient(TransportClientFactory.java:187)
>
> at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(
> NettyRpcEnv.scala:198)
>
> at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.
> scala:194)
>
> at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.
> scala:190)
>
> at java.util.concurrent.FutureTask.run(FutureTask.
> java:266)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1149)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> Caused by: java.net.UnknownHostException: spark-1527629824987-driver-
> svc.spark.svc
>
> at java.net.InetAddress.getAllByName0(InetAddress.
> java:1280)
>
> at java.net.InetAddress.getAllByName(InetAddress.java:
> 1192)
>
> at java.net.InetAddress.getAllByName(InetAddress.java:
> 1126)
>
> at java.net.InetAddress.getByName(InetAddress.java:1076)
>
> at io.netty.util.internal.SocketUtils$8.run(SocketUtils.
> java:146)
>
> at io.netty.util.internal.SocketUtils$8.run(SocketUtils.
> java:143)
>
> at java.security.AccessController.doPrivileged(Native
> Method)
>
> at io.netty.util.internal.SocketUtils.addressByName(
> SocketUtils.java:143)
>
> at io.netty.resolver.DefaultNameResolver.doResolve(
> DefaultNameResolver.java:43)
>
> at io.netty.resolver.SimpleNameResolver.resolve(
> SimpleNameResolver.java:63)
>
> at io.netty.resolver.SimpleNameResolver.resolve(
> SimpleNameResolver.java:55)
>
> at io.netty.resolver.InetSocketAddressResolver.doResolve(
> InetSocketAddressResolver.java:57)
>
> at io.netty.resolver.InetSocketAddressResolver.doResolve(
> InetSocketAddressResolver.java:32)
>
> at io.netty.resolver.AbstractAddressResolver.resolve(
> AbstractAddressResolver.java:108)
>
> at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(
> Bootstrap.java:208)
>
> at io.netty.bootstrap.Bootstrap.
> access$000(Bootstrap.java:49)
>
> at io.netty.bootstrap.Bootstrap$
> 1.operationComplete(Bootstrap.java:188)
>
> at io.netty.bootstrap.Bootstrap$
> 

Spark 2.3 error on Kubernetes

2018-05-29 Thread purna pradeep
Hello,

I’m getting below  error when I spark-submit a Spark 2.3 app on Kubernetes
*v1.8.3* , some of the executor pods  were killed with below error as soon
as they come up

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)

at
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)

at
org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)

at
org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)

at
org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)

Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:

at
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)

at
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)

at
org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)

at
org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)

at
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)

at
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)

at java.security.AccessController.doPrivileged(Native
Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)

... 4 more

Caused by: java.io.IOException: Failed to connect to
spark-1527629824987-driver-svc.spark.svc:7078

at
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)

at
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)

at
org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)

at
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)

at
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.net.UnknownHostException:
spark-1527629824987-driver-svc.spark.svc

at java.net.InetAddress.getAllByName0(InetAddress.java:1280)

at java.net.InetAddress.getAllByName(InetAddress.java:1192)

at java.net.InetAddress.getAllByName(InetAddress.java:1126)

at java.net.InetAddress.getByName(InetAddress.java:1076)

at
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)

at
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)

at java.security.AccessController.doPrivileged(Native
Method)

at
io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)

at
io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)

at
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)

at
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)

at
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)

at
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)

at
io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)

at
io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)

at
io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:49)

at
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:188)

at
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:174)

at
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)

at
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)

at
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)

at
io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)

at
io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)

at

Spark 2.3 error on kubernetes

2018-05-29 Thread Mamillapalli, Purna Pradeep
Hello,


I’m getting below intermittent error when I spark-submit a Spark 2.3 app on 
Kubernetes v1.8.3 , some of the executor pods  were killed with below error as 
soon as they come up


Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at 
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at 
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at 
org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to 
spark-1527629824987-driver-svc.spark.svc:7078
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at 
org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at 
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at 
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: 
spark-1527629824987-driver-svc.spark.svc
at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at 
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at 
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
at java.security.AccessController.doPrivileged(Native Method)
at 
io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
at 
io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
at 
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
at 
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
at 
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
at 
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
at 
io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
at 
io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:49)
at 
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:188)
at 
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:174)
at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at 
io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at 
io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at 

Spark 2.3 error on kubernetes

2018-05-29 Thread Mamillapalli, Purna Pradeep
Hello,


I’m getting below intermittent error when I spark-submit a Spark 2.3 app on 
Kubernetes v1.8.3 , some of the executor pods  were killed with below error as 
soon as they come up


Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at 
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at 
org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at 
org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to 
spark-1527629824987-driver-svc.spark.svc:7078
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at 
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at 
org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at 
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at 
org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: 
spark-1527629824987-driver-svc.spark.svc
at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at 
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at 
io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
at java.security.AccessController.doPrivileged(Native Method)
at 
io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
at 
io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
at 
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
at 
io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
at 
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
at 
io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
at 
io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
at 
io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:49)
at 
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:188)
at 
io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:174)
at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at 
io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at 
io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at