Hi! I created a spark cluster on openshift using radanalytics.io I'm trying to execute the SparPi sample using
spark-submit --name sparkpi-2 \--master spark://hans:7077 \ --deploy-mode cluster \ --class org.apache.spark.examples.SparkPi \ /opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar 1 The task is submitted and I see RUNNING state in the UI but it never ends because the logs shows UnknownHostException 21/08/27 07:45:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(1017200000); groups with view permissions: Set(); users with modify permissions: Set(1017200000); groups with modify permissions: Set() Exception in thread "main" java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713) at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188) at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:285) at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala) Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult: at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226) at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101) at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201) at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65) at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) ... 4 more Caused by: java.io.IOException: Failed to connect to hans-w-1-7jgtb:44139 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245) at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187) at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198) at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194) at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.UnknownHostException: hans-w-1-7jgtb at java.net.InetAddress.getAllByName0(InetAddress.java:1281) at java.net.InetAddress.getAllByName(InetAddress.java:1193) at java.net.InetAddress.getAllByName(InetAddress.java:1127) at java.net.InetAddress.getByName(InetAddress.java:1077) at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146) at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143) at java.security.AccessController.doPrivileged(Native Method) at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143) at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43) at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63) at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55) at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57) at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32) at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108) at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:202) at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:48) at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:182) at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:168) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604) at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:985) at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:505) at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:416) at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:475) at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518) at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more I could resolve this with giving pods fix hostnames and create headless services for them. But with this it is difficult to scale the spark cluster. Is there another solution ??? Best Regards MARKUS Fiducia & GAD IT AG | www.fiduciagad.de AG Frankfurt a. M. HRB 102381 | Sitz der Gesellschaft: Frankfurt a. M. | USt-IdNr. DE 143582320 Vorstand: Martin Beyer (Vorstandssprecher), Ulrich Coenen (Vorstandssprecher), Daniela Bücker, Birgit Frohnhoff, Jörg Staff, Ralf Teufel Vorsitzender des Aufsichtsrats: Jürgen Brinkmann