Here is the log file from the worker node:

15/09/30 23:49:37 INFO Worker: Executor app-20150930233113-0000/8 finished with state EXITED message Command exited with code 1 exitStatus 1
15/09/30 23:49:37 INFO Worker: Asked to launch executor app-20150930233113-0000/9 for PythonPi
15/09/30 23:49:37 INFO SecurityManager: Changing view acls to: juser
15/09/30 23:49:37 INFO SecurityManager: Changing modify acls to: juser
15/09/30 23:49:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(juser); users with modify permissions: Set(juser)
15/09/30 23:49:37 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-8-oracle/jre/bin/java" "-cp" "/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/sbin/../conf/:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/spark-assembly-1.5.0-hadoop2.6.0.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar" "-Xms100M" "-Xmx100M" "-Dspark.driver.port=36363" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@172.31.61.43:36363/user/CoarseGrainedScheduler" "--executor-id" "9" "--hostname" "172.31.51.246" "--cores" "2" "--app-id" "app-20150930233113-0000" "--worker-url" "akka.tcp://sparkWorker@172.31.51.246:41893/user/Worker"
15/09/30 23:51:40 INFO Worker: Executor app-20150930233113-0000/9 finished with state EXITED message Command exited with code 1 exitStatus 1
15/09/30 23:51:40 INFO Worker: Asked to launch executor app-20150930233113-0000/10 for PythonPi
15/09/30 23:51:40 INFO SecurityManager: Changing view acls to: juser
15/09/30 23:51:40 INFO SecurityManager: Changing modify acls to: juser
15/09/30 23:51:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(juser); users with modify permissions: Set(juser)
15/09/30 23:51:40 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-8-oracle/jre/bin/java" "-cp" "/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/sbin/../conf/:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/spark-assembly-1.5.0-hadoop2.6.0.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/juser/press-mgmt/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar" "-Xms100M" "-Xmx100M" "-Dspark.driver.port=36363" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@172.31.61.43:36363/user/CoarseGrainedScheduler" "--executor-id" "10" "--hostname" "172.31.51.246" "--cores" "2" "--app-id" "app-20150930233113-0000" "--worker-url" "akka.tcp://sparkWorker@172.31.51.246:41893/user/Worker"
Nothing stands out to me in the log. One thing I learned from reading other posts is that machines should be identified by their fully qualified hostname, not just by short hostname or IP address. I have not been using fully qualified hostnames. Could that cause this problem?

On Wed, Sep 30, 2015 at 11:23 PM, Shixiong Zhu <zsxw...@gmail.com> wrote:

> Do you have the log file? It may be because of wrong settings.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-10-01 7:32 GMT+08:00 markluk <m...@juicero.com>:
>
>> I set up a new Spark cluster. My worker node is dying with the following
>> exception.
>>
>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
>>         at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>         at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>         at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>         at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>         at scala.concurrent.Await$.result(package.scala:107)
>>         at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
>>         ... 11 more
>>
>> Any ideas what's wrong? This is happening both for a Spark program and the
>> Spark shell.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Worker-node-timeout-exception-tp24893.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
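P.S. To check the hostname theory, here is the kind of quick sanity check I'd run on the driver and on each worker (just an illustration I'd run by hand, not output from the cluster above): if `socket.getfqdn()` comes back as the bare short name or `localhost`, the hosts may be advertising names the other side cannot resolve.

```python
import socket

# Compare the short hostname with what the resolver reports as the FQDN.
short_name = socket.gethostname()
fqdn = socket.getfqdn()

print("short hostname :", short_name)
print("fully qualified:", fqdn)

# If the "FQDN" is just the short name or localhost, the DNS / /etc/hosts
# setup may be incomplete on this machine.
if fqdn in (short_name, "localhost"):
    print("warning: no distinct fully qualified name resolved on this host")
```

Running this on every node and confirming each machine resolves the others' names both ways would at least rule the hostname issue in or out.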
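Separately, while waiting for a real diagnosis, one workaround I've seen suggested elsewhere (an assumption on my part, not a confirmed fix) is to raise the RPC timeout above the 120 seconds in the stack trace via `spark-submit --conf`. The property names below are the Spark 1.x ones; the 300-second value is an arbitrary trial setting:

```python
# Sketch: assemble the --conf flags one might pass to spark-submit.
# Property names are from the Spark 1.x configuration docs; 300s is a
# placeholder trial value, not a recommendation.
timeout_conf = {
    "spark.network.timeout": "300s",  # general network/ask timeout (default 120s)
    "spark.akka.timeout": "300",      # Akka remoting timeout, in seconds
}

flags = " ".join(f"--conf {k}={v}" for k, v in timeout_conf.items())
print(flags)
```

If the job still dies with the longer timeout, that would point at a real connectivity problem (such as the hostname question above) rather than a slow handshake.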