I see the following error from time to time when I try to start slaves on Spark 1.4.0:
[hadoop@ip-10-0-27-240 apps]$ pwd
/mnt/var/log/apps
[hadoop@ip-10-0-27-240 apps]$ cat spark-hadoop-org.apache.spark.deploy.worker.Worker-1-ip-10-0-27-240.ec2.internal.out
Spark Command: /usr/java/latest/bin/java -cp /home/hadoop/spark/conf/:/home/hadoop/conf/:/home/hadoop/spark/classpath/distsupplied/*:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar:/usr/share/aws/emr/auxlib/*:/home/hadoop/.versions/spark-1.4.0.b/sbin/../conf/:/home/hadoop/.versions/spark-1.4.0.b/lib/spark-assembly-1.4.0-hadoop2.4.0.jar:/home/hadoop/.versions/spark-1.4.0.b/lib/datanucleus-core-3.2.10.jar:/home/hadoop/.versions/spark-1.4.0.b/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/.versions/spark-1.4.0.b/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/conf/:/home/hadoop/conf/ -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -Xms2048m -Xmx2048m -XX:MaxPermSize=128m org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://ip-10-0-27-185.ec2.internal:7077
========================================
15/08/27 21:10:25 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/08/27 21:10:26 INFO SecurityManager: Changing view acls to: hadoop
15/08/27 21:10:26 INFO SecurityManager: Changing modify acls to: hadoop
15/08/27 21:10:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/08/27 21:10:26 INFO Slf4jLogger: Slf4jLogger started
15/08/27 21:10:26 INFO Remoting: Starting remoting
Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at akka.remote.Remoting.start(Remoting.scala:180)
    at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
    at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)
    at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
    at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
    at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
    at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
    at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:122)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
    at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
    at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
    at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
    at org.apache.spark.deploy.worker.Worker$.startSystemAndActor(Worker.scala:553)
    at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:533)
    at org.apache.spark.deploy.worker.Worker.main(Worker.scala)
15/08/27 21:10:39 INFO Utils: Shutdown hook called
Heap
 par new generation total 613440K, used 338393K [0x0000000778000000, 0x00000007a1990000, 0x00000007a1990000)
  eden space 545344K, 62% used [0x0000000778000000, 0x000000078ca765b0, 0x0000000799490000)
  from space 68096K, 0% used [0x0000000799490000, 0x0000000799490000, 0x000000079d710000)
  to space 68096K, 0% used [0x000000079d710000, 0x000000079d710000, 0x00000007a1990000)
 concurrent mark-sweep generation total 1415616K, used 0K [0x00000007a1990000, 0x00000007f8000000, 0x00000007f8000000)
 concurrent-mark-sweep perm gen total 21248K, used 19285K [0x00000007f8000000, 0x00000007f94c0000, 0x0000000800000000)