Re: Spark @ EC2: Futures timed out Ask timed out
Did you launch the cluster using the spark-ec2 script? Make sure all ports are open in the security group for the master and slave instances. From the error, it seems it's not able to connect to the driver program (port 58360).

Thanks
Best Regards

On Tue, Mar 17, 2015 at 3:26 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi,

I've been trying to run a simple SparkWordCount app on EC2, but it looks like my apps are not succeeding/completing. I'm suspecting some sort of communication issue. I used the SparkWordCount app from http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/

Digging through logs I found this:

15/03/16 21:28:20 INFO Utils: Successfully started service 'driverPropsFetcher' on port 58123.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1563)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:60)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:115)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:163)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:127)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    ... 4 more

Or exceptions like:

Caused by: akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://sparkDriver@ip-10-111-222-111.ec2.internal:58360/), Path(/user/CoarseGrainedScheduler)]] after [3 ms]
    at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:333)
    at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
    at scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
    at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
    at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
    at java.lang.Thread.run(Thread.java:745)

This is in EC2 and I have ports 22, 7077, 8080, and 8081 open to any source. But maybe I need to do something, too? I do see Master sees Workers and Workers do connect to the Master.

I did run this in spark-shell, and it runs without problems:

scala> val something = sc.parallelize(1 to 1000).collect().filter(_1000

This is how I submitted the job (on the Master machine):

$ spark-1.2.1-bin-hadoop2.4/bin/spark-submit --class com.cloudera.sparkwordcount.SparkWordCount --executor-memory 256m --master spark://ip-10-171-32-62:7077 wc-spark/target/sparkwordcount-0.0.1-SNAPSHOT.jar /usr/share/dict/words 0

Any help would be greatly appreciated.

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr Elasticsearch Support * http://sematext.com/
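
For anyone landing on this thread with the same trace: the driver host and port that the executor failed to reach are embedded in the AskTimeoutException message. The snippet below is only a rough sketch of pulling them out of a log line so you know which port to check in the security group. The function name `driver_endpoint` and the regular expression are made up for illustration and are not part of Spark or Akka; the pattern assumes the `akka.tcp://name@host:port` address form seen in the trace above.

```python
import re
from typing import Optional, Tuple

# Assumed address shape, taken from the trace in this thread:
#   akka.tcp://sparkDriver@ip-10-111-222-111.ec2.internal:58360/
AKKA_ADDRESS = re.compile(
    r"akka\.tcp://(?P<system>[^@\s]+)@(?P<host>[^:/\s]+):(?P<port>\d+)"
)


def driver_endpoint(log_line: str) -> Optional[Tuple[str, int]]:
    """Return (host, port) from the first akka.tcp address in log_line, or None."""
    match = AKKA_ADDRESS.search(log_line)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))


# The exception line from the original post, split only for readability.
line = (
    "Caused by: akka.pattern.AskTimeoutException: Ask timed out on "
    "[ActorSelection[Anchor(akka.tcp://sparkDriver@ip-10-111-222-111.ec2.internal:58360/), "
    "Path(/user/CoarseGrainedScheduler)]] after [3 ms]"
)
print(driver_endpoint(line))
```

Here the extracted port is 58360, the one pointed out in the reply, and it is not among the ports (22, 7077, 8080, 8081) that were open.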
Re: Spark @ EC2: Futures timed out Ask timed out
Hi Akhil,

Thanks! I think that was it. Had to open a bunch of ports (didn't use spark-ec2, so it didn't do that for me) and the app works fine now.

Otis

On Tue, Mar 17, 2015 at 3:26 AM, Akhil Das ak...@sigmoidanalytics.com wrote:

Did you launch the cluster using spark-ec2 script? Just make sure all ports are open for master, slave instances security group. From the error, it seems it's not able to connect to the driver program (port 58360)

Thanks
Best Regards
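
A quick way to confirm a fix like this is to test, from a worker machine, whether the driver's port actually accepts TCP connections. What follows is a minimal sketch using only the Python standard library; `port_is_reachable` is a hypothetical helper name, and the host and ports you pass in are whatever your own logs show, not values this code knows about.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling `port_is_reachable("ip-10-111-222-111.ec2.internal", 58360)` on a worker while the job is running should return False when the security group blocks that port and True once it is open. A timeout rather than an immediate refusal usually suggests a firewall or security group is dropping the packets, though that is a rule of thumb and not something this check can prove. Spark's configuration also documents a `spark.driver.port` setting for pinning the driver port; whether that was needed here is not something the thread covers.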