Hi Rory, for starters, what version of Spark are you using? I believe that in one of the 1.5.? releases (I don't know which one off the top of my head) a change was added so that the TimeoutException also displays the config property that controls the timeout. That might help somewhat if you are able to upgrade.
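In the meantime, one thing worth trying is raising Spark's Akka timeout properties on the command line. This is only a sketch, with some caveats: the exact property names depend on your Spark 1.x version, I am not certain any of them cover the Remoting startup future that is failing here (Akka's own `akka.remote.startup-timeout` defaults to 10s, which matches your error, but I don't know whether Spark exposes it), and `your-job.jar` is a placeholder:

```shell
# Sketch only: spark.akka.timeout and spark.akka.lookupTimeout are Spark 1.x
# properties (values in seconds); whether they apply to the Remoting startup
# timeout in your version is not certain.
spark-submit \
  --master local[*] \
  --conf spark.akka.timeout=300 \
  --conf spark.akka.lookupTimeout=300 \
  --class com.sdl.nntrainer.NNTrainer \
  your-job.jar
```

If neither property helps, it would at least narrow things down to the Akka-internal startup timeout rather than a Spark-level one.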
On Jul 18, 2016 9:34 AM, "Rory Waite" <rwa...@sdl.com> wrote:
> Hi All,
>
> We have created a regression test for a Spark job that is executed during
> our automated build. It executes a spark-submit with a local master,
> processes some data, and then exits. We have an issue in that we get a
> non-deterministic timeout error. It seems to occur when the Spark context
> tries to initialise Akka (stack trace below). It doesn't happen often, but
> when it does it causes the whole build to fail.
>
> The machines that run these tests get very heavily loaded, with many
> regression tests running simultaneously. My theory is that the spark-submit
> is sometimes unable to initialise Akka in time because the machines are so
> heavily loaded with the other tests. My first thought was to try to tune
> some parameter to extend the timeout, but I couldn't find anything in the
> documentation. The timeout is short at 10s, whereas the default Akka
> timeout is set at 100s.
>
> Is there a way to adjust this timeout?
>
> 16/07/17 00:04:22 ERROR SparkContext: Error initializing SparkContext.
> java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.result(package.scala:107)
>   at akka.remote.Remoting.start(Remoting.scala:179)
>   at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
>   at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:620)
>   at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:617)
>   at akka.actor.ActorSystemImpl._start(ActorSystem.scala:617)
>   at akka.actor.ActorSystemImpl.start(ActorSystem.scala:634)
>   at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
>   at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
>   at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
>   at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
>   at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
>   at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>   at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955)
>   at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
>   at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
>   at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
>   at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
>   at com.sdl.nntrainer.NNTrainer$.main(NNTrainer.scala:418)
>   at com.sdl.nntrainer.NNTrainer.main(NNTrainer.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 16/07/17 00:04:22 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 16/07/17 00:04:22 INFO SparkContext: Successfully stopped SparkContext
> Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>   at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>   at scala.concurrent.Await$.result(package.scala:107)
>   at akka.remote.Remoting.start(Remoting.scala:179)
>   at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
>   at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:620)
>   at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:617)
>   at akka.actor.ActorSystemImpl._start(ActorSystem.scala:617)
>   at akka.actor.ActorSystemImpl.start(ActorSystem.scala:634)
>   at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
>   at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
>   at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
>   at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
>   at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
>   at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964)
>   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>   at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955)
>   at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
>   at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
>   at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
>   at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
>   at com.sdl.nntrainer.NNTrainer$.main(NNTrainer.scala:418)
>   at com.sdl.nntrainer.NNTrainer.main(NNTrainer.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)