I also encountered a similar problem: after some stages, all the tasks are assigned to one machine, and stage execution gets slower and slower.
*[the spark conf setting]*

val conf = new SparkConf()
  .setMaster(sparkMaster)
  .setAppName("ModelTraining")
  .setSparkHome(sparkHome)
  .setJars(List(jarFile))
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "LRRegistrator")
conf.set("spark.storage.memoryFraction", "0.7")
conf.set("spark.executor.memory", "8g")
conf.set("spark.cores.max", "150")
conf.set("spark.speculation", "true")
conf.set("spark.storage.blockManagerHeartBeatMs", "300000")
val sc = new SparkContext(conf)
val lines = sc.textFile("hdfs://xxx:52310" + inputPath, 3)
val trainset = lines.map(parseWeightedPoint).repartition(50).persist(StorageLevel.MEMORY_ONLY)

*[the warn log from the spark]*

14/09/19 10:26:23 WARN TaskSetManager: Loss was due to fetch failure from BlockManagerId(45, TS-BH109, 48384, 0)
14/09/19 10:27:18 WARN TaskSetManager: Lost TID 726 (task 14.0:9)
14/09/19 10:29:03 WARN SparkDeploySchedulerBackend: Ignored task status update (737 state FAILED) from unknown executor Actor[akka.tcp://sparkExecutor@TS-BH96:33178/user/Executor#-913985102] with ID 39
14/09/19 10:29:03 WARN TaskSetManager: Loss was due to fetch failure from BlockManagerId(30, TS-BH136, 28518, 0)
14/09/19 11:01:22 WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(47, TS-BH136, 31644, 0) with no recent heart beats: 47765ms exceeds 45000ms

Any suggestions?

On Thu, Sep 18, 2014 at 4:46 PM, shishu <shi...@zamplus.com> wrote:
> Hi dear all~
>
> My spark application sometimes runs much slower than it used to, so I
> wonder why this happens.
>
> I found that after the repartition in stage 17, all tasks go to one
> executor. But in my code, I only use repartition at the very beginning.
>
> In my application, before stage 17, every stage ran successfully within 1
> minute, but after stage 17, every stage costs more than 10 minutes.
> Normally my application runs successfully and finishes within 9 minutes.
>
> My spark version is 0.9.1, and my program is written in Scala.
>
> I took some screenshots; you can see them in the archive.
>
> Great thanks if you can help~
>
> Shi Shu
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
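For anyone hitting the same symptom: when all tasks of a shuffle stage land on one executor, it is worth checking whether the data has collapsed onto a single partition. The sketch below is only a hypothetical, self-contained illustration (it is not Spark's actual repartition code path, and `PartitionSkewDemo` is a made-up name) of why hash-based partitioning in the style of Spark's HashPartitioner sends every record with the same key to the same partition:

```scala
object PartitionSkewDemo {
  // Mimic HashPartitioner-style assignment: a non-negative modulus of the
  // key's hashCode over the number of partitions.
  def partitionFor(key: Any, numPartitions: Int): Int = {
    val rawMod = key.hashCode % numPartitions
    rawMod + (if (rawMod < 0) numPartitions else 0)
  }

  def main(args: Array[String]): Unit = {
    val numPartitions = 50
    // If every record carries the same key, every record maps to the
    // same partition index, so one task gets all the data.
    val keys = Seq.fill(1000)("sameKey")
    val perPartition = keys.groupBy(k => partitionFor(k, numPartitions))
    println(s"partitions used: ${perPartition.size}")
  }
}
```

If your records do carry heavily repeated keys before a keyed shuffle, spreading them (for example by salting the key) is one common workaround; that does not explain fetch failures or heartbeat timeouts, though, which point more toward overloaded or lost executors.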