Re: Spark tuning increase number of active tasks

2015-10-31 Thread Sandy Ryza
Hi Xiaochuan, The most likely cause of the "Lost container" issue is that YARN is killing containers for exceeding memory limits. If this is the case, you should be able to find instances of "exceeding memory limits" in the application logs. http://blog.cloudera.com/blog/2015/03/how-to-tune-your-
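
A common remedy in that situation is to raise the off-heap headroom YARN accounts for on top of the executor heap. The snippet below is a minimal PySpark sketch, assuming the Spark 1.x property name spark.yarn.executor.memoryOverhead; the sizes are illustrative, not taken from this thread.

# Sketch: give each executor extra off-heap headroom (in MB) so YARN does not
# kill the container for exceeding its memory limit. Values are illustrative.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("memory-overhead-sketch")
        .set("spark.executor.memory", "6g")                 # JVM heap per executor
        .set("spark.yarn.executor.memoryOverhead", "1024")  # extra MB YARN reserves per container (Spark 1.x property name)
        .set("spark.executor.cores", "2"))

sc = SparkContext(conf=conf)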

Re: Spark tuning increase number of active tasks

2015-10-31 Thread Jörn Franke
Maybe Hortonworks support can help you better. Otherwise you may want to adjust the YARN scheduler configuration and preemption settings. Do you use something like speculative execution? How do you launch the programs? Maybe you are already using all cores of the master... > On 30 Oct
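
For reference, speculative execution is controlled by a few Spark properties; whether it helps here is unknown. The sketch below uses Spark 1.x property names with illustrative values, not the poster's actual configuration.

# Sketch: enable speculative execution so straggling tasks get relaunched elsewhere.
# Whether this helps depends on the workload; these are not the poster's settings.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("speculation-sketch")
        .set("spark.speculation", "true")
        .set("spark.speculation.interval", "100")     # ms between checks for stragglers (Spark 1.x units)
        .set("spark.speculation.multiplier", "1.5")   # a task is speculatable if 1.5x slower than the median
        .set("spark.speculation.quantile", "0.75"))   # fraction of tasks that must finish before checking

sc = SparkContext(conf=conf)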

RE: Spark tuning increase number of active tasks

2015-10-30 Thread YI, XIAOCHUAN
Hi, Our team has a 40-node Hortonworks Hadoop cluster 2.2.4.2-2 (36 data nodes) with Apache Spark 1.2 and 1.4 installed. Each node has 64G RAM and 8 cores. We are only able to use <= 72 executors with executor-cores=2, so we only get 144 active tasks when running PySpark programs. [Stag
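
For context, the number of concurrently running tasks is roughly num-executors × executor-cores, so 72 × 2 = 144 matches what is reported. The back-of-envelope sketch below shows how far the cluster could theoretically scale if YARN is allowed to hand out most of each node's cores; the one-reserved-core assumption is illustrative, and the real ceiling depends on yarn.nodemanager.resource.cpu-vcores and yarn.nodemanager.resource.memory-mb, which are not given in the thread.

# Back-of-envelope executor sizing; node counts and cores come from the thread,
# the YARN headroom (1 core reserved per node for OS/NodeManager) is an assumption.
data_nodes = 36
cores_per_node = 8
yarn_vcores_per_node = cores_per_node - 1      # assumed: 1 core reserved per node
executor_cores = 2

executors_per_node = yarn_vcores_per_node // executor_cores   # 3
max_executors = data_nodes * executors_per_node               # 108
concurrent_tasks = max_executors * executor_cores             # 216

print(max_executors, concurrent_tasks)  # 108 216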
