No, I am not setting the number of executors anywhere (neither in the env file nor in the program).
Could it be due to the large number of small files?
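For reference, this is roughly how the files are being read, plus one workaround I could try. This is only a sketch: the HDFS path and the partition count are illustrative guesses, not values from this job.

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal sketch of the read path plus a possible workaround.
    // The HDFS path and the partition count are illustrative guesses only.
    object SmallFilesRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("small-files-read"))

        // Each of the ~600K tiny files becomes at least one input split/task.
        val raw = sc.textFile("hdfs:///data/json/*")   // hypothetical path

        // Collapse the huge number of tiny partitions so tasks can be
        // scheduled across all executors instead of piling up on one node.
        val spread = raw.coalesce(16, shuffle = true)  // 16 = 4 nodes x 4 cores, a guess

        println("partitions after coalesce: " + spread.partitions.length)
        spread.count()   // trivial action, just to trigger the job

        sc.stop()
      }
    }

The idea would be that with 600K+ tiny files each file becomes at least one task, so coalescing (with shuffle = true) might let the scheduler spread the work across all executors rather than leaving it on one node.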
On Wed, May 20, 2015 at 5:11 PM, ayan guha <guha.a...@gmail.com> wrote:

> What does your spark env file say? Are you setting the number of executors
> in the spark context?
>
> On 20 May 2015 13:16, "Shailesh Birari" <sbirar...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a 4-node Spark 1.3.1 cluster. All four nodes have 4 cores and
>> 64 GB of RAM.
>> I have around 600,000+ JSON files on HDFS. Each file is small, around
>> 1 KB in size. The total data is around 16 GB. The Hadoop block size is
>> 256 MB.
>> My application reads these files with the sc.textFile() (or sc.jsonFile(),
>> tried both) API. But all the files are getting read by only one node
>> (4 executors). The Spark UI shows all 600K+ tasks on one node and 0 on
>> the other nodes.
>>
>> I confirmed that all files are accessible from all nodes. Another
>> application, which uses big files, uses all nodes on the same cluster.
>>
>> Can you please let me know why it is behaving this way?
>>
>> Thanks,
>> Shailesh
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Job-not-using-all-nodes-in-cluster-tp22951.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.