Hi all, I have a standalone-mode Spark cluster (no HDFS) with 10 machines, each with 40 CPU cores and 128 GB of RAM. My application is a Spark SQL application that reads the "tpch_100g" database from MySQL and runs the TPC-H queries. When loading tables from MySQL into Spark, I split the biggest table, "lineitem", into 600 partitions.
When my application runs, the executors page of the Spark application web UI shows only 40 executors (spark.executor.memory = 1g, spark.executor.cores = 1), and all of them are on the same machine. This is very slow, because all tasks run in parallel on only one machine.
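For reference, this is roughly how I submit the application (the master host and jar name below are placeholders, not my real paths):

```
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.memory=1g \
  --conf spark.executor.cores=1 \
  tpch-app.jar
```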