Hi all, I have a standalone-mode Spark cluster (no HDFS) with 10 machines, each with 40 CPU cores and 128 GB of RAM. My application is a Spark SQL application that reads the "tpch_100g" database from MySQL and runs the TPC-H queries. When loading tables from MySQL into Spark, I split the biggest table, "lineitem", into 600 partitions.
When my application runs, the executors page of the Spark application web UI shows only 40 executors (spark.executor.memory = 1g, spark.executor.cores = 1), and all of them are on the same machine. This is very slow, because all tasks run in parallel on only one machine.
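For reference, this is roughly how I submit the application (the master host and jar name below are placeholders, not my real paths):

```
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.memory=1g \
  --conf spark.executor.cores=1 \
  tpch-app.jar
```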