Hi,

I'm running Spark 1.6.1 on a single machine, initially a small one (8 cores, 16 GB RAM), passing "--master local[*]" to spark-submit, and I'm trying to see scaling as I increase the core count, so far unsuccessfully. Initially I set SPARK_EXECUTOR_INSTANCES=1 and increase the cores per executor. I've been setting cores per executor either with "SPARK_EXECUTOR_CORES=1" (up to 4) or with "--conf spark.executor.cores=1 --conf spark.executor.memory=9g". I'm also repartitioning the RDD of the large dataset into 4/8/10 partitions for the different runs.
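For reference, the repartitioning step is essentially the following (a minimal sketch, not my exact job; the object name, input path, and argument handling are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    object RepartitionSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("RepartitionSketch"))
        // 4, 8 or 10 depending on the run
        val numPartitions = args(0).toInt
        // "/path/to/dataset" stands in for my real input
        val docs = sc.textFile("/path/to/dataset").repartition(numPartitions)
        // this matches the per-stage task counts I see in the logs
        println(s"partitions = ${docs.getNumPartitions}")
        sc.stop()
      }
    }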
Am I setting executors/cores correctly for Spark 1.6 running in local/standalone mode? The logs show the same overall timings for the key stages whether I ask for 1, 4, or 8 cores per executor (within a stage the number of tasks does match the partitioning value), and the process table suggests the requested cores aren't actually being used. I know that e.g. "--num-executors=X" is only an argument for YARN. I can't find instructions in one place for setting these parameters (executors/cores) when Spark runs on a single machine.

An example of my full spark-submit command is:

    SPARK_EXECUTOR_INSTANCES=1 SPARK_EXECUTOR_CORES=4 spark-submit \
      --master local[*] \
      --conf spark.executor.cores=4 \
      --conf spark.executor.memory=9g \
      --class asap.examples.mllib.TfIdfExample \
      /home/ubuntu/spark-1.6.1-bin-hadoop2.6/asap_ml/target/scala-2.10/ml-operators_2.10-1.0.jar

The settings are duplicated here, but it shows the different ways I've been setting the parameters.

Thanks,
Karen
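P.S. In case it helps to see what I'm comparing, here are a few lines I could add right after creating the SparkContext (sc) in the sketch above to print the settings Spark actually resolved (illustrative only, not output from a real run; "unset" is just a fallback label for a key that was never applied):

    // Sketch: inspect what the local-mode driver actually picked up.
    println(sc.master)                // the master URL, e.g. local[*]
    println(sc.defaultParallelism)    // the parallelism Spark derived from it
    println(sc.getConf.get("spark.executor.cores", "unset"))
    println(sc.getConf.get("spark.executor.memory", "unset"))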