I don’t think the number of CPU cores controls the “number of parallel tasks”. The number of tasks corresponds first and foremost to the number of (DStream) RDD partitions.
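
A rough sketch of that relationship, assuming one task per partition per stage and one core slot per task (Spark's default); the function name and the example numbers are hypothetical, and real scheduling (locality, skew, delays) is ignored:

```python
import math


def task_waves(num_partitions, num_executors, executor_cores):
    """Return (concurrent task slots, scheduling waves) for one stage.

    Simplified model: the number of tasks equals the number of partitions,
    while the executor cores setting only caps how many run at once.
    """
    slots = num_executors * executor_cores
    waves = math.ceil(num_partitions / slots)
    return slots, waves


# Hypothetical example: 200 partitions on 4 executors with 5 cores each.
slots, waves = task_waves(num_partitions=200, num_executors=4, executor_cores=5)
print(f"tasks=200, concurrent slots={slots}, waves={waves}")
```

Under this model, raising the cores setting does not create more tasks; it only lets more of the existing tasks run at the same time.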
The Spark documentation doesn’t say what a “Task” is in standard multithreading terminology, i.e. a thread or a process, so your point is a good one.

PS: Time and time again, every product, dev team, and company invents its own terminology, so 50% of the time spent using a product goes to deciphering it and reinventing the wheel.

From: Mulugeta Mammo [mailto:mulugeta.abe...@gmail.com]
Sent: Thursday, May 28, 2015 7:24 PM
To: Ruslan Dautkhanov
Cc: user
Subject: Re: Value for SPARK_EXECUTOR_CORES

Thanks for the valuable information. The blog states:

"The cores property controls the number of concurrent tasks an executor can run. --executor-cores 5 means that each executor can run a maximum of five tasks at the same time."

So I guess the maximum number of executor-cores I can assign is the CPU count (which includes the number of threads per core), not just the number of cores. I just want to be sure what the term "cores" means in Spark.

Thanks

On Thu, May 28, 2015 at 11:16 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

It's not only about cores. Keep in mind spark.executor.cores also affects the memory available to each task:

From http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/

The memory available to each task is (spark.executor.memory * spark.shuffle.memoryFraction * spark.shuffle.safetyFraction) / spark.executor.cores. Memory fraction and safety fraction default to 0.2 and 0.8 respectively.

I'd test spark.executor.cores with 2, 4, 8 and 16 and see what makes your job run faster.

--
Ruslan Dautkhanov

On Wed, May 27, 2015 at 6:46 PM, Mulugeta Mammo <mulugeta.abe...@gmail.com> wrote:

My executor has the following spec (lscpu):

CPU(s): 16
Core(s) per socket: 4
Socket(s): 2
Thread(s) per core: 2

The CPU count is obviously 4*2*2 = 16. My question is: what value is Spark expecting in SPARK_EXECUTOR_CORES? The CPU count (16) or the total number of cores (4 * 2 = 8)?

Thanks
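
To make the arithmetic in this thread concrete, here is a minimal sketch that derives the physical core count and logical CPU count from the lscpu figures above, then applies the per-task memory formula quoted from the Cloudera blog to the suggested core settings. The 8192 MB executor memory and all function and variable names are assumptions chosen for illustration; they do not come from the thread.

```python
# Figures taken from the lscpu output quoted in the thread.
SOCKETS = 2
CORES_PER_SOCKET = 4
THREADS_PER_CORE = 2

physical_cores = SOCKETS * CORES_PER_SOCKET        # cores across both sockets
logical_cpus = physical_cores * THREADS_PER_CORE   # what lscpu reports as CPU(s)


def shuffle_memory_per_task_mb(executor_memory_mb, executor_cores,
                               memory_fraction=0.2, safety_fraction=0.8):
    """Per-task memory following the formula quoted from the blog post.

    The defaults mirror the 0.2 and 0.8 values stated in the thread for
    spark.shuffle.memoryFraction and spark.shuffle.safetyFraction.
    """
    return (executor_memory_mb * memory_fraction * safety_fraction
            / executor_cores)


# ASSUMPTION: 8192 MB of executor memory is an illustrative value only.
EXECUTOR_MEMORY_MB = 8192

print(f"physical cores={physical_cores}, logical CPUs={logical_cpus}")
for cores in (2, 4, 8, 16):
    per_task = shuffle_memory_per_task_mb(EXECUTOR_MEMORY_MB, cores)
    print(f"executor cores={cores:2d} -> about {per_task:.1f} MB per task")
```

Doubling the cores setting halves the memory each task gets under this formula, which is one reason to benchmark several values rather than simply using the largest CPU count the machine reports.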