> I am running a Spark job with 20 cores, but I do not understand why my application gets 1-2 cores on each of several machines. Why does it not just run on two nodes, e.g. node1 = 16 cores and node2 = 4 cores? Instead, cores are allocated like node1 = 2, node2 = 1, ..., node14 = 1.
I believe that's the intended behavior for Spark. Please refer to
https://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts
and the section on 'spark.deploy.spreadOut'. If I understand correctly, you may
want "spark.deploy.spreadOut false". Hope it helps! Happy Spark(ing).

On Thu, Jul 25, 2019 at 7:22 PM Srikanth Sriram <sriramsrikanth1...@gmail.com> wrote:

> Hello,
>
> Below is my understanding.
>
> These default configuration parameters are used by a Spark job when they
> are not set to the required values at the time the job is submitted:
>
> # - SPARK_EXECUTOR_INSTANCES, number of workers to start (default: 2)
> # - SPARK_EXECUTOR_CORES, number of cores for the workers (default: 1)
> # - SPARK_EXECUTOR_MEMORY, memory per worker, e.g. 1000M, 2G (default: 1G)
>
> SPARK_EXECUTOR_INSTANCES indicates the number of workers to be started;
> it is the maximum number of executors a job can ask for / take from the
> cluster resource manager.
>
> SPARK_EXECUTOR_CORES indicates the number of cores in each executor; the
> Spark TaskScheduler will ask for this many cores to be allocated/blocked
> on each executor machine.
>
> SPARK_EXECUTOR_MEMORY indicates the maximum amount of RAM/memory required
> in each executor.
>
> All these details are requested by the TaskScheduler from the cluster
> manager (Spark standalone, YARN, Mesos, or Kubernetes, which is supported
> starting from Spark 2.3) before the job execution actually starts.
>
> Also, please note that the initial number of executor instances depends on
> "--num-executors", but when there is more data to be processed and
> "spark.dynamicAllocation.enabled" is set to true, more executors are added
> dynamically based on "spark.dynamicAllocation.initialExecutors".
>
> Note: "spark.dynamicAllocation.initialExecutors" should always be
> configured greater than "--num-executors".
> spark.dynamicAllocation.initialExecutors  (default: spark.dynamicAllocation.minExecutors)
>     Initial number of executors to run if dynamic allocation is enabled.
>     If `--num-executors` (or `spark.executor.instances`) is set and larger
>     than this value, it will be used as the initial number of executors.
>
> spark.executor.memory  (default: 1g)
>     Amount of memory to use per executor process, in the same format as
>     JVM memory strings with a size unit suffix ("k", "m", "g" or "t")
>     (e.g. 512m, 2g).
>
> spark.executor.cores  (default: 1 in YARN mode; all the available cores on
>     the worker in standalone and Mesos coarse-grained modes)
>     The number of cores to use on each executor. In standalone and Mesos
>     coarse-grained modes, for more detail see
>     <http://spark.apache.org/docs/latest/spark-standalone.html#Executors%20Scheduling>.
>
> On Thu, Jul 25, 2019 at 5:54 PM Amit Sharma <resolve...@gmail.com> wrote:
>
>> I have a cluster with 26 nodes, each having 16 cores. I am running a
>> Spark job with 20 cores, but I do not understand why my application gets
>> 1-2 cores on each of several machines. Why does it not just run on two
>> nodes, e.g. node1 = 16 cores and node2 = 4 cores? Instead, cores are
>> allocated like node1 = 2, node2 = 1, ..., node14 = 1. Is there any conf
>> property I need to change? I know that with dynamic allocation I can use
>> the setting below, but without dynamic allocation is there anything?
>>
>> --conf "spark.dynamicAllocation.maxExecutors=2"
>>
>> Thanks
>> Amit
>
>
> --
> Regards,
> Srikanth Sriram
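Putting the thread's suggestions together, here is a minimal sketch of how the two knobs might be combined on a standalone cluster. This is not from the thread itself: the master hostname, memory size, application class, and jar name are placeholders, and the exact core split (two 10-core executors) assumes 16-core workers like Amit's.

```shell
# 1. On the standalone master, disable spread-out scheduling so the master
#    packs an application's cores onto as few workers as possible.
#    Set this in conf/spark-env.sh, then restart the master to apply it:
export SPARK_MASTER_OPTS="-Dspark.deploy.spreadOut=false"
./sbin/stop-master.sh && ./sbin/start-master.sh

# 2. Without dynamic allocation, cap the application's total cores and set
#    the cores per executor, so 20 cores become two 10-core executors
#    instead of one 1-core executor on each of 14+ workers:
./bin/spark-submit \
  --master spark://master-host:7077 \
  --total-executor-cores 20 \
  --executor-cores 10 \
  --executor-memory 4g \
  --class com.example.MyApp \
  myapp.jar
```

Without `--executor-cores`, standalone mode gives an executor all available cores on a worker, so setting it explicitly alongside `--total-executor-cores` is what pins down how many executors the master creates.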