Yeah, for some reason (unknown to me, but you can find discussions on the AWS forums) they double the actual number of cores for the NodeManagers.
I assume that's done to maximize utilization, but it doesn't really matter to me, at least, since I only run Spark. So I personally set the executor cores to the total number of cores minus 1 (or 2), saving a core for the OS/DataNode/NodeManager, because Spark itself can create a significant load.

On Mon, Feb 26, 2018 at 4:51 PM, Selvam Raman <sel...@gmail.com> wrote:

> Thanks. That makes sense.
>
> I want to know one more thing: the available vcores per machine is 16, but
> there are only 8 threads per node. Am I missing how to relate the two?
>
> What I'm thinking now is: number of vcores = number of threads.
>
>
> On Mon, 26 Feb 2018 at 18:45, Vadim Semenov <va...@datadoghq.com> wrote:
>
>> Used cores aren't reported correctly in EMR, and YARN itself has no
>> control over it, so whatever you put in `spark.executor.cores` will be
>> used, but in the ResourceManager you will only see 1 vcore used per
>> NodeManager.
>>
>> On Mon, Feb 26, 2018 at 5:20 AM, Selvam Raman <sel...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> spark version - 2.0.0
>>> spark distribution - EMR 5.0.0
>>>
>>> Spark Cluster - one master, 5 slaves
>>>
>>> Master node - m3.xlarge - 8 vCores, 15 GiB memory, 80 GB SSD storage
>>> Slave node - m3.2xlarge - 16 vCores, 30 GiB memory, 160 GB SSD storage
>>>
>>>
>>> Cluster Metrics
>>> Apps Submitted: 16, Apps Pending: 0, Apps Running: 1, Apps Completed: 15
>>> Containers Running: 5
>>> Memory Used: 88.88 GB, Memory Total: 90.50 GB, Memory Reserved: 22 GB
>>> VCores Used: 5, VCores Total: 79, VCores Reserved: 1
>>> Active Nodes: 5, Decommissioning Nodes: 0, Decommissioned Nodes: 0,
>>> Lost Nodes: 5, Unhealthy Nodes: 0, Rebooted Nodes: 0
>>>
>>> I have submitted the job with the below configuration:
>>> --num-executors 5 --executor-cores 10 --executor-memory 20g
>>>
>>> spark.task.cpus - by default 1
>>>
>>> My understanding is that there will be 5 executors, each able to run 10
>>> tasks at a time, with the tasks sharing the executor's total memory of
>>> 20g. Here, I can see only 5 vcores used, which means one executor
>>> instance uses 20g + 10% overhead of RAM (22 GB), 10 cores (number of
>>> threads), and 1 vcore (CPU).
>>>
>>> Please correct me if my understanding is wrong.
>>>
>>> How can I utilize the vcores in EMR effectively? Will more vcores boost
>>> performance?
>>>
>>>
>>> --
>>> Selvam Raman
>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>
>>
>>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>
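
To put the sizing rule from the reply at the top into concrete terms, here is a rough sketch of a submission for the cluster described above (5 m3.2xlarge slaves, 8 hardware threads and 30 GiB RAM each, 16 vcores reported by YARN). The specific numbers and the application jar name are illustrative assumptions, not settings taken from the thread:

  # One executor per slave; leave one of the 8 hardware threads for the
  # OS/DataNode/NodeManager. --executor-memory 20g becomes a ~22g YARN
  # container once the default ~10% spark.yarn.executor.memoryOverhead
  # is added, which matches the 22 GB figure Selvam worked out above.
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 5 \
    --executor-cores 7 \
    --executor-memory 20g \
    --conf spark.task.cpus=1 \
    your-application.jar

With spark.task.cpus left at its default of 1, each executor would then run up to 7 tasks concurrently, even though the EMR ResourceManager will still show only 1 vcore used per NodeManager.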