I have looked into this further, and it seems that two parameters control the "bin packing" behaviour. According to https://hadoop.apache.org/docs/r2.9.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Other_Properties, yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled defaults to true, and with yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments set to -1 (no limit) the scheduler can assign, in a single heartbeat, all the containers that a NodeManager reports it can run. That could explain the bin packing onto the first NodeManager that answers with a heartbeat message.

Following Apache's instructions, I have added the following to my *capacity-scheduler.xml* in the hadoop/etc/hadoop folder:

<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
  <value>true</value>
  <description>
    Whether to allow multiple container assignments in one NodeManager heartbeat. Defaults to true.
  </description>
</property>
<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments</name>
  <value>2</value>
  <description>
    If multiple-assignments-enabled is true, the maximum amount of containers that can be assigned in one NodeManager heartbeat. Defaults to -1, which sets no limit.
  </description>
</property>

I have checked the configuration file and confirmed that I am using the Capacity Scheduler (I set yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled explicitly just to be sure). However, after running "yarn rmadmin -refreshQueues" I have not seen any change in the allocation of either the mappers or the reducers.

hadoop2@master:~$ yarn rmadmin -refreshQueues
19/01/10 16:06:33 INFO client.RMProxy: Connecting to ResourceManager at master/172.31.24.83:8033

What am I missing here?

Or

On Wed, Jan 9, 2019 at 23:57, Or Raz <r...@post.bgu.ac.il> wrote:

> Thanks for the tips!
> Because I have (on purpose) not set any scheduler for YARN, I am using
> the default one (Capacity).
> I have looked in yarn-site.xml and in the configuration tab (using the
> JobHistory UI), and neither of the parameters that you mentioned was
> there (so they have not been set).
> You said that I should look at "locality settings"; can you be more
> specific about what to look at and where?
> Also, it is worth mentioning that I am using three computers and the
> replication factor (of HDFS) is three too. Thus, all the data (even the
> input) is on every computer, and the memory of each computer is the same
> (two t2.xlarge and one m4.xlarge), while I am
> using DefaultResourceCalculator.
>
> Or
>
> On Wed, Jan 9, 2019 at 23:28, Aaron Eng <a...@mapr.com> wrote:
>
>> The settings are very relevant to having an equal number of containers
>> running on each node if you have an idle cluster and want to distribute
>> containers for a single job. An application master submits requests for
>> container allocations to the ResourceManager. The MRAppMaster will request
>> all the map containers at once, and the FairScheduler will find NodeManagers
>> with capacity to fulfill the container requests. If assign multiple is
>> enabled then you generally won't get an even number of containers (+/- 1
>> container) assigned to each node. Before you say it's not relevant, you
>> should check if your environment uses the FairScheduler and whether multiple
>> assignment is enabled. If so, that's likely why there isn't an even
>> assignment +/- 1 container. If you are not using the FairScheduler and/or
>> multiple assign, then you should look at locality settings, which can cause
>> containers to be preferentially run on a subset of nodes, resulting in an
>> uneven container assignment per node.
>>
>> On Wed, Jan 9, 2019 at 2:19 PM Or Raz <r...@post.bgu.ac.il> wrote:
>>
>>> As far as I know, the scheduler in YARN only schedules the jobs and
>>> not the containers inside each job. Therefore, I don't believe it is
>>> relevant.
>>> Also, I haven't used or set those two parameters, and I haven't picked
>>> or set any particular scheduler for my research (Fair, FIFO or Capacity).
>>> Please correct me if I am wrong.
>>> P.S. Currently I have no interest in the situation where I run a few jobs
>>> concurrently; my case is much simpler, with one job for which I would like
>>> the allocation of containers to be more balanced...
>>> Or
>>>
>>>
>>> On Wed, Jan 9, 2019 at 19:11, Aaron Eng <a...@mapr.com> wrote:
>>>
>>>> Have you checked the yarn.scheduler.fair.assignmultiple
>>>> and yarn.scheduler.fair.max.assign parameters for the ResourceManager
>>>> configuration?
>>>>
>>>> On Wed, Jan 9, 2019 at 9:49 AM Or Raz <r...@post.bgu.ac.il> wrote:
>>>>
>>>>> How can I change/suggest a different allocation of containers to tasks
>>>>> in Hadoop? Regarding a native Hadoop (2.9.1) cluster on AWS.
>>>>>
>>>>> I am running a native Hadoop cluster (2.9.1) on AWS (with EC2, not
>>>>> EMR) and I want the scheduling/allocation of the containers
>>>>> (mappers/reducers) to be more balanced than it is currently. It seems
>>>>> like the RM is assigning the mappers in a bin packing way (where the data
>>>>> resides), while for the reducers it looks more balanced. My setup includes
>>>>> three machines with a replication factor of three (all the data is on
>>>>> every machine), and I run my jobs with
>>>>> mapreduce.job.reduce.slowstart.completedmaps=0 to start the shuffle as
>>>>> early as possible (it is essential for me that all the containers run
>>>>> concurrently). Also, according to the EC2 instances I have chosen and my
>>>>> settings of the YARN cluster, I can run at most 93 containers (31 on each
>>>>> machine).
>>>>>
>>>>> For example, if I want to have nine reducers, then 93 - 9 - 1 = 83
>>>>> containers are left for the mappers (one is for the AM). I have
>>>>> played with the size of the input split
>>>>> (mapreduce.input.fileinputformat.split.minsize,
>>>>> mapreduce.input.fileinputformat.split.maxsize) to find the right balance
>>>>> where all of the machines have the same "work" for the map phase. But it
>>>>> seems like the first 31 mappers are allocated on one computer, the
>>>>> next 31 on the second one and the last 31 on the last machine. Thus, I can
>>>>> try to use 87 mappers, with 31 of them on Machine #1, another 31 on
>>>>> Machine #2 and another 25 on Machine #3, with the rest left for the
>>>>> reducers; since Machine #1 and Machine #2 are fully occupied, the reducers
>>>>> would have to be placed on Machine #3.
>>>>> This way I get an almost balanced allocation of mappers at the expense
>>>>> of an unbalanced allocation of reducers, and this is not what I want...
>>>>>
>>>>> # of mappers = size_input / split size [Bytes]
>>>>>
>>>>> split size = max(mapreduce.input.fileinputformat.split.minsize,
>>>>>                  min(mapreduce.input.fileinputformat.split.maxsize,
>>>>>                      dfs.blocksize))
>>>>>
>>>>
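
For anyone who wants to check the arithmetic from the original question, here is a minimal Python sketch of the two formulas and the container budget. The function names (split_size, num_mappers, mapper_budget) are made up for illustration and are not Hadoop APIs, and the arguments simply stand in for the configuration values named in the formula. The mapper count is rounded up and ignores per-file boundaries, so it is only an approximation of what the input format really does.

```python
MB = 1024 * 1024


def split_size(min_size, max_size, block_size):
    """split size = max(minsize, min(maxsize, blocksize)), all in bytes."""
    return max(min_size, min(max_size, block_size))


def num_mappers(input_size, min_size, max_size, block_size):
    """Approximate # of mappers = input size / split size, rounded up.

    Assumption: the input is treated as one contiguous file.
    """
    size = split_size(min_size, max_size, block_size)
    return -(-input_size // size)  # integer ceiling division


def mapper_budget(total_containers, reducers, app_masters=1):
    """Containers left for mappers once reducers and the AM are placed."""
    return total_containers - reducers - app_masters


if __name__ == "__main__":
    # Figures from the question: 93 containers, nine reducers, one AM.
    print(mapper_budget(93, 9))
    # Hypothetical input: 10 GiB with a 128 MiB block size.
    print(num_mappers(10 * 1024 * MB, 1, 256 * MB, 128 * MB))
```

Comparing num_mappers(...) with mapper_budget(...) shows whether a given split size keeps all the mappers within one wave of containers. It says nothing about which node each container lands on.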