Dear Rohith and Naga,

Thank you very much for your quick responses; your information has proven very useful.
Cheers,
George

On 7 April 2015 at 07:08, Naganarasimha G R (Naga) <garlanaganarasi...@huawei.com> wrote:

> Hi George,
>
> The current YARN implementation uses cgroups for CPU isolation, but not by
> pinning containers to specific cores (cgroup cpusets); instead it works on
> CPU cycles (quota and period). The admin is given an option to specify what
> percentage of CPU may be used by YARN containers, and YARN takes care of
> configuring the cgroup quota and period files so that only the configured
> CPU percentage is used by YARN containers.
>
> Is there any particular need to pin the MR tasks to specific cores, or do
> you just want to ensure YARN is not using more than the specified
> percentage of CPU on a given node?
>
> Regards,
> Naga
>
> ------------------------------
> *From:* Rohith Sharma K S [rohithsharm...@huawei.com]
> *Sent:* Tuesday, April 07, 2015 09:23
> *To:* user@hadoop.apache.org
> *Subject:* RE: Pin Map/Reduce tasks to specific cores
>
> Hi George,
>
> In MRv2, YARN supports a cgroups implementation. Using cgroups, it is
> possible to run containers on specific cores.
>
> For detailed reference, some useful links:
>
> http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_system-admin-guide/content/ch_cgroups.html
> http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/
> http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/
>
> P.S.: I could not find any related document in the Hadoop YARN docs. I
> will raise a ticket for this in the community.
>
> Hope the above information helps your use case!
>
> Thanks & Regards,
> Rohith Sharma K S
>
> *From:* George Ioannidis [mailto:giorgio...@gmail.com]
> *Sent:* 07 April 2015 01:55
> *To:* user@hadoop.apache.org
> *Subject:* Pin Map/Reduce tasks to specific cores
>
> Hello.
> My question, which can be found on *Stack Overflow
> <http://stackoverflow.com/questions/29283213/core-affinity-of-map-tasks-in-hadoop>*
> as well, concerns pinning map/reduce tasks to specific cores, on either
> Hadoop v1.2.1 or Hadoop v2.
>
> Specifically, I would like to know whether the end user has any control
> over which core executes a specific map/reduce task.
>
> To pin an application on Linux there is the "taskset" command, but does
> Hadoop provide anything similar? If not, is the Linux scheduler in charge
> of allocating tasks to specific cores?
>
> ------------------
>
> Below are two cases to better illustrate my question:
>
> *Case #1:* 2 GiB input size, an HDFS block size of 64 MiB, and 2 compute
> nodes available, with 32 cores each.
>
> As a result, 32 map tasks will be launched; let's suppose that
> mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be
> allocated to each node.
>
> Can I guarantee that each map task will run on a specific core, or is it
> up to the Linux scheduler?
>
> ------------------
>
> *Case #2:* The same as case #1, but now the input size is 8 GiB, so there
> are not enough slots for all 128 map tasks, and multiple tasks will share
> the same cores.
>
> Can I control how much "time" each task spends on a specific core, and
> whether it will be reassigned to the same core in the future?
>
> Any information on the above would be highly appreciated.
>
> Kind Regards,
> George
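[Editor's note] The percentage-based CPU isolation Naga describes is driven by a handful of NodeManager properties in yarn-site.xml. A minimal sketch follows; exact property names and availability vary across 2.x releases (in particular, the percentage limit arrived in later 2.x versions), so treat this as illustrative rather than a definitive configuration:

```xml
<!-- yarn-site.xml (sketch; verify property names against your Hadoop version) -->
<property>
  <!-- Required for cgroups: containers must be launched by the Linux executor -->
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- Delegate container resource enforcement to cgroups -->
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <!-- Cgroup hierarchy under which per-container cgroups are created -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>
<property>
  <!-- The "percentage of CPU for YARN" knob Naga mentions (later 2.x releases) -->
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>
```

Note that this still enforces isolation via CPU quota/period, not cpusets, so it caps aggregate CPU usage rather than pinning containers to particular cores.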
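[Editor's note] The `taskset` command George mentions is a thin wrapper around the Linux `sched_setaffinity(2)` syscall, which Python's standard library exposes directly. The sketch below shows what "pinning to a core" means at the OS level; it is Linux-only and is an OS facility, not something Hadoop or YARN configures per map/reduce task:

```python
import os

# sched_setaffinity restricts which CPU cores the kernel may schedule a
# process on; PID 0 means "the calling process". This is exactly what
# `taskset -c 0 <cmd>` does before exec'ing <cmd>.
os.sched_setaffinity(0, {0})

# The kernel will now run this process only on core 0.
print(os.sched_getaffinity(0))  # {0}
```

In principle an admin could wrap container launches in such a call (or in `taskset`) outside of Hadoop, but YARN itself exposes no per-task core-affinity setting.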