That's correct. The previous behavior of the MR AM was tightly coupled to the scheduler implementation and thus fragile. The RM is supposed to never give a container smaller than what was requested, because that would be incorrect. It can always give a container larger than requested based on its internal heuristics, though ideally it would not, to avoid internal fragmentation.
Alejandro, after MAPREDUCE-5310 did we check that the MR AM works correctly after making the M/R memory different from the normalized values?

Bikas

-----Original Message-----
From: Alejandro Abdelnur [mailto:[email protected]]
Sent: Tuesday, June 18, 2013 10:59 AM
To: [email protected]
Subject: Re: Container size configuration

Bobby,

With MAPREDUCE-5310 we removed normalization of the resource request on the MR AM side. This was done because the normalization is an implementation detail of the RM scheduler. IMO, if this is a problem for the MR AM as you suggest, then we should fix the MR AM logic. Note this may happen only when the MR job specifies memory requirements for its tasks that do not match their normalized values.

Thanks.

On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <[email protected]> wrote:
> Even returning an oversized container can be very confusing for an
> application. The MR AM will not handle it correctly. If it sees a
> container returned that does not match exactly the priority and size
> it expects, I believe that container is thrown away. We had deadlocks
> in the past where it somehow used a reducer container for a mapper and
> then never updated the reducer count to request a new one. It is best
> for now not to mix the two, and we need to lock down/fix the semantics
> of what happens in those situations for a scheduler.
>
> --Bobby
>
> On 6/18/13 12:13 AM, "Bikas Saha" <[email protected]> wrote:
>
> >I think the API allows different size requests at the same priority.
> >The implementation of the scheduler drops the size information and
> >uses the last value set. We should probably at least change it to use
> >the largest value requested, so that users don't get containers that
> >are too small for them. YARN-847 tracks this.
> >
> >Bikas
> >
> >-----Original Message-----
> >From: Robert Evans [mailto:[email protected]]
> >Sent: Friday, June 14, 2013 7:09 AM
> >To: [email protected]
> >Subject: Re: Container size configuration
> >
> >Is this specifically for YARN? If so, yes you can do this; MR does
> >this for maps vs. reduces. The API right now requires that the
> >different sized containers have a different priority.
> >
> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> >
> >shows how to make a ResourceRequest. It also shows how to make an
> >AllocateRequest. If you put multiple ResourceRequests into the
> >AllocateRequest it will allocate both of them. But remember that the
> >priority needs to be different, and the priority determines the order
> >in which the containers will be allocated to your application.
> >
> >--Bobby
> >
> >On 6/13/13 10:41 AM, "Yuzhang Han" <[email protected]> wrote:
> >
> >>Hi,
> >>
> >>I am wondering if I can allocate different sizes of containers to the
> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
> >>Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
> >>
> >>Yuzhang

--
Alejandro
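
For concreteness, below is a minimal sketch of the approach Bobby describes: an application master asking for containers of two different sizes by putting each size on its own priority. It assumes the Hadoop 2.x AMRMClient API; the class name, memory sizes, and priority values are illustrative and not taken from the thread.

// Sketch only, assuming the Hadoop 2.x AMRMClient API; values are illustrative.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TwoSizeRequestSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();

    // Runs inside an application master that the RM has already launched.
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(conf);
    rmClient.start();
    rmClient.registerApplicationMaster("", 0, "");

    // Two 1024 MB containers (e.g. Task1 and Task2) at priority 1.
    Resource small = Resource.newInstance(1024, 1);
    Priority p1 = Priority.newInstance(1);
    rmClient.addContainerRequest(new ContainerRequest(small, null, null, p1));
    rmClient.addContainerRequest(new ContainerRequest(small, null, null, p1));

    // One 2048 MB container (e.g. Task3) on a *different* priority, since the
    // scheduler keeps only one size per priority.
    Resource large = Resource.newInstance(2048, 1);
    Priority p2 = Priority.newInstance(2);
    rmClient.addContainerRequest(new ContainerRequest(large, null, null, p2));

    // Heartbeat to the RM; allocated containers come back in the response.
    // The RM may round a request up to a multiple of
    // yarn.scheduler.minimum-allocation-mb, so a container can be larger
    // than asked for, never smaller.
    AllocateResponse response = rmClient.allocate(0.0f);
    for (Container c : response.getAllocatedContainers()) {
      System.out.println("Got container " + c.getId()
          + " size " + c.getResource() + " priority " + c.getPriority());
    }

    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    rmClient.stop();
  }
}

This mirrors what the MR AM does for maps vs. reduces: each container size lives on its own priority, so it is unambiguous which request an allocated container satisfies, and it avoids the same-priority ambiguity tracked in YARN-847.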
