Thanks for the response. I was just wondering why some of the design decisions in MRv2 were made the way they were.
> No, the OR condition is implied by the hierarchy of requests (node, rack,
> *).

If InputSplit1 is on Node11 and Node12, and InputSplit2 is on Node21 and
Node22, then the AM can ask for one container on each of the nodes and set
* to 2 for the map tasks. The RM can then return 2 containers on Node11 and
make * 0. Data locality is lost for InputSplit2, or else the AM has to make
another call to the RM, releasing one of the containers and asking for
another one. A somewhat more expressive request that can specify the
dependencies might be more effective.

> NM doesn't make any 'out' calls to anyone but the RM, else it would be a
> severe scalability bottleneck.

There is already a one-way communication from the AM to the NM for
launching the containers. The response from the NM could hold the list of
containers completed since the previous call.

> All interactions (RPCs) are authenticated. Also, there is a container
> token provided by the RM (during allocation) which is verified by the NM
> during container launch.

So, does a shared key have to be deployed manually on all the nodes for the
NM?

Regards,
Praveen

On Sun, Jan 8, 2012 at 12:08 AM, Arun C Murthy <a...@hortonworks.com> wrote:

>
> On Jan 5, 2012, at 8:29 AM, Praveen Sripati wrote:
>
> Hi,
>
> I have been going through the MRv2 documentation and have the following
> queries.
>
> 1) Let's say that an InputSplit is on Node1 and Node2.
>
> Can the ApplicationMaster ask the ResourceManager for a container either
> on Node1 or Node2, with an OR condition?
>
>
> No, the OR condition is implied by the hierarchy of requests (node, rack,
> *).
>
> In this case, assuming the topology is node1/rack1, node2/rack1, the
> requests would be:
>
> node1 -> 1
> node2 -> 1
> rack1 -> 1
> * -> 1
>
> OTOH, if the topology is node1/rack1, node2/rack2, the requests would be:
>
> node1 -> 1
> node2 -> 1
> rack1 -> 1
> rack2 -> 1
> * -> 1
>
> In both cases, * would limit the #allocated-containers to '1'.
>
> In the first case, rack1 itself (independent of *) would limit the
> #allocated-containers to 1.
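The (node, rack, *) hierarchy above can be sketched in code. This is an
illustrative simulation only (the function and the scheduler loop are
invented for the example, not the YARN scheduler code); it shows how
granting a node-local container decrements the matching rack and '*'
counts, so that '*' bounds the total number of allocated containers:

```python
# Illustrative sketch of hierarchical resource requests (node, rack, *).
# Not the YARN API: allocate() and the data shapes are invented here.

def allocate(requests, available_nodes, topology):
    """Greedily grant node-local containers; each grant also decrements
    the node's rack count and the '*' count, so '*' caps the total."""
    granted = []
    for node in available_nodes:
        rack = topology[node]
        if (requests.get(node, 0) > 0
                and requests.get(rack, 0) > 0
                and requests.get("*", 0) > 0):
            requests[node] -= 1
            requests[rack] -= 1
            requests["*"] -= 1
            granted.append(node)
    return granted

# First topology from the mail: both nodes on rack1, * capped at 1.
topology = {"node1": "rack1", "node2": "rack1"}
requests = {"node1": 1, "node2": 1, "rack1": 1, "*": 1}
print(allocate(requests, ["node1", "node2"], topology))  # ['node1']
```

With * -> 1, only one of the two node-local requests can be satisfied,
which is exactly the limit described above.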
>
> More details here:
> http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/
>
> I'll work on incorporating this into our docs on hadoop.apache.org.
>
> 2)
> > The Scheduler receives periodic information about the resource usages
> > on allocated resources from the NodeManagers. The Scheduler also makes
> > available the status of completed Containers to the appropriate
> > ApplicationMaster.
>
> What's the use of the NM sending the resource usages to the scheduler?
>
> Why can't the NM directly talk to the AM about the completed containers?
> Does any information pass from the NM to the AM?
>
>
> The NM sends resource usages to the scheduler to allow it to track
> resource utilization on each node and, in future, make smarter decisions
> about allocating extra containers on under-utilized nodes, etc.
>
> The NM doesn't make any 'out' calls to anyone but the RM, else it would
> be a severe scalability bottleneck.
>
> 3)
> > The Map-Reduce ApplicationMaster has the following components:
> >
> > TaskUmbilical – The component responsible for receiving heartbeats and
> > status updates from the map and reduce tasks.
>
> Does the communication happen directly between the container and the AM?
> If yes, the task completion status could also be sent from the container
> to the AM.
>
>
> Yes, it already is sent to the AM.
>
> 4)
> > The Hadoop Map-Reduce JobClient polls the ASM to obtain information
> > about the MR AM and then directly talks to the AM for status, counters,
> > etc.
>
> Once the job is completed, the AM goes down; what happens to the Counters?
> What is the flow of the Counters (Container -> NM -> AM)?
>
>
> Once a job completes, the Counters etc. are stored in a JobHistory file
> (one per job) which is served up, if necessary, by the JobHistoryServer.
>
> 5) If a new YARN application is created, how can the NM trust the request
> from the AM?
>
>
> All interactions (RPCs) are authenticated.
> Also, there is a container
> token provided by the RM (during allocation) which is verified by the NM
> during container launch.
>
> 6)
> > MapReduce NextGen uses wire-compatible protocols to allow different
> > versions of servers and clients to communicate with each other.
>
> What is meant by `wire-compatible protocols` and how are they
> implemented?
>
>
> We use PB (Protocol Buffers) everywhere.
>
> 7)
> > The computation framework (ResourceManager and NodeManager) is
> > completely generic and is free of MapReduce specificities.
>
> Is this the reason for adding auxiliary services for shuffling to the NM?
>
>
> Yes.
>
> hth,
> Arun
>
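On question (7): in released Hadoop 2.x versions the MapReduce shuffle is
plugged into the generic NM as an auxiliary service via yarn-site.xml. A
typical snippet (property names as in later 2.x releases, where the service
is called mapreduce_shuffle; earlier alphas used mapreduce.shuffle):

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

Because the shuffle is just one registered auxiliary service, the NM itself
stays free of MapReduce specifics, as stated in the answer.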
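Back on question (2) above: since the NM only talks to the RM, the AM
learns about completed containers from the RM itself, typically as part of
its periodic allocate heartbeat. A minimal sketch of that flow (the classes
and method names here are invented for illustration, not the actual YARN
protocol types):

```python
# Illustrative sketch: NMs report completed containers to the RM on their
# heartbeats; the AM's next allocate() call to the RM picks them up.
# ResourceManager here is a toy stand-in, not a Hadoop class.

class ResourceManager:
    def __init__(self):
        self.pending_completed = []  # accumulated from NM heartbeats

    def nm_heartbeat(self, completed_ids):
        """An NM reports containers that finished since its last heartbeat."""
        self.pending_completed.extend(completed_ids)

    def allocate(self, new_requests):
        """The AM's heartbeat: returns newly granted containers plus the
        statuses of containers completed since the AM's previous call."""
        completed, self.pending_completed = self.pending_completed, []
        return {"allocated": new_requests, "completed": completed}

rm = ResourceManager()
rm.nm_heartbeat(["container_01"])   # NM tells the RM a container finished
resp = rm.allocate([])              # AM heartbeat collects the status
print(resp["completed"])            # ['container_01']
```

This keeps the NM's outbound traffic limited to a single channel (to the
RM), which is the scalability point made in the answer to (2).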