Hi Mohit, answers inline.
On Fri, Sep 20, 2013 at 1:33 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> I am going through the concepts of resource manager, application master
> and node manager. As I understand resource manager receives the job
> submission and launches application master. It also launches node manager
> to monitor application master. My questions are:
>
> 1. Is Node manager long lived and that one node manager monitors all the
> containers launched on the data nodes?
>

Correct

> 2. How is resource negotiation done between the application master and the
> resource manager? In other words what happens during this step? Does
> resource manager looks at the active and pending tasks and resources
> consumed by those before giving containers to the application master?
>

The ResourceManager contains a pluggable scheduler that is responsible for
deciding which applications to give resources to when they become
available. When a NodeManager heartbeats to the ResourceManager, the
scheduler will decide whether there are any containers it should place on
that node for an application, and will let the Application Master know
about its decision on the next AM-RM heartbeat.

Here's documentation for the two recommended schedulers:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

> 3. As it happens in old map reduce cluster that task trackers sends
> periodic heartbeats to the job tracker nodes. How does this compare to
> YARN? It looks like application master is a task tracker? Little confused
> here.
>

The analog to this is the NodeManager sending periodic heartbeats to the
ResourceManager. The Application Master also sends periodic heartbeats to
the NodeManagers that its containers are running on to check on their
status.

> 4. It looks like client polls application master to get the progress of
> the job but initially client connects to the resource manager. How does
> client gets reference to the application master? Does it mean that client
> gets the node ip/port from resource manager where application master was
> launched by the resource manager?
>

Correct
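
To make the flow in (2) a bit more concrete, here is a toy Python model of it. This is only a sketch and not YARN code: the class ToyResourceManager, its methods, the "free_slots" count and the FIFO ordering are all made up for illustration (the real FairScheduler and CapacityScheduler apply much richer policies). It shows just the two points above: containers get placed when a NodeManager heartbeats in, and the Application Master only learns about them on its own next heartbeat.

```python
from collections import defaultdict, deque


class ToyResourceManager:
    """Hypothetical stand-in for the RM plus a trivial FIFO scheduler."""

    def __init__(self):
        # FIFO queue of outstanding requests: [app_id, containers_still_wanted]
        self.pending = deque()
        # app_id -> nodes where containers were placed but the AM has not
        # been told about yet
        self.unreported = defaultdict(list)

    def am_request(self, app_id, count):
        """An Application Master asks for `count` containers."""
        self.pending.append([app_id, count])

    def nm_heartbeat(self, node, free_slots):
        """A NodeManager heartbeats in; place containers on it while it has room."""
        while free_slots > 0 and self.pending:
            request = self.pending[0]
            self.unreported[request[0]].append(node)
            request[1] -= 1
            free_slots -= 1
            if request[1] == 0:
                self.pending.popleft()
        return free_slots

    def am_heartbeat(self, app_id):
        """The AM heartbeats in and picks up whatever was placed since last time."""
        return self.unreported.pop(app_id, [])


rm = ToyResourceManager()
rm.am_request("app_1", 3)
rm.am_request("app_2", 1)

rm.nm_heartbeat("node_a", free_slots=2)   # two of app_1's containers land here
first = rm.am_heartbeat("app_1")          # app_1's AM hears about them now

rm.nm_heartbeat("node_b", free_slots=2)   # app_1's last one, then app_2's
second = rm.am_heartbeat("app_1")
third = rm.am_heartbeat("app_2")

print(first, second, third)
```

Note that nothing is handed out at the moment the AM asks. The request just sits in the queue until a node with free capacity heartbeats in.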