Re: question about when do resource matching in YARN

2013-09-24 Thread Steve Loughran
On 21 September 2013 09:19, Sandy Ryza sandy.r...@cloudera.com wrote:

 I don't believe there is any reason scheduling decisions need to be coupled
 with NodeManager heartbeats.  It doesn't sidestep any race conditions
 because a NodeManager could die immediately after heartbeating.


Historically it's been done for scale: you don't need the JT (JobTracker)
reaching out to 4K TTs (TaskTrackers) just to give them work to do; instead,
let them connect in anyway and get work that way. Once they start reporting
completion, they can be given more work. The design is strongly biased
towards "worker nodes talk to the master" over "master approaches workers".

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: question about when do resource matching in YARN

2013-09-24 Thread Harsh J
Yes, but I don't think the heartbeat coupling is necessary. One could
even use a ZK (ZooKeeper) write/watch approach for faster assignment of
regular work?




-- 
Harsh J


Re: question about when do resource matching in YARN

2013-09-24 Thread Sandy Ryza
How would the ZK approach make things faster?  Are you saying the AMs would
do the watching?  Currently, container assignments aren't actually sent to
the NodeManagers on heartbeats.  The first time an NM hears about a
container is when an AM launches it.
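
To make that hand-off concrete, here is a minimal toy model in Python. It is
only a sketch, not YARN code: the class and method names (ToyNodeManager,
ToyResourceManager, on_heartbeat, allocate, start_container) and the
waiting_app parameter are made up for illustration. It shows an assignment
being decided when a node heartbeats, handed to the AM on the AM's next
allocate call, and becoming visible to the NM only when the AM starts the
container.

```python
class ToyNodeManager:
    """Hypothetical stand-in for an NM; not the real NodeManager API."""

    def __init__(self, name):
        self.name = name
        self.known_containers = set()

    def heartbeat(self):
        # Reports status to the RM; the reply carries no new assignments.
        return {"node": self.name, "running": sorted(self.known_containers)}

    def start_container(self, container_id):
        # Called by the AM: the first time this NM hears about the container.
        self.known_containers.add(container_id)


class ToyResourceManager:
    """Hypothetical stand-in for the RM and its scheduler."""

    def __init__(self):
        self.granted = {}  # app -> list of (container_id, node_name)
        self._next_id = 0

    def on_heartbeat(self, report, waiting_app=None):
        # The scheduler may pick this node for a waiting app, but it only
        # records the decision for the AM; nothing is sent to the NM.
        if waiting_app is not None:
            grant = (self._next_id, report["node"])
            self.granted.setdefault(waiting_app, []).append(grant)
            self._next_id += 1

    def allocate(self, app):
        # The AM's periodic call: hand over whatever was assigned so far.
        return self.granted.pop(app, [])
```

In this model, calling rm.on_heartbeat(nm.heartbeat(), waiting_app="app1")
leaves nm.known_containers empty; the NM only learns of the container after
the AM calls rm.allocate("app1") and then nm.start_container(...) for each
grant.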





Re: question about when do resource matching in YARN

2013-09-21 Thread Sandy Ryza
I don't believe there is any reason scheduling decisions need to be coupled
with NodeManager heartbeats.  It doesn't sidestep any race conditions
because a NodeManager could die immediately after heartbeating.
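
A rough sketch of what decoupled, request-triggered matching could look
like, as a toy Python model rather than anything in YARN. The names
(RequestTriggeredScheduler, NodeView, liveness_timeout) are hypothetical,
and the staleness check is an assumption added for illustration. Heartbeats
only refresh each node's last-known state; matching happens when the
request arrives. As with heartbeat-triggered matching, a node can still die
right after its last heartbeat, so the liveness check narrows the window
but does not close it.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    free_mb: int
    last_heartbeat: float  # time of the most recent heartbeat


class RequestTriggeredScheduler:
    """Toy model: match on the AM's request, using last-known node state."""

    def __init__(self, liveness_timeout=10.0):
        self.nodes = {}
        # Hypothetical parameter: how stale a heartbeat may be before the
        # node is no longer treated as alive.
        self.liveness_timeout = liveness_timeout

    def node_heartbeat(self, name, free_mb, now):
        # Heartbeats only refresh liveness and capacity; they assign nothing.
        self.nodes[name] = NodeView(free_mb, now)

    def allocate(self, mb, now):
        # Matching happens here, at request time.
        for name, view in self.nodes.items():
            alive = now - view.last_heartbeat <= self.liveness_timeout
            if alive and view.free_mb >= mb:
                view.free_mb -= mb
                return name
        return None  # no live node has room; the caller would retry later
```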





Re: question about when do resource matching in YARN

2013-09-20 Thread Xuan Gong
Hey, Wei:
  The nodeHeartBeat lets the RM know that this NM is still alive. We
only assign containers on live NMs. In addition, when the scheduler
receives the nodeHeartBeat, it gets the container statuses (such as
completed or newly launched) from the NM and can use them to update the
resources available on that node.

   You can take a look at the following source code; it should help you
understand better.
1. NodeStatusUpdaterImpl::startStatusUpdater(). This sends out the
nodeHeartbeat.
2. ResourceTrackerService::nodeHeartbeat(). This receives the heartbeat
from the NM and passes it on to RMNodeImpl.
3. RMNodeImpl::StatusUpdateWhenHealthyTransition(). This takes the
heartbeat and updates the node's state locally.
4. CapacityScheduler::nodeUpdate(). This processes the heartbeat info and
potentially assigns containers.
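
The four steps above can be sketched as a toy model. This is Python, not
the actual YARN classes: ToyScheduler, Node, allocate_request and
node_heartbeat are made-up names, the FIFO placement is a simplifying
assumption, and memory is the only resource considered. It only illustrates
the shape of the flow: AM requests are queued, and matching runs when a
node heartbeat arrives.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_mb: int
    running: dict = field(default_factory=dict)  # container id -> size in MB


class ToyScheduler:
    """Toy model of heartbeat-driven allocation; not the YARN API."""

    def __init__(self):
        self.nodes = {}
        self.pending = deque()  # (app, mb) requests queued by AMs
        self.allocated = []     # (container_id, app, node_name, mb)
        self._next_id = 0

    def register(self, node):
        self.nodes[node.name] = node

    def allocate_request(self, app, mb):
        # AM side: the request is only recorded; no matching happens here.
        self.pending.append((app, mb))

    def node_heartbeat(self, node_name, completed=()):
        node = self.nodes[node_name]
        # Apply container status from the heartbeat: release finished ones.
        for cid in completed:
            node.free_mb += node.running.pop(cid)
        # Try to place queued requests on this node, skipping any that
        # do not fit right now.
        still_pending = deque()
        while self.pending:
            app, mb = self.pending.popleft()
            if mb <= node.free_mb:
                cid = self._next_id
                self._next_id += 1
                node.free_mb -= mb
                node.running[cid] = mb
                self.allocated.append((cid, app, node_name, mb))
            else:
                still_pending.append((app, mb))
        self.pending = still_pending
```

With one 2048 MB node and two queued requests (1024 MB and 2048 MB),
nothing is allocated until node_heartbeat() is called; the first heartbeat
places the 1024 MB request, and the 2048 MB request waits until a later
heartbeat reports the first container as completed.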

Thanks

Xuan


On Fri, Sep 20, 2013 at 7:17 AM, wei yan @ Gmail ywsk...@gmail.com wrote:

 Hi, all,

 I have a simple question. Currently in YARN, resource matching is
 triggered by the NodeManager heartbeat. That is, assignContainers() is
 only invoked when a new heartbeat comes in. Why don't we use a
 resource-request-triggered mechanism? That is, when the AM submits an
 allocateRequest, we do the resource matching and assign containers.

 Does anybody have any idea about this?

 thanks,
 Wei



Re: question about when do resource matching in YARN

2013-09-20 Thread Omkar Joshi
Hi Wei,

Yes, there is a clear lag between the AM requesting resources and the
arrival of the NM heartbeats that satisfy the request (which is when we
process the event). Developers in the Tez project
( http://incubator.apache.org/projects/tez.html ) have done some similar
work. You can check it there. I hope it helps.

Thanks,
Omkar Joshi
*Hortonworks Inc.* http://www.hortonworks.com

