Re: question about when do resource matching in YARN

2013-09-24 Thread Sandy Ryza
How would the ZK approach make things faster?  Are you saying the AMs would
do the watching?  Currently container assignments aren't actually sent to
the NodeManagers on heartbeats.  The first time an NM hears about a
container is when an AM launches it.
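
To make that flow concrete, here is a rough sketch, not actual YARN code. The class and method names (ResourceManager, NodeManager, record_allocation, start_container) are made up for illustration. It assumes the AM picks up new allocations on its own periodic allocate call and then contacts the NM itself.

```python
class ResourceManager:
    """Hypothetical RM: stores allocations until the owning AM asks for them."""

    def __init__(self):
        self.new_allocations = {}  # app_id -> list of (container_id, node_id)

    def record_allocation(self, app_id, container_id, node_id):
        # The scheduler decided on a container; nothing is pushed to the node.
        self.new_allocations.setdefault(app_id, []).append((container_id, node_id))

    def allocate(self, app_id):
        # The AM's periodic call: return and clear what was allocated so far.
        return self.new_allocations.pop(app_id, [])


class NodeManager:
    """Hypothetical NM: only learns about a container when asked to start it."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.known_containers = set()

    def start_container(self, container_id):
        self.known_containers.add(container_id)


rm = ResourceManager()
nm = NodeManager("node-a")
rm.record_allocation("app-1", "container-1", nm.node_id)
known_before_launch = set(nm.known_containers)  # snapshot before the AM acts

# The AM fetches its allocations, then launches each container on its node.
for container_id, node_id in rm.allocate("app-1"):
    nm.start_container(container_id)

print(known_before_launch, nm.known_containers)
```

In this sketch the NM's set of known containers stays empty until the AM calls start_container, which is the point above: a watch on the NM side would have nothing to react to.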


On Tue, Sep 24, 2013 at 4:12 AM, Harsh J  wrote:

> Yes, but the heartbeat coupling isn't necessary I think. One could
> even use ZK write/watch approach for faster assignment of regular
> work?
>
> On Tue, Sep 24, 2013 at 2:24 PM, Steve Loughran wrote:
> > On 21 September 2013 09:19, Sandy Ryza  wrote:
> >
> >> I don't believe there is any reason scheduling decisions need to be coupled
> >> with NodeManager heartbeats.  It doesn't sidestep any race conditions
> >> because a NodeManager could die immediately after heartbeating.
> >>
> >>
> > historically its been done for scale: you don't need the JT reaching out to
> > 4K TT's just to give them work to do, instead let them connect in anyway
> > and get work that way. And once they start reporting in completion then
> > they can get given more work. It's very biased towards "worker nodes talk
> > to the master" over "master approaches workers"
> >
> > --
> > CONFIDENTIALITY NOTICE
> > NOTICE: This message is intended for the use of the individual or entity to
> > which it is addressed and may contain information that is confidential,
> > privileged and exempt from disclosure under applicable law. If the reader
> > of this message is not the intended recipient, you are hereby notified that
> > any printing, copying, dissemination, distribution, disclosure or
> > forwarding of this communication is strictly prohibited. If you have
> > received this communication in error, please contact the sender immediately
> > and delete it from your system. Thank You.
>
>
>
> --
> Harsh J
>


Re: question about when do resource matching in YARN

2013-09-24 Thread Harsh J
Yes, but the heartbeat coupling isn't necessary, I think. One could
even use a ZK write/watch approach for faster assignment of regular
work?

On Tue, Sep 24, 2013 at 2:24 PM, Steve Loughran  wrote:
> On 21 September 2013 09:19, Sandy Ryza  wrote:
>
>> I don't believe there is any reason scheduling decisions need to be coupled
>> with NodeManager heartbeats.  It doesn't sidestep any race conditions
>> because a NodeManager could die immediately after heartbeating.
>>
>>
> historically its been done for scale: you don't need the JT reaching out to
> 4K TT's just to give them work to do, instead let them connect in anyway
> and get work that way. And once they start reporting in completion then
> they can get given more work. It's very biased towards "worker nodes talk
> to the master" over "master approaches workers"
>



-- 
Harsh J


Re: question about when do resource matching in YARN

2013-09-24 Thread Steve Loughran
On 21 September 2013 09:19, Sandy Ryza  wrote:

> I don't believe there is any reason scheduling decisions need to be coupled
> with NodeManager heartbeats.  It doesn't sidestep any race conditions
> because a NodeManager could die immediately after heartbeating.
>
>
Historically it's been done for scale: you don't need the JT (JobTracker)
reaching out to 4K TTs (TaskTrackers) just to give them work to do; instead,
let them connect in anyway and get work that way. Once they start reporting
completions, they can be given more work. It's very biased towards "worker
nodes talk to the master" over "master approaches workers".
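
A toy illustration of that "workers call in" pattern, purely as a sketch: the Master class and its heartbeat signature are invented here and are not JobTracker code. The master never opens a connection; it records completions and hands out new work in the heartbeat reply.

```python
class Master:
    """Hypothetical master that only ever answers calls made by workers."""

    def __init__(self, tasks):
        self.todo = list(tasks)  # tasks not yet handed out
        self.done = []           # tasks reported as completed

    def heartbeat(self, worker_id, completed, free_slots):
        # Record what the worker finished, then piggyback new work on the reply.
        self.done.extend(completed)
        handed_out = self.todo[:free_slots]
        self.todo = self.todo[free_slots:]
        return handed_out


master = Master(["t1", "t2", "t3"])

# The worker calls in twice; the second call reports the first batch as done.
first = master.heartbeat("worker-1", completed=[], free_slots=2)
second = master.heartbeat("worker-1", completed=first, free_slots=2)

print(first, second, master.done)
```

With thousands of workers the master only has to answer incoming calls, and each worker's own reporting interval sets how quickly it gets new work.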



Re: question about when do resource matching in YARN

2013-09-21 Thread Sandy Ryza
I don't believe there is any reason scheduling decisions need to be coupled
with NodeManager heartbeats.  The coupling doesn't sidestep any race conditions,
because a NodeManager could die immediately after heartbeating.


On Sat, Sep 21, 2013 at 2:11 AM, Omkar Joshi  wrote:

> Hi Wei,
>
> Yes there is a clear lag between AM requesting resource and satisfying NM
> heartbeats (thereby we process the event) are received. Developers in
> project Tez ( http://incubator.apache.org/projects/tez.html ) have done
> some similar stuff. You can check it there. I hope it helps.
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* 
>
>
> On Fri, Sep 20, 2013 at 8:56 AM, Xuan Gong  wrote:
>
> > Hey, Wei:
> >   The nodeHeartBeat is used to let RM knows this NM is still alive. We
> > only assign containers from alive NM. Another thing is when scheduler
> > receives the nodeHeartBeat, the scheduler will get the container status
> > (such as completed, new launched) from NM, and it can use it to update the
> > resource.
> >
> >    You can take a look those source codes, it can help you understand
> > better.
> > 1. NodeStatusUpdaterImpl::startStatusUpdater(). it used to send out the
> > nodeheartbeat
> > 2. ResourceTrackerService::nodeHeartbeat(). This one is used to get
> > heartbeat from NM, and send to RMNodeImpl
> > 3. RMNodeImpl::StatusUpdateWhenHealthyTransition().  Get the heartBeat, and
> > do locally update.
> > 4. CapacityScheduler::nodeUpdate(). Processing the heartbeat info, and
> > potentially assign containers.
> >
> > Thanks
> >
> > Xuan
> >
> >
> > On Fri, Sep 20, 2013 at 7:17 AM, wei yan @ Gmail wrote:
> >
> > > Hi, all,
> > >
> > > I have a simple question. Currently in YARN, the resource matching is
> > > triggered by the node manager heartbeat. That is, assignContainers() is
> > > only invoked when a new heartbeat comes in. Why we don't use resource
> > > request triggered mechanism? That is, when AM submits allocateRequest, we
> > > do the resource matching and assign containers.
> > >
> > > Does anybody have any idea about this?
> > >
> > > thanks,
> > > Wei
> >


Re: question about when do resource matching in YARN

2013-09-20 Thread Omkar Joshi
Hi Wei,

Yes, there is a clear lag between the AM requesting resources and the arrival
of the NM heartbeats that can satisfy the request (the request is only
processed when such a heartbeat is received). Developers in the Tez project
( http://incubator.apache.org/projects/tez.html ) have done some similar
work; you can check it there. I hope it helps.

Thanks,
Omkar Joshi
*Hortonworks Inc.* 


On Fri, Sep 20, 2013 at 8:56 AM, Xuan Gong  wrote:

> Hey, Wei:
>   The nodeHeartBeat is used to let RM knows this NM is still alive. We
> only assign containers from alive NM. Another thing is when scheduler
> receives the nodeHeartBeat, the scheduler will get the container status
> (such as completed, new launched) from NM, and it can use it to update the
> resource.
>
>    You can take a look those source codes, it can help you understand
> better.
> 1. NodeStatusUpdaterImpl::startStatusUpdater(). it used to send out the
> nodeheartbeat
> 2. ResourceTrackerService::nodeHeartbeat(). This one is used to get
> heartbeat from NM, and send to RMNodeImpl
> 3. RMNodeImpl::StatusUpdateWhenHealthyTransition().  Get the heartBeat, and
> do locally update.
> 4. CapacityScheduler::nodeUpdate(). Processing the heartbeat info, and
> potentially assign containers.
>
> Thanks
>
> Xuan
>
>
> On Fri, Sep 20, 2013 at 7:17 AM, wei yan @ Gmail wrote:
>
> > Hi, all,
> >
> > I have a simple question. Currently in YARN, the resource matching is
> > triggered by the node manager heartbeat. That is, assignContainers() is
> > only invoked when a new heartbeat comes in. Why we don't use resource
> > request triggered mechanism? That is, when AM submits allocateRequest, we
> > do the resource matching and assign containers.
> >
> > Does anybody have any idea about this?
> >
> > thanks,
> > Wei
>


Re: question about when do resource matching in YARN

2013-09-20 Thread Xuan Gong
Hey, Wei:
  The nodeHeartBeat is used to let the RM know that this NM is still alive.
We only assign containers on live NMs. In addition, when the scheduler
receives the nodeHeartBeat, it gets the container statuses (such as
completed or newly launched) from the NM and can use them to update its
resource bookkeeping for that node.

   You can take a look at the following source code; it should help you
understand this better.
1. NodeStatusUpdaterImpl::startStatusUpdater(). Sends out the node
heartbeat.
2. ResourceTrackerService::nodeHeartbeat(). Receives the heartbeat from
the NM and passes it on to RMNodeImpl.
3. RMNodeImpl::StatusUpdateWhenHealthyTransition(). Handles the heartbeat
and updates the local node state.
4. CapacityScheduler::nodeUpdate(). Processes the heartbeat info and
potentially assigns containers.
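
As a rough illustration of step 4, here is a toy sketch, not the CapacityScheduler. ToyScheduler and its fields are hypothetical names, and memory is the only resource modelled. An AM request is just queued, and matching only runs when a node heartbeat arrives.

```python
from collections import deque


class ToyScheduler:
    """Hypothetical heartbeat-driven scheduler (not YARN code)."""

    def __init__(self):
        self.pending = deque()  # outstanding requests: (app_id, memory_mb)
        self.free_mb = {}       # node_id -> free memory last reported
        self.assigned = []      # (app_id, node_id, memory_mb)

    def allocate(self, app_id, memory_mb):
        # An AM request is only recorded here; no matching happens yet.
        self.pending.append((app_id, memory_mb))

    def node_update(self, node_id, free_mb):
        # A heartbeat refreshes the node's free capacity, then triggers
        # matching of pending requests against that one node.
        self.free_mb[node_id] = free_mb
        still_waiting = deque()
        while self.pending:
            app_id, memory_mb = self.pending.popleft()
            if memory_mb <= self.free_mb[node_id]:
                self.free_mb[node_id] -= memory_mb
                self.assigned.append((app_id, node_id, memory_mb))
            else:
                still_waiting.append((app_id, memory_mb))
        self.pending = still_waiting


scheduler = ToyScheduler()
scheduler.allocate("app-1", 1024)
assigned_before_heartbeat = list(scheduler.assigned)

scheduler.node_update("node-a", 2048)
print(assigned_before_heartbeat, scheduler.assigned)
```

The request sits in the pending queue until node-a reports in, which is the lag the original question is about.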

Thanks

Xuan


On Fri, Sep 20, 2013 at 7:17 AM, wei yan @ Gmail  wrote:

> Hi, all,
>
> I have a simple question. Currently in YARN, the resource matching is
> triggered by the node manager heartbeat. That is, assignContainers() is
> only invoked when a new heartbeat comes in. Why we don't use resource
> request triggered mechanism? That is, when AM submits allocateRequest, we
> do the resource matching and assign containers.
>
> Does anybody have any idea about this?
>
> thanks,
> Wei
