Hi,

  Should we consider node-based requests if they work with the Capacity
Scheduler, or avoid the 2b approach altogether? I checked that node requests do
not work with the Fair Scheduler on a CDH cluster: YARN does not return any
container if a hostname is given in the container request. I am trying to
set up a small virtual Hortonworks cluster to check this behavior there
as well.
YARN-2027 <https://issues.apache.org/jira/browse/YARN-2027> mentions that
node-specific container requests are not honored in the Capacity Scheduler
either, but I am not sure whether that is a distro-dependent issue. Please
share insights.

@Vlad, adding support for regular expressions sounds good. We could
translate the regex to a list of operator names internally.
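
A rough sketch of that translation, just to make the idea concrete. The
class and method names here are hypothetical (nothing like this exists in
Apex today), and I am assuming the regex must match the whole operator name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not existing Apex code.
class AntiAffinityRegexSketch {

  // Returns the operator names that fully match the regex, preserving the
  // order in which the names were passed in. Assumption: a full match is
  // required, so "O[23]" does not select an operator named "O2, O3".
  static List<String> resolve(String regex, List<String> operatorNames) {
    Pattern pattern = Pattern.compile(regex);
    List<String> matches = new ArrayList<>();
    for (String name : operatorNames) {
      if (pattern.matcher(name).matches()) {
        matches.add(name);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("O1", "O2", "O3", "O2, O3");
    System.out.println(resolve("O[23]", names)); // prints [O2, O3]
  }
}
```

The resolved list could then be used wherever the explicit list of operator
names is used, so the rest of the logic would not need to know about the regex.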

@Yogi, I went with a list of strings for the attribute because "O2, O3" could
be a valid single operator name too :)
I am not sure how to implement anti-affinity across applications, though it
is something to consider for a later iteration.

Thanks,
Isha

On Wed, Jan 20, 2016 at 8:59 PM, Thomas Weise <[email protected]>
wrote:

> https://issues.apache.org/jira/browse/SLIDER-82
>
>
> On Wed, Jan 20, 2016 at 8:56 PM, Thomas Weise <[email protected]>
> wrote:
>
> > The point was that containers are taken away from other apps that may have
> > to discard work etc. It's not good style to claim resources and not use
> > them eventually :-)
> >
> > For this feature it is necessary to look at the scheduler
> > capabilities/semantics and limitations. For example, don't bet exclusively
> > on node requests if the goal is for it to work with FairScheduler.
> >
> > Also look at Slider, which just recently added support for anti-affinity
> > (using node requests). When you run it on the CDH cluster, it probably
> > won't work...
> >
> >
> > On Wed, Jan 20, 2016 at 3:19 PM, Pramod Immaneni <[email protected]>
> > wrote:
> >
> >> Once released, won't the containers be available again in the pool? This
> >> would only be optional and not mandatory.
> >>
> >> Thanks
> >>
> >> On Tue, Jan 19, 2016 at 2:02 PM, Thomas Weise <[email protected]>
> >> wrote:
> >>
> >> > > How about also supporting a minor variation of it as an option
> >> > > where it greedily gets the total number of containers and discards
> >> > > ones it can't use and repeats the process for the remaining till
> >> > > everything has been allocated.
> >> >
> >> >
> >> > This is problematic as with resource preemption these containers will be
> >> > potentially taken away from other applications and then thrown away.
> >> >
> >> >
> >> >
> >> >
> >> > > Also does it make sense to support anti-cluster affinity?
> >> > >
> >> > > Thanks
> >> > >
> >> > > On Tue, Jan 19, 2016 at 1:21 PM, Isha Arkatkar <[email protected]>
> >> > > wrote:
> >> > >
> >> > > > Hi all,
> >> > > >
> >> > > >    We want to add support for anti-affinity in Apex to allow
> >> > > > applications to launch specific physical operators on different
> >> > > > nodes (APEXCORE-10
> >> > > > <https://issues.apache.org/jira/browse/APEXCORE-10>). We would like
> >> > > > to request your suggestions/ideas for the same!
> >> > > >
> >> > > >   The reasons for using anti-affinity in operators could be: to
> >> > > > ensure reliability, for performance reasons (for example, an
> >> > > > application may not want two I/O-intensive operators to land on the
> >> > > > same node), or for some application-specific constraints (for
> >> > > > example, two partitions cannot run on the same node since they use
> >> > > > the same port number). This is the general rationale for adding
> >> > > > anti-affinity support.
> >> > > >
> >> > > > Since YARN does not support anti-affinity yet (YARN-1042
> >> > > > <https://issues.apache.org/jira/browse/YARN-1042>), we need to
> >> > > > implement the logic in the AM. We wanted to get your views on the
> >> > > > following aspects of this implementation:
> >> > > >
> >> > > > *1. How to specify anti-affinity for physical operators/partitions
> >> > > > in application:*
> >> > > >     One way to do this is to have an attribute for setting
> >> > > > anti-affinity at the logical operator context. An operator can set
> >> > > > this attribute with a list of operator names which should not be
> >> > > > collocated.
> >> > > >      Consider a DAG with 3 operators:
> >> > > >      TestOperator o1 = dag.addOperator("O1", new TestOperator());
> >> > > >      TestOperator o2 = dag.addOperator("O2", new TestOperator());
> >> > > >      TestOperator o3 = dag.addOperator("O3", new TestOperator());
> >> > > >
> >> > > >  To set anti-affinity for the O1 operator:
> >> > > >     dag.setAttribute(o1, OperatorContext.ANTI_AFFINITY, new
> >> > > > ArrayList<String>(Arrays.asList("O2", "O3")));
> >> > > >      This would mean O1 should not be allocated on nodes containing
> >> > > > operators O2 and O3. This applies to all allocated partitions of O1,
> >> > > > O2, and O3.
> >> > > >
> >> > > >    Also, if an operator's own name is part of its anti-affinity
> >> > > > list, it means partitions of that operator should not be allocated
> >> > > > on the same node. Example:
> >> > > >     dag.setAttribute(o2, OperatorContext.ANTI_AFFINITY, new
> >> > > > ArrayList<String>(Arrays.asList("O2")));
> >> > > >     This indicates anti-affinity between all partitions of O2, i.e.
> >> > > > all partitions of O2 should be launched on different nodes.
> >> > > >
> >> > > >    Based on the anti-affinity attribute specified for the logical
> >> > > > operator, during physical plan creation, we can add this list to
> >> > > > each PTContainer. This in turn will be available to Stram for
> >> > > > sending container requests accordingly.
> >> > > >
> >> > > >    Please suggest if there is a better way to express this intent.
> >> > > >
> >> > > > *2. How to implement anti-affinity in AM*
> >> > > >    There are 2 ways we can implement this:
> >> > > >    *a. Blacklisting of nodes:* We can group the physical container
> >> > > > requests based on anti-affinity requirements and send allocation
> >> > > > requests for containers in groups. After the first group is done,
> >> > > > blacklist the nodes before sending the second group of container
> >> > > > requests. This will ensure that the containers with anti-affinity
> >> > > > requirements will be allocated on different nodes.
> >> > > >    *b. Node specific container request:* Explore and create a map
> >> > > > of nodes present in the cluster and send an allocation request for
> >> > > > a container on a specific node, honoring anti-affinity. There are a
> >> > > > couple of open YARN Jiras for node specific container requests:
> >> > > > YARN-1412 <https://issues.apache.org/jira/browse/YARN-1412>,
> >> > > > YARN-2027 <https://issues.apache.org/jira/browse/YARN-2027>. So we
> >> > > > need to check if this is a plausible approach.
> >> > > >
> >> > > > *3. Strict Vs Relaxed anti-affinity*
> >> > > >   Depending on cluster resource availability, it may not be
> >> > > > possible to honor all anti-affinity requirements specified.
> >> > > > *Strict Anti-affinity:* AM will keep trying to allocate containers
> >> > > > as per anti-affinity requirements indefinitely. This behavior will
> >> > > > be similar to how an application shows in ACCEPTED state until
> >> > > > resources are available to launch it in the cluster.
> >> > > > *Relaxed Anti-affinity:* AM will drop the anti-affinity constraint
> >> > > > after a certain timeout.
> >> > > >
> >> > > > We need a way to set this attribute through the application (either
> >> > > > in the operator context or in DAGContext for an application-wide
> >> > > > setting).
> >> > > >
> >> > > > *4. How do we unit test this feature*
> >> > > >   We could use Mockito for mocking YARN behaviors and test only the
> >> > > > AM implementation, since it may not be easy to simulate some
> >> > > > scenarios manually in a cluster. Please suggest if there are better
> >> > > > ways to test this.
> >> > > >
> >> > > > Please suggest improvements or any other ideas on all of the above.
> >> > > >
> >> > > > Thanks!
> >> > > > Isha
> >> > > >
> >> > > > P.S. Sorry for the long email. Please let me know if I should start
> >> > > > separate threads for any of the above points.
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
