[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2020-09-02 Thread Vinod Kumar Vavilapalli (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189837#comment-17189837
 ] 

Vinod Kumar Vavilapalli commented on YARN-4692:
---

[~sujeet-a.hi...@db.com], all the work targeted here was already done via 
YARN-4793, YARN-5079 and related JIRAs. This JIRA and related tickets need some 
cleanup, that's all.

The docs on how to use this are here: 
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/yarn-service/Overview.html
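
For anyone moving off Slider, a rough idea of what a service definition looks like may help. The sketch below builds a minimal JSON spec in Python. The field names (name, version, components, number_of_containers, launch_command, resource) are taken from the sleeper example in the docs linked above as I recall it, so treat them as assumptions and check them against the docs for your Hadoop version. The helper make_service_spec is hypothetical and not part of any YARN client library.

```python
import json


def make_service_spec(name, version, component, containers, command, cpus, memory_mb):
    """Build a minimal service definition as a plain dict (hypothetical helper)."""
    return {
        "name": name,
        "version": version,
        "components": [
            {
                "name": component,
                "number_of_containers": containers,
                "launch_command": command,
                # Memory is written as a string of megabytes here (assumption).
                "resource": {"cpus": cpus, "memory": str(memory_mb)},
            }
        ],
    }


spec = make_service_spec("sleeper-service", "1.0.0", "sleeper", 2, "sleep 900000", 1, 256)
print(json.dumps(spec, indent=2))
```

The printed JSON would be saved to a file and handed to the service launcher described in the docs; the exact command is not reproduced here.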

> [Umbrella] Simplified and first-class support for services in YARN
> --
>
> Key: YARN-4692
> URL: https://issues.apache.org/jira/browse/YARN-4692
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
> Attachments: 
> YARN-First-Class-And-Simplified-Support-For-Services-v0.pdf
>
>
> YARN-896 focused on getting the ball rolling on the support for services 
> (long running applications) on YARN.
> I’d like to propose the next stage of this effort: _Simplified and first-class 
> support for services in YARN_.
> The chief rationale for filing a separate new JIRA is threefold:
>  - Do a fresh survey of all the things that are already implemented in the 
> project
>  - Weave a comprehensive story around what we further need and attempt to 
> rally the community around a concrete end-goal, and
>  - Additionally focus on functionality that YARN-896 and friends left for 
> higher layers to take care of and see how much of that is better integrated 
> into the YARN platform itself.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2020-09-02 Thread Sujeet-A (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189229#comment-17189229
 ] 

Sujeet-A commented on YARN-4692:


Hi Team,

Any update on this?

We want an alternative to Slider, as it has been retired.

 

Regards,

Sujeet




[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-04-28 Thread Manoj Samel (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262807#comment-15262807
 ] 

Manoj Samel commented on YARN-4692:
---

Hi [~kasha] and team,

I would like to add another type of service to your list, one that we are 
deploying using Apache Slider. I am using Slider terminology to explain the 
current use case.

The service is a multi-tenant, long-running service; imagine, e.g., a query 
service for a large number of users. The application is deployed as one 
"Slider cluster". For each tenant, we start a component using Slider (i.e., a 
separate YARN container). The key service requirements are:

1. Start each container (Slider component) as the tenant user, not as a common user.

2. A large number of containers (hundreds to thousands, one per end user) must 
be managed by one AM (as Slider stands).

For more details of the use case, see 
https://issues.apache.org/jira/browse/SLIDER-1114



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-04-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244843#comment-15244843
 ] 

Karthik Kambatla commented on YARN-4692:


(Discussed this offline with [~vinodkv]. Posting here so others are in the loop 
and can provide their thoughts.)

Thanks for the quite comprehensive survey and putting the doc together, Vinod.

I would like for us to define what exactly we mean by services. I see three 
kinds:
# Long-running applications like streaming applications that run as a single 
user
# Services like HBase that run as a single user, but serve other users. These 
services might want to use YARN for deployment and fault tolerance, but not 
necessarily for sharing resources among their users or accounting for those 
users' resource usage.
# Services like Impala/LLAP/Drill that serve other users and would like to use 
Yarn for sharing/accounting resources among them. 

The document covers several requirements of the first two cases. I would like 
for us to understand the requirements of the third case as well - e.g. 
requesting resources on behalf of other users, delegating resources, latency of 
allocation etc. 



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159726#comment-15159726
 ] 

Vinod Kumar Vavilapalli commented on YARN-4692:
---

Thanks for starting the discussions, everyone.

While we continue the discussions on various things, I'll start creating / 
moving sub-tasks under this JIRA so that we can divide-and-conquer each problem 
on its own.



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-19 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154548#comment-15154548
 ] 

Konstantinos Karanasos commented on YARN-4692:
--

Thanks [~vinodkv] for the document -- I think it is a great starting point. 
As we discussed, we will provide you with some additional use cases that will 
help iron out some more details.

Some thoughts regarding the scheduling of containers...
I do like the idea of [~leftnoteasy] to combine application priorities and 
expected task duration (YARN-1039) in order to handle different types of jobs.
That is, expected task duration is used to determine whether a task is going to 
be long-running (such as a service), while application priorities are used to 
decide how tasks get executed in the NM.
I think this also nicely captures the preemptability issue that was raised by 
[~asuresh] without the need for extra knowledge/fields.

That said, I still have a few concerns:
* Who determines priorities? On a cluster shared by multiple users ("competing" 
with each other for resources), most users will want to use the highest 
priority (unless there are pricing or other incentives).
* In order to do a proper placement of services, I think there is still some 
information that is missing regarding the resource needs of the service. For 
instance, a service might be latency-critical. If we could capture such 
information in the resource requests, we could automatically reason about 
affinity and anti-affinity constraints (e.g., avoid placing two services that 
are both latency critical on the same node).
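
To make the anti-affinity idea above concrete, here is a minimal sketch of a greedy placement that never puts two latency-critical services on the same node. Everything in it is hypothetical: the latency_critical flag stands in for whatever extra information a resource request would carry, and place is an illustration, not a scheduler API.

```python
def place(services, nodes):
    """Greedy placement with at most one latency-critical service per node.

    services: list of (name, latency_critical) tuples (hypothetical input)
    nodes: list of node names
    Returns a dict mapping service name to node name.
    """
    has_critical = {n: False for n in nodes}
    load = {n: 0 for n in nodes}
    placement = {}
    for name, critical in services:
        # A critical service may not join a node that already hosts one.
        candidates = [n for n in nodes if not (critical and has_critical[n])]
        if not candidates:
            raise RuntimeError(f"no node satisfies anti-affinity for {name}")
        node = min(candidates, key=lambda n: load[n])  # least loaded, ties by node order
        placement[name] = node
        load[node] += 1
        has_critical[node] = has_critical[node] or critical
    return placement


result = place([("hbase", True), ("etl", False), ("llap", True)], ["n1", "n2"])
print(result)
```

With two nodes, the two latency-critical services end up apart, while the non-critical one is free to share a node with either.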

Thoughts?



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153434#comment-15153434
 ] 

Wangda Tan commented on YARN-4692:
--

[~asuresh],

Thanks for the reply; I understand your concern now.

There are two different container priorities:
- Cross-application container priority: this is inherited from the application 
priority, and ACLs and/or quotas ensure that not everybody can claim their 
containers are the most important.
- Container priority within an app: the application can decide which container 
should be allocated first and which container should be preempted first.

When doing preemption:
- For cross-application priority, the framework should strictly enforce it.
- For priority within an application, when the scheduler puts a container on 
the to-be-preempted list, the application is notified and has a chance to 
return a different container than the one selected.
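
The two-level scheme above can be sketched as follows. This is only an illustration under stated assumptions: lower numbers mean more important (as in the "0 is production" example earlier in the thread), and select_victim and the app_choice callback are hypothetical names, not existing scheduler interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Container:
    cid: str
    app: str
    app_priority: int        # lower number = more important (assumption)
    container_priority: int  # lower number = more important within the app (assumption)


def select_victim(
    containers: List[Container],
    app_choice: Optional[Callable[[Container, List[Container]], Optional[Container]]] = None,
) -> Container:
    # Cross-application priority is strict: only the least important app is eligible.
    worst = max(c.app_priority for c in containers)
    eligible = [c for c in containers if c.app_priority == worst]
    victim = eligible[0]  # the scheduler's own pick
    if app_choice is not None:
        siblings = [c for c in eligible if c.app == victim.app]
        offered = app_choice(victim, siblings)
        # Accept a substitute only if it belongs to the same application.
        if offered is not None and offered in siblings:
            victim = offered
    return victim


def least_important(victim, siblings):
    """Application-side callback: give up the least important container instead."""
    return max(siblings, key=lambda c: c.container_priority)


running = [
    Container("c1", "svc", 0, 1),
    Container("c2", "batch", 5, 1),
    Container("c3", "batch", 5, 3),
]
print(select_victim(running).cid)                   # scheduler's pick
print(select_victim(running, least_important).cid)  # application's substitute
```

The application can change which of its own containers is lost, but it cannot push the preemption onto a more important application.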

Thoughts?



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-18 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152735#comment-15152735
 ] 

Arun Suresh commented on YARN-4692:
---

[~leftnoteasy], I am not entirely convinced that priorities will handle every 
case.

Currently, a priority dictates the order in which the scheduler serves 
allocations for an AM's resource requests. Assume a service is composed of 
containers of 3 types (Slider, I think, currently calls them *roles*, and each 
is mapped to a YARN priority). 

Take the example of HBase, with roles = ZooKeeper, HBaseMaster and 
HBaseRegionServer in decreasing priority order. This ensures that ZooKeeper 
containers are started first and HBaseRS containers are started last, which 
establishes a dependency order between roles. Now I would argue that the 
*Preemptability* cost of an HBaseRS container is higher than that of a ZK 
container: ZK can still function without any problems if 1 or 2 containers go 
down (depending on the size of the quorum), but if an RS goes down, it might 
take a while for the regions hosted by that RS to become available again.
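
A small sketch of why one number cannot carry both meanings. The role names come from the example above; the numeric values are made up for illustration, and Role is a hypothetical type, not something Slider or YARN defines.

```python
from typing import NamedTuple


class Role(NamedTuple):
    name: str
    start_priority: int   # lower = started earlier (assumption)
    preemption_cost: int  # higher = more painful to lose (made-up values)


ROLES = [
    Role("ZOOKEEPER", 1, 1),
    Role("HBASE_MASTER", 2, 3),
    Role("HBASE_REGIONSERVER", 3, 2),
]

# Start-up order follows the dependency order encoded in the priority.
start_order = [r.name for r in sorted(ROLES, key=lambda r: r.start_priority)]

# Preempting "lowest priority first" simply reverses the start-up order.
victims_by_priority = list(reversed(start_order))

# Preempting "cheapest to lose first" gives a different answer.
victims_by_cost = [r.name for r in sorted(ROLES, key=lambda r: r.preemption_cost)]

print(start_order)
print(victims_by_priority)
print(victims_by_cost)
```

Under these assumed costs, priority-driven preemption would take a region server first, while cost-driven preemption would take a ZooKeeper container first.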



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151477#comment-15151477
 ] 

Wangda Tan commented on YARN-4692:
--

Thanks [~vinodkv] and the other folks working on this. The document is already 
pretty comprehensive; some thoughts/suggestions:

1) For running containers, instead of classifying them into service/batch, I 
would prefer to tag them by application priority. For example, 0 is production 
service tasks, 5 is batch jobs, etc. The reasons are:
- A service container is not always more important than other containers.
- An important service can preempt containers from less important services.

2) Whether a container is service or batch depends on the duration of the task; 
we have had lots of discussions on YARN-1039 already.

3) For 3.2.2 container auto-restart: beyond restarting a container when it 
dies, we could let the framework check the health of running tasks. For 
example, support an embedded REST API to get the health status of containers. 
With this, the framework can restart malfunctioning containers.

4) For 3.2.7 Scheduling / Queue model:
Beyond the queue model, we should consider long-running containers when 
reserving a large container on a node.

5) Debuggability for service containers is also very important:
- Tools similar to [cAdvisor|https://github.com/google/cadvisor] could be very 
helpful for figuring out issues with service tasks.
- We also need a tool to show aggregated scheduling-related information for 
apps/queues/cluster.
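
The health-check idea in point 3 could look roughly like the sketch below. It is deliberately abstract: probe stands in for a call to a container's embedded health endpoint and restart for whatever the framework would do to relaunch it. Both are hypothetical callables, since no such API is defined in this thread.

```python
def restart_unhealthy(container_ids, probe, restart):
    """Restart every container whose health probe fails.

    probe(cid) returns True when the container reports healthy (hypothetical).
    restart(cid) asks the framework to relaunch the container (hypothetical).
    Returns the list of container ids that were restarted.
    """
    restarted = []
    for cid in container_ids:
        try:
            healthy = probe(cid)
        except Exception:
            # A probe that cannot be answered counts as unhealthy.
            healthy = False
        if not healthy:
            restart(cid)
            restarted.append(cid)
    return restarted


# Stand-in for real health endpoints: c3 has no entry, so its probe raises.
status = {"c1": True, "c2": False}


def fake_probe(cid):
    return status[cid]


restart_log = []
restarted = restart_unhealthy(["c1", "c2", "c3"], fake_probe, restart_log.append)
print(restarted)
```

Treating an unreachable endpoint as unhealthy is a design choice made here for simplicity; a real implementation would likely want retries and a grace period before restarting.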

*For comments from [~asuresh]:*
bq. we can give applications the ability to specify Preemptability of 
containers in a particular role...
Instead of adding a new field, I think we can reuse container priority and 
application priority to describe preemptability.

bq. Allow LR Applications to specify peak, min and variance/mean (also many 
transient and steady-state) of a Resource request to allow schedulers to make 
better allocation decisions.
I think this is hard for the end user to know. Our framework should be able to 
figure out such metrics for running containers. For newly requested containers, 
we had better assume they will use 100% of the requested resources.

bq. In YARN-4597 Chris Douglas proposed ...
In my mind, YARN-4597 is targeted at low-latency batch tasks; if service tasks 
run for one hour or more, it's not a big deal to take several minutes to set 
them up.

And I agree that the reservation system (YARN-1051) is the ultimate solution 
for the queue model and container allocation for services.



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-16 Thread Marco Rabozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148405#comment-15148405
 ] 

Marco Rabozzi commented on YARN-4692:
-

Thanks [~vinodkv] for starting the discussion and [~asuresh] for the detailed 
comments on long running container scheduling. Overall, the document gives a 
very detailed overview of the current state for long running services support 
in YARN.

With respect to long-running service upgrades, the proposal for allocation 
reuse (3.2.3) is very interesting, since it reduces the time needed for 
container upgrades. However, if the AM container is the one that needs to be 
upgraded, the RM should be aware of the process; otherwise, in case of 
subsequent AM or NM failures, the AM might be restarted with the old bits. I 
think that a possible solution would be to revise the design proposed in 
YARN-4470 to take allocation reuse into account. We could decouple the request 
to update the submission context within the RM from the actual updated 
*startContainer* request for the same AM allocation.




[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146246#comment-15146246
 ] 

Arun Suresh commented on YARN-4692:
---

Missed adding this to my previous comment:

# Container Auto-Scaling: 
Some way to allow the user to specify rules like "Add/Remove a container for a 
role if *mem*/*iops*/*cpu per container* exceeds/goes below a threshold" when 
launching an LR app. We can probably design a rule specification that can be 
consumed by the service, or else we can provide interface hooks that can be 
extended/implemented by the end user and packaged with the app.
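
As a rough illustration of what such a rule specification might look like, here is a minimal sketch. ScalingRule and desired_count are hypothetical names, the thresholds are fractions of the per-container allocation, and the min/max container bounds are an added assumption rather than part of the proposal above.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    metric: str          # e.g. "mem", "iops" or "cpu", measured per container
    upper: float         # add a container when usage exceeds this
    lower: float         # remove a container when usage falls below this
    min_containers: int  # assumption: never scale below this
    max_containers: int  # assumption: never scale above this


def desired_count(rule, current, per_container_usage):
    """Return the container count the rule asks for, one step at a time."""
    value = per_container_usage[rule.metric]
    if value > rule.upper:
        return min(current + 1, rule.max_containers)
    if value < rule.lower:
        return max(current - 1, rule.min_containers)
    return current


rule = ScalingRule(metric="mem", upper=0.8, lower=0.3, min_containers=1, max_containers=4)
print(desired_count(rule, 2, {"mem": 0.9}))  # above the upper threshold
print(desired_count(rule, 2, {"mem": 0.1}))  # below the lower threshold
print(desired_count(rule, 2, {"mem": 0.5}))  # within bounds
```

Stepping by one container per evaluation keeps the sketch simple; the interface-hook alternative would let an application replace desired_count with its own logic.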



[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146157#comment-15146157
 ] 

Arun Suresh commented on YARN-4692:
---

Thank you for starting this, [~vinodkv]. The document itself looks pretty 
thorough and well thought out.

A couple of thoughts:

# Preemption and Reservation:
## The document (3.2.1) says that Long Running (LR) containers should be 
started on assured capacity (not on resources over fair share). I posit that 
LR containers should *primarily* be started on over-committed resources 
(probably as {{OPPORTUNISTIC}} containers, see YARN-2882 and YARN-1011). The 
point of LR services is that the service as a whole should be available; 
individual container deaths/restarts should not affect the service.
## On a related note, we can give applications the ability to specify the 
*Preemptability* of containers in a particular role. A low value could mean 
that preemption is very costly, while a high value implies that the service is 
still available if some containers die. E.g., when deploying HBase on YARN, the 
HBase Master can have a *low* preemptability value, while HBase Region Servers 
can probably have a *higher* preemptability. 
## Allow LR applications to specify *peak*, *min* and *variance*/*mean* (also 
many transient and steady-state) of a Resource request to allow schedulers to 
make better allocation decisions. Also allow users to specify the *min*/*max* 
number of containers required for a particular service role. This can be used 
as a hint for preemption if other short-running tasks are starved.
## Currently, schedulers create a reservation for a container on a node that 
has free resources, but not enough to fit the container. The document suggests 
that nodes on which LR containers are already running should not accept 
reservations. I feel we should leverage 
peak/min/mean/variance/transient/steady-state resource demands to loosen this. 
E.g., even if a node cannot satisfy peak demand, if steady-state demand is 
satisfiable, the peak demand can probably be met by a combination of 
leveraging YARN-2877 / YARN-1011 and YARN-4597 (I'll describe this below).
# Handling Low-latency resource Spikes in LR Containers:
## In YARN-4597 [~chris.douglas] proposed 1) new {{SCHEDULING}} container state 
2) a local *ContainerScheduler* that handles the scheduling (essentially in 
charge of moving container from {{SCHEDULING}} to {{RUNNING}} state) 3) 
Allowing the *ContainerScheduler* and *Localizer* be directly accessible to 
Containers running on the node.
## An LR container should be able to ask for more resources if required and 
shed excess resources when idling. YARN-1197 tried to add support for changing 
resources on an allocated container, but the design doc talks about the request 
making a round trip from AM to RM and back and then to the containers. 
Low-latency elasticity can probably be achieved using a combination of 
YARN-2877 and the NM-local ContainerScheduler.
# Queue Modeling:
## When LR tasks are mixed with short-running tasks, since LR tasks may never 
end, resources might always be tied up. I foresee some alleviation of this by 
ensuring that some % of queue capacity is always available for non-LR tasks. 
Also, more intelligent resource accounting using the reservation system 
(YARN-1051) would probably help?
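
To illustrate the loosened reservation check suggested in the first item, here is a minimal sketch. The function can_place and its three outcomes are hypothetical, and the idea that the gap between steady-state and peak demand would be covered by over-committed resources is the assumption discussed above, not an existing scheduler behavior.

```python
def can_place(node_free_mb, steady_mb, peak_mb, allow_overcommit=True):
    """Classify whether a long-running container fits on a node.

    Returns "guaranteed" when peak demand fits in free memory,
    "steady-state-only" when only steady-state demand fits and the peak
    would have to come from over-committed resources, and "reject" otherwise.
    """
    if peak_mb <= node_free_mb:
        return "guaranteed"
    if allow_overcommit and steady_mb <= node_free_mb:
        return "steady-state-only"
    return "reject"


print(can_place(node_free_mb=4096, steady_mb=2048, peak_mb=3072))
print(can_place(node_free_mb=2560, steady_mb=2048, peak_mb=3072))
print(can_place(node_free_mb=1024, steady_mb=2048, peak_mb=3072))
```

A strict policy corresponds to allow_overcommit=False, in which case the second call would be rejected as well.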




