[jira] [Updated] (YARN-5982) Simplifying opportunistic container parameters and metrics

2016-12-08 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5982:
-
Attachment: YARN-5982.002.patch

Thanks for the review, [~asuresh].

I addressed your comments. I put the increment resource back into the proto 
file, but I did not move it from the FairScheduler configuration to the general 
YarnConfiguration. Let's do that in a separate JIRA if needed.

I also fixed the checkstyle issues and the related test case that was failing.

> Simplifying opportunistic container parameters and metrics
> --
>
> Key: YARN-5982
> URL: https://issues.apache.org/jira/browse/YARN-5982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5982.001.patch, YARN-5982.002.patch
>
>
> This JIRA removes some of the parameters related to opportunistic containers 
> (e.g., min/max memory/cpu). Instead, we will be using the parameters already 
> used by guaranteed containers.
> The goal is to reduce the number of parameters that the user needs to set.
> We also fix a small issue in the container metrics (opportunistic memory was 
> reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3409) Add constraint node labels

2016-12-08 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733301#comment-15733301
 ] 

Konstantinos Karanasos commented on YARN-3409:
--

{{PlacementStrategy}} sounds good to me. {{PlacementConstraints}} or something 
similar might be even more descriptive.

+1 for using the same expression for defining both (anti-)affinity and label 
constraints.
I was wondering whether we could even use a single type of constraint to 
express all these different constraint types.
Let me think about it a bit more and I will let you know if I find a concrete 
solution.

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a specific set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied to a partition (e.g., only the market team 
> has priority to use the partition).
> - Capacity percentages can be applied to a partition (e.g., the market team 
> has a 40% minimum capacity and the dev team has a 60% minimum capacity of 
> the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (Windows, Linux, etc.)
> With this, an application can ask for resources that satisfy 
> (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5982) Simplifying opportunistic container parameters and metrics

2016-12-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5982:
-
Target Version/s: 2.9.0, 3.0.0-alpha2  (was: 3.0.0-alpha2)

> Simplifying opportunistic container parameters and metrics
> --
>
> Key: YARN-5982
> URL: https://issues.apache.org/jira/browse/YARN-5982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5982.001.patch
>
>
> This JIRA removes some of the parameters related to opportunistic containers 
> (e.g., min/max memory/cpu). Instead, we will be using the parameters already 
> used by guaranteed containers.
> The goal is to reduce the number of parameters that the user needs to set.
> We also fix a small issue in the container metrics (opportunistic memory was 
> reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5982) Simplifying opportunistic container parameters and metrics

2016-12-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5982:
-
Fix Version/s: 3.0.0-alpha2
   2.9.0

> Simplifying opportunistic container parameters and metrics
> --
>
> Key: YARN-5982
> URL: https://issues.apache.org/jira/browse/YARN-5982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5982.001.patch
>
>
> This JIRA removes some of the parameters related to opportunistic containers 
> (e.g., min/max memory/cpu). Instead, we will be using the parameters already 
> used by guaranteed containers.
> The goal is to reduce the number of parameters that the user needs to set.
> We also fix a small issue in the container metrics (opportunistic memory was 
> reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5982) Simplifying opportunistic container parameters and metrics

2016-12-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5982:
-
Attachment: YARN-5982.001.patch

Attaching patch.

> Simplifying opportunistic container parameters and metrics
> --
>
> Key: YARN-5982
> URL: https://issues.apache.org/jira/browse/YARN-5982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5982.001.patch
>
>
> This JIRA removes some of the parameters related to opportunistic containers 
> (e.g., min/max memory/cpu). Instead, we will be using the parameters already 
> used by guaranteed containers.
> The goal is to reduce the number of parameters that the user needs to set.
> We also fix a small issue in the container metrics (opportunistic memory was 
> reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5982) Simplifying opportunistic container parameters and metrics

2016-12-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5982:
-
Summary: Simplifying opportunistic container parameters and metrics  (was: 
Simplifying some opportunistic container parameters and metrics)

> Simplifying opportunistic container parameters and metrics
> --
>
> Key: YARN-5982
> URL: https://issues.apache.org/jira/browse/YARN-5982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> This JIRA removes some of the parameters related to opportunistic containers 
> (e.g., min/max memory/cpu). Instead, we will be using the parameters already 
> used by guaranteed containers.
> The goal is to reduce the number of parameters that the user needs to set.
> We also fix a small issue in the container metrics (opportunistic memory was 
> reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5982) Simplifying some opportunistic container parameters and metrics

2016-12-07 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5982:


 Summary: Simplifying some opportunistic container parameters and 
metrics
 Key: YARN-5982
 URL: https://issues.apache.org/jira/browse/YARN-5982
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


This JIRA removes some of the parameters related to opportunistic containers 
(e.g., min/max memory/cpu). Instead, we will be using the parameters already 
used by guaranteed containers.
The goal is to reduce the number of parameters that the user needs to set.

We also fix a small issue in the container metrics (opportunistic memory was 
reported in GB in the Web UI, although it was actually in MB).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-3409) Add constraint node labels

2016-12-07 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730729#comment-15730729
 ] 

Konstantinos Karanasos edited comment on YARN-3409 at 12/8/16 1:30 AM:
---

Hey guys, apologies for the late reply.
Here are my thoughts...

bq. Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, the nodeLabelExpression becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.
Agreed, with one clarification: do you mean having an extra 
affinity/anti-affinity constraint expression, or using the same constraint 
expression? We will probably need a separate one.

bq. Extend the existing NodeLabel object to support node constraints; we only 
need two additional fields. 1) isNodeConstraint 2) Value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).
I followed your discussion on this and on evaluating the constraints. I also 
had an offline discussion with [~chris.douglas].
I suggest an even simpler approach than the one Wangda proposed.
I believe we should have a first version with just boolean expressions, that 
is, simply requesting whether a label exists or not (possibly including 
negation of boolean expressions).
In other words, I suggest having neither a constraint type nor a value.
Let's have a first simple version of (boolean) labels that works. In a future 
iteration, we can add attributes (i.e., labels with values) instead of plain 
labels.

Having simple labels allows us to bypass the problem of constraint types. As 
Wangda says, constraint types do not really solve the problem of comparing 
values, given that people will write their values in different formats. You 
can also take a look at YARN-4476 for an efficient boolean expression matcher.
For example, using simple labels, a node can be annotated with the label 
"Java6". Then a task that requires at least Java 5 can request a node with 
"Java5 || Java6". I think that, for our current use cases, this will be 
sufficient.

Let me know what you think.


was (Author: kkaranasos):
Hey guys, apologies for the late reply.
Here are my thoughts...

bq. Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, the nodeLabelExpression becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.
Agreed, with one clarification: do you mean having an extra 
affinity/anti-affinity constraint expression, or using the same constraint 
expression? We will probably need a separate one.

bq. Extend the existing NodeLabel object to support node constraints; we only 
need two additional fields. 1) isNodeConstraint 2) Value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).
I followed your discussion on this and on evaluating the constraints. I also 
had an offline discussion with [~chris.douglas].
I suggest an even simpler approach than the one Wangda proposed.
I believe we should have a first version with just boolean expressions, that 
is, simply requesting whether a label exists or not (possibly including 
negation of boolean expressions).
In other words, I suggest having neither a constraint type nor a value.
Let's have a first simple version of (boolean) labels that works. In a future 
iteration, we can add attributes (i.e., labels with values) instead of plain 
labels.

Having simple labels allows us to bypass the problem of constraint types. As 
Wangda says, constraint types do not really solve the problem of comparing 
values, given that people will write their values in different formats. You 
can also take a look at YARN-4476 for an efficient boolean expression matcher.
For example, using simple labels, a node can be annotated with the label 
"Java6". Then a task that requires at least Java 5 can request a node with 
"Java5 || Java6". I think that, for our current use cases, this will be 
sufficient.

Let me know what you think.

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a specific set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 

[jira] [Comment Edited] (YARN-3409) Add constraint node labels

2016-12-07 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730729#comment-15730729
 ] 

Konstantinos Karanasos edited comment on YARN-3409 at 12/8/16 1:30 AM:
---

Hey guys, apologies for the late reply.
Here are my thoughts...

bq. Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, the nodeLabelExpression becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.
Agreed, with one clarification: do you mean having an extra 
affinity/anti-affinity constraint expression, or using the same constraint 
expression? We will probably need a separate one.

bq. Extend the existing NodeLabel object to support node constraints; we only 
need two additional fields. 1) isNodeConstraint 2) Value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).
I followed your discussion on this and on evaluating the constraints. I also 
had an offline discussion with [~chris.douglas].
I suggest an even simpler approach than the one Wangda proposed.
I believe we should have a first version with just boolean expressions, that 
is, simply requesting whether a label exists or not (possibly including 
negation of boolean expressions).
In other words, I suggest having neither a constraint type nor a value.
Let's have a first simple version of (boolean) labels that works. In a future 
iteration, we can add attributes (i.e., labels with values) instead of plain 
labels.

Having simple labels allows us to bypass the problem of constraint types. As 
Wangda says, constraint types do not really solve the problem of comparing 
values, given that people will write their values in different formats. You 
can also take a look at YARN-4476 for an efficient boolean expression matcher.
For example, using simple labels, a node can be annotated with the label 
"Java6". Then a task that requires at least Java 5 can request a node with 
"Java5 || Java6". I think that, for our current use cases, this will be 
sufficient.

Let me know what you think.


was (Author: kkaranasos):
Hey guys, apologies for the late reply.
Here are my thoughts...

bq. Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, the nodeLabelExpression becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.
Agreed, with one clarification: do you mean having an extra 
affinity/anti-affinity constraint expression, or using the same constraint 
expression? We will probably need a separate one.

bq. Extend the existing NodeLabel object to support node constraints; we only 
need two additional fields. 1) isNodeConstraint 2) Value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).
I followed your discussion on this and on evaluating the constraints. I also 
had an offline discussion with [~chris.douglas].
I suggest an even simpler approach than the one Wangda proposed.
I believe we should have a first version with just boolean expressions, that 
is, simply requesting whether a label exists or not (possibly including 
negation of boolean expressions).
In other words, I suggest having neither a constraint type nor a value.
Let's have a first simple version of (boolean) labels that works. In a future 
iteration, we can add attributes (i.e., labels with values) instead of plain 
labels.

Having simple labels allows us to bypass the problem of constraint types. As 
Wangda says, constraint types do not really solve the problem of comparing 
values, given that people will write their values in different formats. You 
can also take a look at YARN-4476 for an efficient boolean expression matcher.
For example, using simple labels, a node can be annotated with the label 
"Java6". Then a task that requires at least Java 5 can request a node with 
"Java5 || Java6". I think that, for our current use cases, this will be 
sufficient.

Let me know what you think.

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a specific set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 

[jira] [Commented] (YARN-3409) Add constraint node labels

2016-12-07 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730729#comment-15730729
 ] 

Konstantinos Karanasos commented on YARN-3409:
--

Hey guys, apologies for the late reply.
Here are my thoughts...

bq. Add a new field for the constraint expression, and also for 
affinity/anti-affinity (as suggested by Kostas). This should have minimal 
impact on existing features. But after this, the nodeLabelExpression becomes a 
little ambiguous; we may need to deprecate the existing nodeLabelExpression.
Agreed, with one clarification: do you mean having an extra 
affinity/anti-affinity constraint expression, or using the same constraint 
expression? We will probably need a separate one.

bq. Extend the existing NodeLabel object to support node constraints; we only 
need two additional fields. 1) isNodeConstraint 2) Value (for example, we can 
have a constraint named jdk-version, and the value could be 6/7/8).
I followed your discussion on this and on evaluating the constraints. I also 
had an offline discussion with [~chris.douglas].
I suggest an even simpler approach than the one Wangda proposed.
I believe we should have a first version with just boolean expressions, that 
is, simply requesting whether a label exists or not (possibly including 
negation of boolean expressions).
In other words, I suggest having neither a constraint type nor a value.
Let's have a first simple version of (boolean) labels that works. In a future 
iteration, we can add attributes (i.e., labels with values) instead of plain 
labels.

Having simple labels allows us to bypass the problem of constraint types. As 
Wangda says, constraint types do not really solve the problem of comparing 
values, given that people will write their values in different formats. You 
can also take a look at YARN-4476 for an efficient boolean expression matcher.
For example, using simple labels, a node can be annotated with the label 
"Java6". Then a task that requires at least Java 5 can request a node with 
"Java5 || Java6". I think that, for our current use cases, this will be 
sufficient.

Let me know what you think.

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a specific set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied to a partition (e.g., only the market team 
> has priority to use the partition).
> - Capacity percentages can be applied to a partition (e.g., the market team 
> has a 40% minimum capacity and the dev team has a 60% minimum capacity of 
> the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (Windows, Linux, etc.)
> With this, an application can ask for resources that satisfy 
> (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5646) Documentation for scheduling of OPPORTUNISTIC containers

2016-12-07 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730332#comment-15730332
 ] 

Konstantinos Karanasos commented on YARN-5646:
--

Thanks for the detailed feedback, [~templedf]!
I also got some offline feedback from [~curino] yesterday. 

I will incorporate your changes and upload a new version.

Regarding the min queue length and wait time, I will improve the description -- 
it is indeed not easy to understand what they do in their current form. These 
parameters mean "do not dequeue containers for load rebalancing purposes if the 
queue length is smaller than X tasks (or the wait time is smaller than X 
seconds)". So for queues shorter than that, we simply don't take any action.

Following Carlo's suggestion as well, I will raise a JIRA to simplify some of 
the properties related to opportunistic containers, including the incremental 
one. For instance, I don't think there will be many cases where we want the 
min/max opportunistic container size to differ from the guaranteed one.
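
For illustration, the rebalancing-threshold semantics described above could 
look like the following sketch (placeholder names, not the actual 
YarnConfiguration keys or NM code):

{code:java}
// Illustrative sketch: skip load rebalancing for NM queues that are
// shorter than a minimum length or waiting less than a minimum time.
public class RebalanceThreshold {
  private final int minQueueLength;  // minimum queue length, in tasks
  private final long minWaitTimeMs;  // minimum estimated wait time, in ms

  public RebalanceThreshold(int minQueueLength, long minWaitTimeMs) {
    this.minQueueLength = minQueueLength;
    this.minWaitTimeMs = minWaitTimeMs;
  }

  // Containers are dequeued for rebalancing only above both thresholds.
  public boolean shouldRebalance(int queueLength, long estimatedWaitMs) {
    return queueLength > minQueueLength && estimatedWaitMs > minWaitTimeMs;
  }
}
{code}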

> Documentation for scheduling of OPPORTUNISTIC containers
> 
>
> Key: YARN-5646
> URL: https://issues.apache.org/jira/browse/YARN-5646
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5646.001.patch
>
>
> This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
> containers.
> It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
> scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3409) Add constraint node labels

2016-11-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707465#comment-15707465
 ] 

Konstantinos Karanasos commented on YARN-3409:
--

Hi guys, thank you for driving this, for the documentation, and for all the 
discussion. I think it is a super-useful feature to have.
We also built a prototype over the summer with the end goal of supporting 
placement constraints (of the form affinity/anti-affinity/cardinality, similar 
in spirit to YARN-1042), and had to add some initial support for similar node 
labels along the way.
Please find some comments below.

# As was mentioned above in one of [~leftnoteasy]'s comments too, I also 
(strongly :)) suggest using these ConstraintNodeLabels in the context of 
YARN-1042 for supporting (anti-)affinity constraints as well. I think it will 
avoid duplicate effort and greatly simplify the code.
# On a similar note, can these ConstraintNodeLabels be added/removed 
dynamically? For example, when a container starts its execution, it might 
carry some attributes (to be added -- I know such attributes cannot be 
specified at the moment). Those attributes would then be added to the node's 
labels for as long as the container is running. This can be useful for 
(anti-)affinity constraints. For instance, a task can add the label 
"HBase-master", and then another resource request can have a constraint of the 
form "don't put me on a node with an HBase-master label" (see the sketch after 
this comment). What do you think?
# A few people above mentioned that the naming of ConstraintNodeLabels might 
not be ideal. I think they look more like attributes (as in key-value pairs), 
so we might consider using a name that denotes that (labels sound to me more 
like something that exists or not, but does not have a value).
# I like that you don't take headroom into account when it comes to constraint 
label expressions.
# +1 for Option 1. It might also be that the implementation of 
ConstraintNodeLabels will be easier in some places than that of 
NodeLabels/Partitions (e.g., given there is no need to support headroom). 
In terms of logistics, +1 for the branch too. I think we should make this an 
umbrella JIRA.
# Can you please give an example of a cluster-level constraint?
# bq. Constraints will be matched within the scope of a node partition.
Making sure I understand: why do we need this restriction? I think the two are 
orthogonal, right? Unless you mean that if the user specifies a constraint, it 
has to be taken into account too, which I understand.
# Also adding one last thing we did in our prototype that I think is related to 
the locality (node/rack/any) discussion above and might be useful to consider. 
We assumed that the ConstraintNodeLabels follow the hierarchy of the cluster. 
That is, a rack inherits the ConstraintNodeLabels of all its nodes. A detail 
here is that we considered only existential ConstraintNodeLabels (as I 
mentioned above, without values), which avoids conflicts. In the more general 
case you are describing, it is not clear what happens if one node of the rack 
has Java 6 and another has Java 7 (it is not clear what the label of the rack 
should be), so we would need to resolve conflicts in those cases. However, I 
think that design is quite powerful. Consider that eventually we could even 
define different logical classes of nodes and register them in the RM. For 
instance, group nodes that belong to the same upgrade domain (being upgraded 
at the same time -- we see this use case a lot in our clusters).
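
As a purely illustrative sketch of point 2 (all names hypothetical, not an 
existing YARN API): a node's label set grows while a container advertising a 
label is running, and an anti-affinity request is just a negated existential 
check against that set:

{code:java}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical dynamic label store for a single node.
public class DynamicNodeLabels {
  private final Set<String> labels =
      Collections.synchronizedSet(new HashSet<>());

  // Called when a container advertising a label starts or finishes.
  public void onContainerStart(String label) { labels.add(label); }
  public void onContainerFinish(String label) { labels.remove(label); }

  // Anti-affinity: "don't put me on a node carrying this label".
  public boolean satisfiesAntiAffinity(String label) {
    return !labels.contains(label);
  }
}
{code}

While an "HBase-master" container runs, 
{{satisfiesAntiAffinity("HBase-master")}} returns false, so the constrained 
request avoids that node.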

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a specific set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied to a partition (e.g., only the market team 
> has priority to use the partition).
> - Capacity percentages can be applied to a partition (e.g., the market team 
> has a 40% minimum capacity and the dev team has a 60% minimum capacity of 
> the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (Windows, Linux, etc.)
> With this, an application can ask for resources that satisfy 
> (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).



--

[jira] [Commented] (YARN-5886) Dynamically prioritize execution of opportunistic containers (NM queue reordering)

2016-11-29 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706181#comment-15706181
 ] 

Konstantinos Karanasos commented on YARN-5886:
--

Thank you for the feedback, [~cxcw].

bq. Also, Microsoft published a paper at ATC about this feature. Here are some 
of my concerns.
Which paper are you referring to? We had a paper at EuroSys 2016 ("Efficient 
Queue Management for Cluster Scheduling"), in which we investigate different 
queue reordering strategies, along with other techniques for efficient queue 
management (queue sizing, placement, etc.). We called the system Yaq (we had 
both a centralized and a distributed scheduling version). Is that the paper 
you meant?
Many of the techniques we are planning to add here will originate from Yaq.

bq. 1. How does the local NM ContainerScheduler coordinate with the global 
scheduler, since the global scheduler will try to maintain fairness and 
guaranteed shares across applications (queues)?
The way we are planning to do this is by letting the global scheduler send the 
tasks to the nodes. The reordering then happens only locally at each node (and 
only for the tasks that are queued at that moment). Note that reordering is 
done only for opportunistic containers (guaranteed containers are not allowed 
to be queued). This way we are not affecting the fairness guarantees of 
guaranteed containers.
If we want fairness across opportunistic containers, we will need some 
additional techniques (we did this through a timeout in the EuroSys paper).
Does this make sense, or did you have something else in mind?
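
For illustration, NM-local reordering of the opportunistic queue could be as 
simple as the following sketch (hypothetical types; the actual 
{{ContainerScheduler}} interfaces may differ), here ordering by job progress:

{code:java}
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical descriptor of a queued opportunistic container,
// carrying the progress [0, 1] of the job it belongs to.
class QueuedContainer {
  final String containerId;
  final double jobProgress;
  QueuedContainer(String containerId, double jobProgress) {
    this.containerId = containerId;
    this.jobProgress = jobProgress;
  }
}

// Reorder only the opportunistic queue: jobs closest to completion first.
public class ProgressBasedReordering {
  public static Queue<QueuedContainer> reorder(Iterable<QueuedContainer> queued) {
    Queue<QueuedContainer> ordered = new PriorityQueue<>(
        Comparator.comparingDouble((QueuedContainer c) -> c.jobProgress)
            .reversed());
    for (QueuedContainer c : queued) {
      ordered.add(c);
    }
    return ordered;
  }
}
{code}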

bq. 2. The NodeManager may not know (or estimate) the runtime of a queued 
container. A false estimation (mistaking a long-running container for a 
short-running one) may cause serious problems (priority inversion?).
That is a good point. In the initial strategies, we are planning not to take 
the task duration into account (because it might not always be available, or 
might be imprecise, as you say). One alternative is to take into account the 
progress of the job, in terms of tasks completed. Later, if we introduce task 
durations, we can have even better strategies, but we will have to make sure 
we are robust against mis-estimations.

> Dynamically prioritize execution of opportunistic containers (NM queue 
> reordering)
> --
>
> Key: YARN-5886
> URL: https://issues.apache.org/jira/browse/YARN-5886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> Currently the {{ContainerScheduler}} in the NM picks the next queued 
> opportunistic container to be executed in a FIFO manner. That is, containers 
> are executed in their order of arrival at the NM.
> This JIRA proposes to add pluggable queue reordering strategies at the NM 
> that will dynamically determine which opportunistic container will be 
> executed next.
> For example, we can choose to prioritize containers that belong to jobs which 
> are closer to completion, or containers that are short-running (if such 
> information is available).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5646) Documentation for scheduling of OPPORTUNISTIC containers

2016-11-18 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677692#comment-15677692
 ] 

Konstantinos Karanasos commented on YARN-5646:
--

I understand the concern about future work. I added it on purpose, so that 
people who read the documentation can get an idea of the open items (and even 
contribute to them).
But if you all think it's not suitable, I can remove it.

[~kasha], I also included the motivation for over-commitment through 
opportunistic containers, but made clear in the text that we do not yet support 
it. Once over-commitment is also available, we will update the document.

> Documentation for scheduling of OPPORTUNISTIC containers
> 
>
> Key: YARN-5646
> URL: https://issues.apache.org/jira/browse/YARN-5646
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5646.001.patch
>
>
> This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
> containers.
> It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
> scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5646) Documentation for scheduling of OPPORTUNISTIC containers

2016-11-17 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675677#comment-15675677
 ] 

Konstantinos Karanasos commented on YARN-5646:
--

Let's please wait for a few days before committing this.

> Documentation for scheduling of OPPORTUNISTIC containers
> 
>
> Key: YARN-5646
> URL: https://issues.apache.org/jira/browse/YARN-5646
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5646.001.patch
>
>
> This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
> containers.
> It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
> scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5646) Documentation for scheduling of OPPORTUNISTIC containers

2016-11-17 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5646:
-
Attachment: YARN-5646.001.patch

Attaching documentation.

> Documentation for scheduling of OPPORTUNISTIC containers
> 
>
> Key: YARN-5646
> URL: https://issues.apache.org/jira/browse/YARN-5646
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5646.001.patch
>
>
> This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
> containers.
> It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
> scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668916#comment-15668916
 ] 

Konstantinos Karanasos commented on YARN-1593:
--

Thanks for starting this! As [~asuresh] and [~hrsharma] pointed out, this is 
very related to the container pooling we have been thinking of, so it's great 
to see more work in this direction.

Here are some first thoughts:
- There seems to be a common need for containers that do not belong to an AM. I 
like your analysis of the pros and cons of the three approaches. Ideally, 
and if possible, it would be good to agree on an approach that is not hybrid, 
i.e., to not have some containers going through option (1) and some others 
through option (3), but rather have a unified approach. In container pooling we 
have thought of having a component in the RM that manages how many "system" 
containers will be running at each node, but we are willing to adopt another 
approach if it is more suitable.
- Looking both at your document and at the comments above, it seems that no 
approach can properly tackle the dependencies problem. We should probably solve 
this in the scheduler: just like there will be support for (anti-)affinity 
constraints, we can add support for dependencies in the scheduler, e.g., to not 
schedule a container on a node before a shuffle container is running on that 
node.
- Although I like your proposal of using a new ExecutionType for the system 
containers, I am not sure it is always desirable to couple system containers 
with the highest priority ExecutionType. For instance, there can be system 
containers that are not as important and can be preempted to make space if 
needed. Also, apart from the execution priority, I am not sure if the 
ExecutionType should determine whether a container should be automatically 
relaunched. If we end up having a component managing those containers, maybe it 
is its role to determine if they get restarted upon failure (irrespective of 
their ExecutionType).

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as the ShuffleHandler currently run in the same 
> process as the NM. There are some benefits to hosting them in dedicated 
> processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If the ShuffleHandler runs as a separate process, it 
> can continue to run during the NM restart, and the NM can reconnect to the 
> running ShuffleHandler after restarting.
> 2. Resource management. It is possible that other types of AuxiliaryServices 
> will be implemented. AuxiliaryServices are considered YARN application 
> specific and could consume lots of resources. Running AuxiliaryServices in 
> separate processes allows easier resource management. The NM could 
> potentially stop a specific AuxiliaryService process from running if it 
> consumes resources way above its allocation.
> Here are some high level ideas:
> 1. The NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes 
> will continue to run. The NM can reconnect to the running AuxiliaryService 
> processes upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> an immediate need for this. An AuxiliaryService could run inside a container, 
> its resource utilization could be taken into account by the RM, and the RM 
> could detect when a specific type of application overutilizes cluster 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5887) Policies for choosing which opportunistic containers to kill

2016-11-15 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5887:


 Summary: Policies for choosing which opportunistic containers to 
kill
 Key: YARN-5887
 URL: https://issues.apache.org/jira/browse/YARN-5887
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


When a guaranteed container arrives at an NM but there are no resources to 
start its execution, opportunistic containers will be killed to make space for 
the guaranteed container.

At the moment, we kill opportunistic containers in reverse order of arrival 
(most recently started first). This is not always the right decision. For 
example, we might want to minimize the number of containers killed: to start a 
6GB container, we could kill one 6GB opportunistic container or three 2GB ones. 
Another example would be to refrain from killing containers of jobs that are 
very close to completion (we would have to pass job completion information to 
the NM in that case).
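
A minimal sketch of one such policy (hypothetical types, not the current NM 
code): greedily kill the largest opportunistic containers first, so that the 
fewest containers are sacrificed to free the required memory:

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical descriptor of a running opportunistic container.
class OppContainer {
  final String id;
  final long memoryMB;
  OppContainer(String id, long memoryMB) {
    this.id = id;
    this.memoryMB = memoryMB;
  }
}

public class MinKillCountPolicy {
  // Pick a minimal-count set of opportunistic containers whose combined
  // memory covers what the incoming guaranteed container needs.
  public static List<OppContainer> selectToKill(List<OppContainer> running,
      long requiredMB) {
    List<OppContainer> sorted = new ArrayList<>(running);
    sorted.sort(Comparator.comparingLong((OppContainer c) -> c.memoryMB)
        .reversed());
    List<OppContainer> victims = new ArrayList<>();
    long freed = 0;
    for (OppContainer c : sorted) {
      if (freed >= requiredMB) {
        break;
      }
      victims.add(c);
      freed += c.memoryMB;
    }
    return victims;
  }
}
{code}

With running opportunistic containers of 6GB, 2GB, 2GB, and 2GB and a 6GB 
guaranteed request, this selects the single 6GB container rather than the 
three 2GB ones.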



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5886) Dynamically prioritize execution of opportunistic containers (NM queue reordering)

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos reassigned YARN-5886:


Assignee: Konstantinos Karanasos

> Dynamically prioritize execution of opportunistic containers (NM queue 
> reordering)
> --
>
> Key: YARN-5886
> URL: https://issues.apache.org/jira/browse/YARN-5886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> Currently the {{ContainerScheduler}} in the NM picks the next queued 
> opportunistic container to be executed in a FIFO manner. That is, containers 
> are executed in their order of arrival at the NM.
> This JIRA proposes to add pluggable queue reordering strategies at the NM 
> that will dynamically determine which opportunistic container will be 
> executed next.
> For example, we can choose to prioritize containers that belong to jobs which 
> are closer to completion, or containers that are short-running (if such 
> information is available).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5886) Dynamically prioritize execution of opportunistic containers (NM queue reordering)

2016-11-15 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5886:


 Summary: Dynamically prioritize execution of opportunistic 
containers (NM queue reordering)
 Key: YARN-5886
 URL: https://issues.apache.org/jira/browse/YARN-5886
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


Currently the {{ContainerScheduler}} in the NM picks the next queued 
opportunistic container to be executed in a FIFO manner. That is, containers 
are executed in their order of arrival at the NM.

This JIRA proposes to add pluggable queue reordering strategies at the NM that 
will dynamically determine which opportunistic container will be executed next 
(see the sketch below).
For example, we can choose to prioritize containers that belong to jobs which 
are closer to completion, or containers that are short-running (if such 
information is available).
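
A minimal sketch of what such a pluggable strategy could look like (purely 
illustrative; the actual interface is to be defined in this JIRA):

{code:java}
import java.util.List;

// Illustrative only: a pluggable policy deciding which queued
// opportunistic container the NM should launch next.
public interface QueueReorderingStrategy<C> {
  // Given the currently queued opportunistic containers, return the
  // index of the one to execute next (FIFO would always return 0).
  int pickNext(List<C> queuedOpportunisticContainers);
}
{code}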



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4972) Cleanup ContainerScheduler tests to remove long sleep times

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4972:
-
Parent Issue: YARN-5541  (was: YARN-4742)

> Cleanup ContainerScheduler tests to remove long sleep times
> ---
>
> Key: YARN-4972
> URL: https://issues.apache.org/jira/browse/YARN-4972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4972) Cleanup ContainerScheduler tests to remove long sleep times

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4972:
-
Summary: Cleanup ContainerScheduler tests to remove long sleep times  (was: 
Cleanup QueuingContainerManager tests to remove long sleep times)

> Cleanup ContainerScheduler tests to remove long sleep times
> ---
>
> Key: YARN-4972
> URL: https://issues.apache.org/jira/browse/YARN-4972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2889) Limit in the number of opportunistic container requests per AM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2889:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Limit in the number of opportunistic container requests per AM
> --
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
>
> We introduce a way to limit the number of queueable requests that each AM can 
> submit to the LocalRM.
> This way we can restrict the number of queueable containers handed out by the 
> system, as well as throttle down misbehaving AMs (asking for too many 
> queueable containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2889) Limit in the number of opportunistic container requests per AM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2889:
-
Summary: Limit in the number of opportunistic container requests per AM  
(was: Limit in the number of queueable container requests per AM)

> Limit in the number of opportunistic container requests per AM
> --
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
>
> We introduce a way to limit the number of queueable requests that each AM can 
> submit to the LocalRM.
> This way we can restrict the number of queueable containers handed out by the 
> system, as well as throttle down misbehaving AMs (asking for too many 
> queueable containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5216) Expose configurable preemption policy for OPPORTUNISTIC containers running on the NM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5216:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-5541

> Expose configurable preemption policy for OPPORTUNISTIC containers running on 
> the NM
> 
>
> Key: YARN-5216
> URL: https://issues.apache.org/jira/browse/YARN-5216
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: distributed-scheduling
>Reporter: Arun Suresh
>Assignee: Hitesh Sharma
>  Labels: oct16-hard
> Attachments: YARN5216.001.patch, yarn5216.002.patch
>
>
> Currently, the default action taken by the QueuingContainerManager, 
> introduced in YARN-2883, when a GUARANTEED Container is scheduled on an NM 
> with OPPORTUNISTIC containers using up resources, is to KILL the running 
> OPPORTUNISTIC containers.
> This JIRA proposes to expose a configurable hook to allow the NM to take a 
> different action.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5415) Add support for NodeLocal and RackLocal OPPORTUNISTIC requests

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5415:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Add support for NodeLocal and RackLocal OPPORTUNISTIC requests
> --
>
> Key: YARN-5415
> URL: https://issues.apache.org/jira/browse/YARN-5415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
>
> Currently, the Distributed Scheduling framework only supports 
> ResourceRequests with the *ANY* resource name and additionally requires that 
> the resource requests have relaxLocality turned on.
> This JIRA seeks to add support for node-local and rack-local allocations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2886) Estimating waiting time in NM container queues

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2886:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Estimating waiting time in NM container queues
> --
>
> Key: YARN-2886
> URL: https://issues.apache.org/jira/browse/YARN-2886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> This JIRA is about estimating the waiting time of each NM queue.
> Having these estimates is crucial for the distributed scheduling of container 
> requests, as it allows the LocalRM to decide in which NMs to queue the 
> queueable container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5414) Integrate NodeQueueLoadMonitor with ClusterNodeTracker

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5414:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Integrate NodeQueueLoadMonitor with ClusterNodeTracker
> --
>
> Key: YARN-5414
> URL: https://issues.apache.org/jira/browse/YARN-5414
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: container-queuing, distributed-scheduling, scheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> The {{ClusterNodeTracker}} tracks the states of clusterNodes and provides 
> convenience methods like sort and filter.
> The {{NodeQueueLoadMonitor}} should use the {{ClusterNodeTracker}} instead of 
> maintaining its own data-structure of node information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5688) Make allocation of opportunistic containers asynchronous

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5688:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5542

> Make allocation of opportunistic containers asynchronous
> 
>
> Key: YARN-5688
> URL: https://issues.apache.org/jira/browse/YARN-5688
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> In the current implementation of the 
> {{OpportunisticContainerAllocatorAMService}}, we synchronously perform the 
> allocation of opportunistic containers. This results in "blocking" the 
> service at the RM when scheduling the opportunistic containers.
> The {{OpportunisticContainerAllocator}} should instead run asynchronously as 
> a separate thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-14 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666401#comment-15666401
 ] 

Konstantinos Karanasos commented on YARN-4597:
--

Thanks for the updated patch, [~asuresh]. Looks good to me.
Some final comments below... All are minor, so it's up to you whether to 
address them (I would only "insist" on the last one).

- In the {{ContainerScheduler}}:
-- In the comment for the runningContainers, let's mention that these are the 
running containers, including the containers that are in the process of 
transitioning from the SCHEDULED to the RUNNING state. I think the rest are 
details that might be confusing.
-- In {{updateQueuingLimit}}, you can add an extra check of the form {{if 
(this.queuingLimit.getMaxQueueLength() < 
queuedOpportunisticContainers.size())}} to avoid triggering the shedding logic 
when the queue is already within the limit (see the sketch after these 
comments). This might often be the case if the NM has imposed a small queue 
size.
-- I was thinking that, although less likely than before, the fields of the 
{{OpportunisticContainersStatus()}} can still be updated during the 
{{getOpportunisticContainersStatus()}}. To avoid synchronization, we could set 
the fields using an event, and then in the 
{{getOpportunisticContainersStatus()}} we would just return the object. But if 
you think it is too much, we can leave it as is.
-- In the {{onContainerCompleted}}, a container can belong to only one of 
queued guaranteed, queued opportunistic, or running. So, once it is found in 
one of these lists, you can avoid removing it from the rest.

- In the {{YarnConfiguration}}, let's include in a comment that the max queue 
length coming from the RM is the global maximum queue length.

- In the {{SchedulerNode}}, I still suggest putting the {{++numContainers}} and 
the {{--numContainers}} inside the if statements. If I remember correctly, 
these fields are used for the web UI, so there will be a disconnect between the 
resources used (referring only to guaranteed containers) and the number of 
containers (referring to both guaranteed and opportunistic at the moment). The 
stats for the opportunistic containers are already carried by the 
opportunisticContainersStatus, so we are good with reporting them too.
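
To make the {{updateQueuingLimit}} suggestion above concrete, here is a rough 
sketch of the guard I have in mind (field and method names are assumed from 
the current patch, so treat this as illustrative only, not exact code):

{code:java}
// Illustrative sketch only: queuingLimit, queuedOpportunisticContainers and
// shedQueuedOpportunisticContainers() are assumed from the current patch.
private void updateQueuingLimit(ContainerQueuingLimit limit) {
  this.queuingLimit = limit;
  // Invoke shedding only when the queue actually exceeds the allowed
  // maximum; skip the pass entirely for short queues (e.g., when the NM
  // imposes a small queue size).
  if (this.queuingLimit.getMaxQueueLength() >= 0
      && this.queuingLimit.getMaxQueueLength()
          < queuedOpportunisticContainers.size()) {
    shedQueuedOpportunisticContainers();
  }
}
{code}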

Again, all comments are minor. +1 for the patch and thanks for all the work!

> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, 
> YARN-4597.012.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-11 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658402#comment-15658402
 ] 

Konstantinos Karanasos commented on YARN-4597:
--

Here are some comments on the {{ContainerScheduler}}:

- {{queuedOpportunisticContainers}} will have concurrency issues. We are 
updating it when containers arrive but also in 
{{shedQueuedOpportunisticContainers}}.

- {{queuedGuaranteedContainers}} and {{queuedOpportunisticContainers}}: I think 
we should use queues. I don't think we retrieve the containers by key anywhere 
anyway.

- {{oppContainersMarkedForKill}}: could be a Set, right?

- {{scheduledToRunContainers}} are containers that are either already running 
or going to run very soon (transitioning from the SCHEDULED to the RUNNING 
state). The name is a bit misleading, because it sounds like they are only the 
ones belonging to the second category. I would rather call it 
{{runningContainers}} and specify in a comment that they might not be running 
at this very moment but will be very soon.

- In the {{onContainerCompleted()}}, the 
{{scheduledToRunContainers.remove(container.getContainerId())}} and the 
{{startPendingContainers()}} calls can go inside the if statement above them. 
If the container was not running and no resources were freed up, we don't need 
to call {{startPendingContainers()}}.

- Fields of the {{opportunisticContainersStatus}} are set in different places. 
Because of that, when we call {{getOpportunisticContainersStatus()}} we may see 
an inconsistent object. Let's set the fields only in the 
{{getOpportunisticContainersStatus()}}.

- Line 252: indeed, we can now do extraOpportContainersToKill -> 
opportContainersToKill, as Karthik mentioned in a comment.

- line 87: increase -> increases

- {{shedQueuedOpportunisticContainers}}: 
-- {{numAllowed}} is the number of containers allowed to remain in the queue, 
yet we are killing numAllowed containers. In other words, we should kill 
{{queuedOpportunisticContainers.size() - numAllowed}} containers, not 
numAllowed (see the sketch after this list).
-- "Container Killed to make room for Guaranteed Container." -> "Container 
killed to meet NM queuing limits". Instead of killed, you can also say 
de-queued.
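
To make the counting issue concrete, here is a minimal sketch of what I would 
expect the shedding loop to look like (the collection and the kill/de-queue 
helper are assumptions on my side, not the exact patch code):

{code:java}
// Illustrative sketch: numAllowed is how many opportunistic containers may
// REMAIN queued, so the number to de-queue is the excess over that limit.
// queuedOpportunisticContainers and killQueuedContainer() are assumed names.
private void shedQueuedOpportunisticContainers(int numAllowed) {
  int numToShed = queuedOpportunisticContainers.size() - numAllowed;
  Iterator<Container> it = queuedOpportunisticContainers.iterator();
  while (numToShed > 0 && it.hasNext()) {
    Container container = it.next();
    it.remove();
    // De-queue (kill) the container with the suggested diagnostic message.
    killQueuedContainer(container,
        "Container de-queued to meet NM queuing limits.");
    numToShed--;
  }
}
{code}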


> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-10 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655975#comment-15655975
 ] 

Konstantinos Karanasos commented on YARN-4597:
--

Thanks for working on this, [~asuresh]! I am sending some first comments. I 
have not yet looked at the {{ContainerScheduler}} -- I will do that tomorrow.

- The {{Container}} has two new methods ({{sendLaunchEvent}} and 
{{sendKillEvent}}), which are public and are not following the design of the 
rest of the code that keeps such methods private and calls them through 
transitions in the {{ContainerImpl}}. Let's try to use the existing design if 
possible.

- In {{RMNodeImpl}}:
-- Instead of using the {{launchedContainers}} for both the launched and the 
queued containers, we might want to split it into two: one for the launched and 
one for the queued containers.
-- I think we should not add opportunistic containers to the 
{{launchedContainers}}. If we do, they will be added to the 
{{newlyLaunchedContainers}}, then to the {{nodeUpdateQueue}}, and, if I am not 
wrong, they will be propagated to the schedulers of the guaranteed containers, 
which will create problems. I have to look at it a bit more, but my hunch is 
that we should avoid doing it. Even if it does not affect the resource 
accounting, I don't see any advantage in adding them.

- In the {{OpportunisticContainerAllocatorAMService}} we are now calling 
{{SchedulerNode::allocate}}, but then we do not update the used resources, 
while we do update some other counters, which leads to inconsistencies. For 
example, when releasing a container, I think we are currently not calling the 
release on the {{SchedulerNode}}, so the container count will become 
inconsistent.
-- Instead, I suggest adding counters for opportunistic containers to the 
{{SchedulerNode}}, both for the number of containers and for the resources 
used (see the sketch below this list). In that case, we need to make sure that 
those resources are released too.

- Maybe as part of a different JIRA, we should at some point extend the 
{{container.metrics}} in the {{ContainerImpl}} to keep track of the 
scheduled/queued containers.
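
To illustrate the {{SchedulerNode}} suggestion above, here is a rough sketch 
of the bookkeeping I mean; the field and method names are hypothetical, only 
the symmetry between allocate and release is the point:

{code:java}
// Hypothetical sketch: track opportunistic containers separately in
// SchedulerNode so that allocation and release stay symmetric.
private int numOpportunisticContainers = 0;
private Resource opportunisticResourceUsed = Resource.newInstance(0, 0);

public synchronized void allocateOpportunistic(Resource resource) {
  ++numOpportunisticContainers;
  Resources.addTo(opportunisticResourceUsed, resource);
}

public synchronized void releaseOpportunistic(Resource resource) {
  // Must be called on every container release, or the counters drift.
  --numOpportunisticContainers;
  Resources.subtractFrom(opportunisticResourceUsed, resource);
}
{code}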

h6. Nits:
- There seem to be two redundant parameters in {{YarnConfiguration}} at the 
moment: {{NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH}} and 
{{NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH}}. If I am not missing 
something, we should keep only one of the two.
- {{yarn-default.xml}}: numbed -> number (in a comment)
- {{TestNodeManagerResync}}: I think it is better to use one of the existing 
methods for waiting to get to the RUNNING state.
- In {{Container}}/{{ContainerImpl}} and all the associated classes, I suggest 
renaming {{isMarkedToKill}} to {{isMarkedForKilling}}. I know it is minor, but 
it is more self-explanatory.

I will send more comments once I check the {{ContainerScheduler}}. 
Also, let's stress-test the code in a cluster before committing to make sure 
everything is good. I can help with that.


> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-09 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652246#comment-15652246
 ] 

Konstantinos Karanasos commented on YARN-4597:
--

Hi [~jianhe]. Yes, I will check the patch today.

> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Commented] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-09 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651784#comment-15651784
 ] 

Konstantinos Karanasos commented on YARN-5823:
--

Thanks for the review and the commit, [~asuresh]!

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch, 
> YARN-5823.003.patch, YARN-5823.004.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Commented] (YARN-5833) Add validation to ensure default ports are unique in Configuration

2016-11-09 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651781#comment-15651781
 ] 

Konstantinos Karanasos commented on YARN-5833:
--

Thanks [~asuresh], indeed those tests were unfortunately not kicked off by 
Jenkins...

> Add validation to ensure default ports are unique in Configuration
> --
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5833.003.addendum-2.patch, 
> YARN-5833.003.addendum.patch, YARN-5833.003.patch, YARN-5883.001.patch, 
> YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Updated] (YARN-5833) Add validation to ensure default ports are unique in Configuration

2016-11-08 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5833:
-
Attachment: YARN-5833.003.addendum.patch

Thanks for catching this, [~liuml07].
We had compiled it with Java 8. 
Attaching addendum patch that fixes the problem.

> Add validation to ensure default ports are unique in Configuration
> --
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5833.003.addendum.patch, YARN-5833.003.patch, 
> YARN-5883.001.patch, YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Commented] (YARN-5833) Add validation to ensure default ports are unique in Configuration

2016-11-08 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649277#comment-15649277
 ] 

Konstantinos Karanasos commented on YARN-5833:
--

[~liuml07], I have checked it on trunk. What error are you getting on branch-2?

> Add validation to ensure default ports are unique in Configuration
> --
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5833.003.patch, YARN-5883.001.patch, 
> YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Commented] (YARN-5833) Add validation to ensure default ports are unique in Configuration

2016-11-08 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649096#comment-15649096
 ] 

Konstantinos Karanasos commented on YARN-5833:
--

Thanks for reviewing and committing the patch, [~subru]!

> Add validation to ensure default ports are unique in Configuration
> --
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5833.003.patch, YARN-5883.001.patch, 
> YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-08 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: YARN-5823.004.patch

Attaching the right patch.

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch, 
> YARN-5823.003.patch, YARN-5823.004.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-08 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: (was: YARN-5823.004.patch)

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch, 
> YARN-5823.003.patch, YARN-5823.004.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-08 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: YARN-5823.004.patch

Rebasing against trunk and fixing the failing test cases.

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch, 
> YARN-5823.003.patch, YARN-5823.004.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: YARN-5823.003.patch

Adding new version of the patch, in which I am calling {{pullNMTokens()}} only 
once, following [~asuresh]'s suggestion.

I also included a new test in {{TestOpportunisticContainerAllocation}}, which 
would fail without the present patch.

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch, 
> YARN-5823.003.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5833) Change default port for AMRMProxy

2016-11-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5833:
-
Attachment: YARN-5833.003.patch

Attaching new version of the patch in which I fixed the checkstyle issue.

Also, I ran the new test with the previous parameter of the AMRMProxy port, 
which was causing a collision, and got the following output:
{noformat}
java.lang.AssertionError: Parameters DEFAULT_AMRM_PROXY_PORT and 
DEFAULT_NM_COLLECTOR_SERVICE_PORT are using the same default value!
{noformat}

With the port change, the test passes successfully.

> Change default port for AMRMProxy
> -
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5833.003.patch, YARN-5883.001.patch, 
> YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Updated] (YARN-5833) Change default port for AMRMProxy

2016-11-07 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5833:
-
Attachment: YARN-5883.002.patch

Thanks for the feedback, [~subru].

I added a new test method in {{TestConfigurationFieldsBase}} that checks for 
collisions between default values.
Each subclass of {{TestConfigurationFieldsBase}} can specify a set of filter 
strings. The new method then goes over each of these filters and makes sure 
that there is no collision between the default values of the parameters that 
contain the filter in their name.
The {{TestYarnConfigurationFields}} initialize method adds the "_PORT" filter 
to the filter set to check for default port collisions.
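
In case it helps the review, the core of the new check is along these lines (a 
simplified sketch; {{filters}} and {{defaultsByParamName}} stand in for the 
actual fields in the patch, which may be named differently):

{code:java}
// Simplified sketch: for each filter (e.g. "_PORT"), group the matching
// parameters by their default value and fail on the first duplicate.
for (String filter : filters) {
  Map<String, String> paramByDefault = new HashMap<>();
  for (Map.Entry<String, String> e : defaultsByParamName.entrySet()) {
    if (!e.getKey().contains(filter)) {
      continue;
    }
    // put() returns the previously mapped parameter on a collision.
    String clashing = paramByDefault.put(e.getValue(), e.getKey());
    assertNull("Parameters " + clashing + " and " + e.getKey()
        + " are using the same default value!", clashing);
  }
}
{code}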

At the moment, the method that adds the filters in 
{{TestYarnConfigurationFields}} is private. Let me know if you think it is 
better to move it to the base class and have the YARN-specific one override it.

> Change default port for AMRMProxy
> -
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5883.001.patch, YARN-5883.002.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Assigned] (YARN-5688) Make allocation of opportunistic containers asynchronous

2016-11-04 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos reassigned YARN-5688:


Assignee: Konstantinos Karanasos

> Make allocation of opportunistic containers asynchronous
> 
>
> Key: YARN-5688
> URL: https://issues.apache.org/jira/browse/YARN-5688
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> In the current implementation of the 
> {{OpportunisticContainerAllocatorAMService}}, we synchronously perform the 
> allocation of opportunistic containers. This results in "blocking" the 
> service at the RM when scheduling the opportunistic containers.
> The {{OpportunisticContainerAllocator}} should instead asynchronously run as 
> a separate thread.






[jira] [Commented] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-04 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637295#comment-15637295
 ] 

Konstantinos Karanasos commented on YARN-5823:
--

Thanks for checking the patch, [~asuresh].
I like what you propose, but the problem is that the opportunistic allocation 
would have to happen strictly before the guaranteed allocation for that to 
work.
In the current patch, I am doing the guaranteed allocation first, since it is 
non-blocking.
As you say, as part of YARN-5688 we should revisit the order of steps once both 
the guaranteed and the opportunistic allocations are asynchronous.
Does that make sense?

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5833) Change default port for AMRMProxy

2016-11-03 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5833:
-
Attachment: YARN-5883.001.patch

Attaching patch.

> Change default port for AMRMProxy
> -
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5883.001.patch
>
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-03 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: YARN-5823.002.patch

Attaching new patch -- fixing findbugs and checkstyle issues.

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-5823.001.patch, YARN-5823.002.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Comment Edited] (YARN-5377) TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in trunk

2016-11-03 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634701#comment-15634701
 ] 

Konstantinos Karanasos edited comment on YARN-5377 at 11/3/16 11:59 PM:


The problem with the test is that the container was moving fast from the DONE 
to the CONTAINER_CLEANUP_AFTER_KILL state, and the DONE state was not observed 
by the {{waitForNMContainerState}} method of the {{BaseContainerManagerTest}}.

I added a new {{waitForNMContainerState}} method that takes as input a list of 
final container states, instead of a single one like before. When any of the 
states of this list is reached, the {{waitForNMContainerState}} exits 
successfully.


was (Author: kkaranasos):
The problem with the test is that the container was moving fast from the DONE 
to the CONTAINER_CLEANUP_AFTER_KILL, and the DONE state was not observed by the 
{{waitForNMContainerState}} method of the {{BaseContainerManagerTest}}.

I added a new {{waitForNMContainerState}} method that takes as input a list of 
final container states, instead of a single one like before. When any of the 
states of this list is reached, the {{waitForNMContainerState}} exits 
successfully.

> TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in 
> trunk
> --
>
> Key: YARN-5377
> URL: https://issues.apache.org/jira/browse/YARN-5377
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5377.001.patch
>
>
> Test case fails jenkin build 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/12228/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt]
> {noformat}
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 134.586 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
> testKillMultipleOpportunisticContainers(org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager)
>   Time elapsed: 32.134 sec  <<< FAILURE!
> java.lang.AssertionError: ContainerState is not correct (timedout) 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:363)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager.testKillMultipleOpportunisticContainers(TestQueuingContainerManager.java:470)
> {noformat}






[jira] [Updated] (YARN-5377) TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in trunk

2016-11-03 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5377:
-
Attachment: YARN-5377.001.patch

The problem with the test is that the container was moving fast from the DONE 
to the CONTAINER_CLEANUP_AFTER_KILL, and the DONE state was not observed by the 
{{waitForNMContainerState}} method of the {{BaseContainerManagerTest}}.

I added a new {{waitForNMContainerState}} method that takes as input a list of 
final container states, instead of a single one like before. When any of the 
states of this list is reached, the {{waitForNMContainerState}} exits 
successfully.
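
For clarity, the new method is essentially of the following shape (a 
simplified sketch; the exact signature and helper calls in the patch may 
differ slightly):

{code:java}
// Simplified sketch: poll the NM container state until ANY of the given
// final states is reached, or fail after the timeout.
public static void waitForNMContainerState(ContainerManagerImpl cm,
    ContainerId id, List<ContainerState> finalStates, int timeOutMax)
    throws Exception {
  ContainerState currentState = null;
  int timeWaited = 0;
  do {
    Container container = cm.getContext().getContainers().get(id);
    if (container != null) {
      currentState = container.getContainerState();
    }
    if (currentState != null && finalStates.contains(currentState)) {
      return;
    }
    Thread.sleep(1000);
    timeWaited++;
  } while (timeWaited < timeOutMax);
  Assert.fail("ContainerState is not correct (timedout)");
}
{code}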

> TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in 
> trunk
> --
>
> Key: YARN-5377
> URL: https://issues.apache.org/jira/browse/YARN-5377
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5377.001.patch
>
>
> Test case fails jenkin build 
> [link|https://builds.apache.org/job/PreCommit-YARN-Build/12228/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt]
> {noformat}
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 134.586 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
> testKillMultipleOpportunisticContainers(org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager)
>   Time elapsed: 32.134 sec  <<< FAILURE!
> java.lang.AssertionError: ContainerState is not correct (timedout) 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:363)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager.testKillMultipleOpportunisticContainers(TestQueuingContainerManager.java:470)
> {noformat}






[jira] [Created] (YARN-5833) Change default port for AMRMProxy

2016-11-03 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5833:


 Summary: Change default port for AMRMProxy
 Key: YARN-5833
 URL: https://issues.apache.org/jira/browse/YARN-5833
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


The default port for the AMRMProxy coincides with the one for the Collector 
Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Assigned] (YARN-5833) Change default port for AMRMProxy

2016-11-03 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos reassigned YARN-5833:


Assignee: Konstantinos Karanasos

> Change default port for AMRMProxy
> -
>
> Key: YARN-5833
> URL: https://issues.apache.org/jira/browse/YARN-5833
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> The default port for the AMRMProxy coincides with the one for the Collector 
> Service (port 8048). Will use a different port for the AMRMProxy.






[jira] [Commented] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-11-03 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634527#comment-15634527
 ] 

Konstantinos Karanasos commented on YARN-2995:
--

Regarding the remaining issues:
* The checkstyle issue is about a method that takes more than 7 parameters in 
one of the test classes. That was already the case for that method; I just 
added some more parameters.
* The javadoc issue, as I already explained in a comment above, is related to 
Java 8 complaining about using '_' as identifiers. This is used in multiple 
places in the Web UI classes, and should be treated in a separate JIRA.
* The unit test issue regarding {{TestQueuingContainerManager}} is unrelated to 
the present JIRA and is tracked in YARN-5377.
* There is a build issue in sls when running the corresponding tests, but it 
might be a Jenkins issue, since it builds fine for me locally. I just kicked 
off Jenkins again to see whether the problem persists.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
>Priority: Blocker
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch, YARN-2995.004.patch, all-nodes.png, all-nodes.png, 
> opp-container.png
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-02 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Attachment: YARN-5823.001.patch

Attaching patch.

> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5823.001.patch
>
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-02 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5823:
-
Description: 
At the moment, when an {{AllocateRequest}} contains only opportunistic 
{{ResourceRequests}}, the updated NMTokens are not properly added to the 
{{AllocateResponse}}.
In such a case the AM does not get back the needed NMTokens that are required 
to start the opportunistic containers at the respective nodes.

  was:
At the moment, when an {{AllocateRequest}} containers only opportunistic 
{{ResourceRequests}}, the updated NMTokens are not properly added to the 
{{AllocateResponse}}.
In such a case the AM does not get back the needed NMTokens that are required 
to start the opportunistic containers at the respective nodes.


> Update NMTokens in case of requests with only opportunistic containers
> --
>
> Key: YARN-5823
> URL: https://issues.apache.org/jira/browse/YARN-5823
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> At the moment, when an {{AllocateRequest}} contains only opportunistic 
> {{ResourceRequests}}, the updated NMTokens are not properly added to the 
> {{AllocateResponse}}.
> In such a case the AM does not get back the needed NMTokens that are required 
> to start the opportunistic containers at the respective nodes.






[jira] [Created] (YARN-5823) Update NMTokens in case of requests with only opportunistic containers

2016-11-02 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5823:


 Summary: Update NMTokens in case of requests with only 
opportunistic containers
 Key: YARN-5823
 URL: https://issues.apache.org/jira/browse/YARN-5823
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


At the moment, when an {{AllocateRequest}} containers only opportunistic 
{{ResourceRequests}}, the updated NMTokens are not properly added to the 
{{AllocateResponse}}.
In such a case the AM does not get back the needed NMTokens that are required 
to start the opportunistic containers at the respective nodes.






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-11-02 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: YARN-2995.004.patch

Adding new version of the patch.
Rebased against trunk, fixed some more issues, and addressed the unit test 
failures.

Note that there is a javadoc issue regarding the use of '_' as an identifier 
(related to Java 8). I did not fix that, because '_' is actually used in 
multiple places in the Web UI classes, and I followed the same style as the 
rest of the code. I assume this should be fixed in all places at some point.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch, YARN-2995.004.patch, all-nodes.png, all-nodes.png, 
> opp-container.png
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-11-02 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: all-nodes.png

Attaching new screenshot after some final fixes.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch, all-nodes.png, all-nodes.png, opp-container.png
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-31 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: opp-container.png
all-nodes.png

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch, all-nodes.png, opp-container.png
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-31 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: (was: all-nodes.png)

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-31 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: (was: opp-container.png)

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-31 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: opp-container.png
all-nodes.png

Attaching two screenshots. The first is from the nodes page, showing an 
instance of the cluster with both guaranteed and opportunistic containers 
running, as well as some additional containers queued at the node.
The second shows the details of a specific container, where the execution type 
is added ("OPPORTUNISTIC" in this specific case).

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch, all-nodes.png, opp-container.png
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-31 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: YARN-2995.003.patch

Adding new version of the patch.
Fixed some more problems and the checkstyle issues, and added the execution 
type information on the container's page.
The unit test that was failing looks unrelated.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch, 
> YARN-2995.003.patch
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Commented] (YARN-3645) ResourceManager can't start success if attribute value of "aclSubmitApps" is null in fair-scheduler.xml

2016-10-31 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623286#comment-15623286
 ] 

Konstantinos Karanasos commented on YARN-3645:
--

Thanks for the new patch, [~gliptak].

I see there is one test failing. Can you please check if that is related?
Otherwise, the patch looks good to me.

bq. Elements "aclSubmitApps", "aclAdministerApps", "aclAdministerReservations", 
"aclListReservations", "aclSubmitReservations" do not call trim() in the 
current code. Are these also expected to call trim()?
[~kasha] If those properties should also call trim(), then we can push trim() 
inside the readFieldText() method to simplify the code.

Other than that, and after double-checking that the failing test is unrelated, 
let's commit the patch.

> ResourceManager can't start success if  attribute value of "aclSubmitApps" is 
> null in fair-scheduler.xml
> 
>
> Key: YARN-3645
> URL: https://issues.apache.org/jira/browse/YARN-3645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha2
>Reporter: zhoulinlin
>Assignee: Gabor Liptak
>  Labels: oct16-easy
> Attachments: YARN-3645.1.patch, YARN-3645.2.patch, YARN-3645.3.patch, 
> YARN-3645.4.patch, YARN-3645.5.patch, YARN-3645.patch
>
>
> The "aclSubmitApps" is configured in fair-scheduler.xml like below:
> 
> 
>  
> The resourcemanager log:
> {noformat}
> 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: 
> Service ResourceManager failed in state INITED; cause: 
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
> Caused by: java.io.IOException: Failed to initialize FairScheduler
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   ... 7 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
>   ... 9 more
> 2015-05-14 12:59:48,623 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning 
> to standby state
> 2015-05-14 12:59:48,623 INFO 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin 
> transitionToStandbyIn
> 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When 
> stopping the service ResourceManager : java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>   at 
> 

[jira] [Commented] (YARN-3645) ResourceManager can't start success if attribute value of "aclSubmitApps" is null in fair-scheduler.xml

2016-10-27 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613452#comment-15613452
 ] 

Konstantinos Karanasos commented on YARN-3645:
--

I just checked your patch, [~gliptak].
The checks you added seem useful, let's try to close this.

Can you please rebase the patch to current trunk?

Also, some additional comments:
# Since we are calling {{text.trim()}} in all cases, let's add it inside the 
{{readFieldText()}} method, i.e., you can do {{return 
firstChild.getData().trim()}}.
# Instead of passing the tag name to {{readFieldText()}} each time, you can use 
a single {{Element field}} parameter and call {{field.getTagName()}} inside 
{{readFieldText()}} to get the tag name (see the sketch below).
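
For concreteness, a minimal sketch of what the combined refactoring might look 
like; the helper body below is an assumption based on the comments above, not 
the actual patch:
{code:java}
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// Hypothetical sketch, not the committed code.
public class AllocationFieldReader {
  /**
   * Reads the trimmed text of an XML field. The null check guards against
   * empty elements such as <aclSubmitApps></aclSubmitApps>, which previously
   * triggered the NullPointerException in loadQueue().
   */
  static String readFieldText(Element field) {
    Text firstChild = (Text) field.getFirstChild();
    if (firstChild == null) {
      return null;
    }
    // trim() is applied once here; callers can use field.getTagName()
    // to dispatch on the tag instead of passing the name around.
    return firstChild.getData().trim();
  }
}
{code}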

> ResourceManager can't start success if  attribute value of "aclSubmitApps" is 
> null in fair-scheduler.xml
> 
>
> Key: YARN-3645
> URL: https://issues.apache.org/jira/browse/YARN-3645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.2
>Reporter: zhoulinlin
>Assignee: Gabor Liptak
>  Labels: oct16-easy
> Attachments: YARN-3645.1.patch, YARN-3645.2.patch, YARN-3645.3.patch, 
> YARN-3645.4.patch, YARN-3645.patch
>
>
> The "aclSubmitApps" is configured in fair-scheduler.xml like below:
> 
> 
>  
> The resourcemanager log:
> {noformat}
> 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: 
> Service ResourceManager failed in state INITED; cause: 
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
> Caused by: java.io.IOException: Failed to initialize FairScheduler
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   ... 7 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
>   ... 9 more
> 2015-05-14 12:59:48,623 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning 
> to standby state
> 2015-05-14 12:59:48,623 INFO 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin 
> transitionToStandbyIn
> 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When 
> stopping the service ResourceManager : java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>   at 
> 

[jira] [Commented] (YARN-3679) Add documentation for timeline server filter ordering

2016-10-27 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613433#comment-15613433
 ] 

Konstantinos Karanasos commented on YARN-3679:
--

[~xgong], let's try to close this patch...
Do you think it is still applicable/useful?
I guess it is not needed for the 3.0 version (since we will be using the new 
version of the Timeline Server), but is it needed for branch-2?

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Mit Desai
>Assignee: Mit Desai
>  Labels: oct16-easy
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter is before static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allowing anonymous config is useless with both filters loaded in the new 
> order, because static user will be created before presenting it to auth 
> filter. The user can remove static user filter from the config to get 
> anonymous user work.






[jira] [Comment Edited] (YARN-3679) Add documentation for timeline server filter ordering

2016-10-27 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613433#comment-15613433
 ] 

Konstantinos Karanasos edited comment on YARN-3679 at 10/27/16 10:14 PM:
-

[~xgong], let's try to close this JIRA...
Do you think it is still applicable/useful?
I guess it is not needed for the 3.0 version (since we will be using the new 
version of the Timeline Server), but is it needed for branch-2?


was (Author: kkaranasos):
[~xgong], let's try to close this patch...
Do you think it is still applicable/useful?
I guess it is not needed for the 3.0 version (since we will be using the new 
version of the Timeline Server), but is it needed for branch-2?

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Mit Desai
>Assignee: Mit Desai
>  Labels: oct16-easy
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter is before static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allowing anonymous config is useless with both filters loaded in the new 
> order, because static user will be created before presenting it to auth 
> filter. The user can remove static user filter from the config to get 
> anonymous user work.






[jira] [Updated] (YARN-3679) Add documentation for timeline server filter ordering

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-3679:
-
Component/s: timelineserver

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Mit Desai
>Assignee: Mit Desai
>  Labels: oct16-easy
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter is before static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allowing anonymous config is useless with both filters loaded in the new 
> order, because static user will be created before presenting it to auth 
> filter. The user can remove static user filter from the config to get 
> anonymous user work.






[jira] [Updated] (YARN-3679) Add documentation for timeline server filter ordering

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-3679:
-
Labels: oct16-easy  (was: )

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Mit Desai
>Assignee: Mit Desai
>  Labels: oct16-easy
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter is before static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allowing anonymous config is useless with both filters loaded in the new 
> order, because static user will be created before presenting it to auth 
> filter. The user can remove static user filter from the config to get 
> anonymous user work.






[jira] [Updated] (YARN-3645) ResourceManager can't start success if attribute value of "aclSubmitApps" is null in fair-scheduler.xml

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-3645:
-
Labels: oct16-easy  (was: )

> ResourceManager can't start success if  attribute value of "aclSubmitApps" is 
> null in fair-scheduler.xml
> 
>
> Key: YARN-3645
> URL: https://issues.apache.org/jira/browse/YARN-3645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.2
>Reporter: zhoulinlin
>Assignee: Gabor Liptak
>  Labels: oct16-easy
> Attachments: YARN-3645.1.patch, YARN-3645.2.patch, YARN-3645.3.patch, 
> YARN-3645.4.patch, YARN-3645.patch
>
>
> The "aclSubmitApps" is configured in fair-scheduler.xml like below:
> 
> 
>  
> The resourcemanager log:
> 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: 
> Service ResourceManager failed in state INITED; cause: 
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
> Caused by: java.io.IOException: Failed to initialize FairScheduler
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>   ... 7 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
>   ... 9 more
> 2015-05-14 12:59:48,623 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning 
> to standby state
> 2015-05-14 12:59:48,623 INFO 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin 
> transitionToStandbyIn
> 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When 
> stopping the service ResourceManager : java.lang.NullPointerException
> java.lang.NullPointerException
>   at 
> com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
>   at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>   at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>   at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
> 2015-05-14 12:59:48,623 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
> to initialize FairScheduler
>   at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>   

[jira] [Updated] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2965:
-
Labels: oct16-hard  (was: )

> Enhance Node Managers to monitor and report the resource usage on machines
> --
>
> Key: YARN-2965
> URL: https://issues.apache.org/jira/browse/YARN-2965
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Robert Grandl
>Assignee: Inigo Goiri
>  Labels: oct16-hard
> Attachments: YARN-2965.000.patch, YARN-2965.001.patch, 
> YARN-2965.002.patch, ddoc_RT.docx
>
>
> This JIRA is about augmenting Node Managers to monitor the resource usage on 
> the machine, aggregates these reports and exposes them to the RM. 






[jira] [Updated] (YARN-1743) Statically generate event diagrams across components

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-1743:
-
Description: 
We propose to statically generate the event diagrams across components.
This is similar to the generation of diagrams with state transitions within a 
component that we already do today.

The goal is to be able to visualize the interactions through events across 
different components.

  was:
Helps to annotate the transitions with (start-state, end-state) pair and the 
events with (source, destination) pair.

Not just readability, we may also use them to generate the event diagrams 
across components.

Not a blocker for 0.23, but let's see.


> Statically generate event diagrams across components
> 
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation, oct16-hard
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> We propose to statically generate the event diagrams across components.
> This is similar to the generation of diagrams with state transitions within a 
> component that we already do today.
> The goal is to be able to visualize the interactions through events across 
> different components.






[jira] [Updated] (YARN-1743) Statically generate event diagrams across components

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-1743:
-
Summary: Statically generate event diagrams across components  (was: 
Decorate event transitions and the event-types with their behaviour)

> Statically generate event diagrams across components
> 
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation, oct16-hard
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.






[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2016-10-27 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612858#comment-15612858
 ] 

Konstantinos Karanasos commented on YARN-1743:
--

We had an offline discussion with [~chris.douglas] and [~vinodkv].
We agreed that the current status of this JIRA is not very useful, in the sense 
that we would need to manually annotate all event types (currently only the 
{{ApplicationEventType}} is annotated in the patch). We would then have to 
maintain those annotations manually, with the risk that they quickly become 
inconsistent.

I am cancelling the current patch and will repurpose the JIRA to statically 
generate a graph of the transitions for each event type, similar to the way we 
generate the graph for the state transitions.
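
As a rough illustration of the static generation, a minimal sketch that turns a 
list of transitions into a Graphviz digraph, analogous to the existing 
state-transition visualization; every name below is made up for the example:
{code:java}
import java.util.List;

// Illustrative sketch only; the transitions are invented examples.
public class EventGraphSketch {
  record Transition(String source, String event, String destination) {}

  static String toDot(List<Transition> transitions) {
    StringBuilder dot = new StringBuilder("digraph EventFlow {\n");
    for (Transition t : transitions) {
      // One edge per transition, labeled with the event type.
      dot.append(String.format("  \"%s\" -> \"%s\" [label=\"%s\"];%n",
          t.source(), t.destination(), t.event()));
    }
    return dot.append("}\n").toString();
  }

  public static void main(String[] args) {
    System.out.print(toDot(List.of(
        new Transition("RMAppImpl", "APP_ACCEPTED", "RMAppAttemptImpl"))));
  }
}
{code}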

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation, oct16-hard
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.






[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-1743:
-
Issue Type: New Feature  (was: Bug)

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation, oct16-hard
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.






[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-1743:
-
Labels: documentation oct16-hard  (was: documentation)

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation, oct16-hard
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.






[jira] [Updated] (YARN-2618) Avoid over-allocation of disk resources

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2618:
-
Labels: BB2015-05-TBR oct16-hard  (was: BB2015-05-TBR)

> Avoid over-allocation of disk resources
> ---
>
> Key: YARN-2618
> URL: https://issues.apache.org/jira/browse/YARN-2618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wei Yan
>Assignee: Wei Yan
>  Labels: BB2015-05-TBR, oct16-hard
> Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, 
> YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch, YARN-2618-7.patch
>
>
> Subtask of YARN-2139. 
> This should include
> - Add API support for introducing disk I/O as the 3rd type resource.
> - NM should report this information to the RM
> - RM should consider this to avoid over-allocation






[jira] [Updated] (YARN-2618) Avoid over-allocation of disk resources

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2618:
-
Component/s: resourcemanager

> Avoid over-allocation of disk resources
> ---
>
> Key: YARN-2618
> URL: https://issues.apache.org/jira/browse/YARN-2618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wei Yan
>Assignee: Wei Yan
>  Labels: BB2015-05-TBR
> Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, 
> YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch, YARN-2618-7.patch
>
>
> Subtask of YARN-2139. 
> This should include
> - Add API support for introducing disk I/O as the 3rd type resource.
> - NM should report this information to the RM
> - RM should consider this to avoid over-allocation






[jira] [Updated] (YARN-3518) default rm/am expire interval should not be smaller than default resourcemanager connect wait time

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-3518:
-
Summary: default rm/am expire interval should not be smaller than default 
resourcemanager connect wait time  (was: default rm/am expire interval should 
not less than default resourcemanager connect wait time)

> default rm/am expire interval should not be smaller than default 
> resourcemanager connect wait time
> --
>
> Key: YARN-3518
> URL: https://issues.apache.org/jira/browse/YARN-3518
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-easy
> Attachments: YARN-3518.001.patch, YARN-3518.002.patch, 
> YARN-3518.003.patch, YARN-3518.004.patch
>
>
> Take the AM for example: if the AM can't connect to the RM, then after the AM 
> expiry interval (600s) the RM relaunches the AM, and there will be two AMs at 
> the same time until the resourcemanager connect max wait time (900s) has passed.
> DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000;
> DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 600 * 1000;
> DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 600 * 1000;
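
To make the timing mismatch concrete, a small worked sketch using the default 
values quoted above (an illustration, not code from any patch):
{code:java}
// Worked illustration of the mismatch, using the defaults quoted above.
public class ExpiryVsConnectWait {
  public static void main(String[] args) {
    long connectMaxWaitMs = 15 * 60 * 1000; // 900 s: AM keeps retrying the RM
    long amExpiryMs = 600_000;              // 600 s: RM declares the AM dead
    // The RM relaunches the AM after 600 s, while the original AM may keep
    // retrying its RM connection for up to another 300 s -- two live AMs.
    System.out.println("Overlap: " + (connectMaxWaitMs - amExpiryMs) / 1000 + " s");
  }
}
{code}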






[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2876:
-
Component/s: resourcemanager
 fairscheduler

> In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
> subqueues
> 
>
> Key: YARN-2876
> URL: https://issues.apache.org/jira/browse/YARN-2876
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: oct16-easy
> Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, 
> YARN-2876.v3.patch, YARN-2876.v4.patch, screenshot-1.png
>
>
> If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
> Scheduler UI will display the entire cluster capacity as its maxResource 
> instead of its parent queue's maxResource.






[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2016-10-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2876:
-
Labels: oct16-easy  (was: )

> In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
> subqueues
> 
>
> Key: YARN-2876
> URL: https://issues.apache.org/jira/browse/YARN-2876
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: oct16-easy
> Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, 
> YARN-2876.v3.patch, YARN-2876.v4.patch, screenshot-1.png
>
>
> If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
> Scheduler UI will display the entire cluster capacity as its maxResource 
> instead of its parent queue's maxResource.






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-10-26 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: YARN-2995.002.patch

Uploading new version of the patch.
Rebasing against trunk, addressing [~asuresh]'s comments, fixing existing test 
cases, adding new test cases.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch, YARN-2995.002.patch
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Created] (YARN-5688) Make allocation of opportunistic containers asynchronous

2016-09-28 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5688:


 Summary: Make allocation of opportunistic containers asynchronous
 Key: YARN-5688
 URL: https://issues.apache.org/jira/browse/YARN-5688
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


In the current implementation of the 
{{OpportunisticContainerAllocatorAMService}}, we synchronously perform the 
allocation of opportunistic containers. This results in "blocking" the service 
at the RM when scheduling the opportunistic containers.
The {{OpportunisticContainerAllocator}} should instead asynchronously run as a 
separate thread.
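
A minimal sketch of the intended direction, assuming a dedicated allocator 
thread; the class and method names below are illustrative, not from a patch:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch: hand allocation work to a dedicated daemon thread so
// the AM service's allocate() call returns without blocking on the placement
// of opportunistic containers.
public class AsyncOpportunisticAllocator {
  private final ExecutorService allocatorThread =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "opportunistic-container-allocator");
        t.setDaemon(true);
        return t;
      });

  /** Enqueues the allocation work and returns immediately. */
  public void allocateAsync(Runnable allocationWork) {
    allocatorThread.submit(allocationWork);
  }
}
{code}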






[jira] [Created] (YARN-5687) Refactor TestOpportunisticContainerAllocation to extend TestAMRMClient

2016-09-28 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5687:


 Summary: Refactor TestOpportunisticContainerAllocation to extend 
TestAMRMClient
 Key: YARN-5687
 URL: https://issues.apache.org/jira/browse/YARN-5687
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


Since {{TestOpportunisticContainerAllocation}} shares a lot of code with the 
{{TestAMRMClient}}, we should refactor the former, making it a subclass of the 
latter.






[jira] [Updated] (YARN-5486) Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests

2016-09-28 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5486:
-
Attachment: YARN-5486.004.patch

Adding new patch.
Fixed the remaining checkstyle issues and addressed [~asuresh]'s comments.
The test case is not failing locally for me and does not seem related.

[~asuresh], I will create JIRAs to track the two issues you mentioned.
Good point about the opportunistic container allocation. We should make it 
asynchronous.

> Update OpportunisticContainerAllocatorAMService::allocate method to handle 
> OPPORTUNISTIC container requests
> ---
>
> Key: YARN-5486
> URL: https://issues.apache.org/jira/browse/YARN-5486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5486.001.patch, YARN-5486.002.patch, 
> YARN-5486.003.patch, YARN-5486.004.patch
>
>
> YARN-5457 refactors the Distributed Scheduling framework to move the 
> container allocator to yarn-server-common.
> This JIRA proposes to update the allocate method in the new AM service to use 
> the OpportunisticContainerAllocator to allocate opportunistic containers.






[jira] [Updated] (YARN-2995) Enhance UI to show cluster resource utilization of various container types

2016-09-28 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2995:
-
Attachment: YARN-2995.001.patch

Adding first version of the patch.

> Enhance UI to show cluster resource utilization of various container types
> --
>
> Key: YARN-2995
> URL: https://issues.apache.org/jira/browse/YARN-2995
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2995.001.patch
>
>
> This JIRA proposes to extend the Resource manager UI to show how cluster 
> resources are being used to run *guaranteed start* and *queueable* 
> containers.  For example, a graph that shows over time, the fraction of  
> running containers that are *guaranteed start* and the fraction of running 
> containers that are *queueable*. 






[jira] [Updated] (YARN-5486) Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests

2016-09-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5486:
-
Attachment: YARN-5486.003.patch

Uploading new version of patch, fixing compile, unit test and checkstyle issues.

> Update OpportunisticContainerAllocatorAMService::allocate method to handle 
> OPPORTUNISTIC container requests
> ---
>
> Key: YARN-5486
> URL: https://issues.apache.org/jira/browse/YARN-5486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5486.001.patch, YARN-5486.002.patch, 
> YARN-5486.003.patch
>
>
> YARN-5457 refactors the Distributed Scheduling framework to move the 
> container allocator to yarn-server-common.
> This JIRA proposes to update the allocate method in the new AM service to use 
> the OpportunisticContainerAllocator to allocate opportunistic containers.






[jira] [Updated] (YARN-5486) Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests

2016-09-27 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5486:
-
Attachment: YARN-5486.002.patch

Rebasing against trunk and adding new patch, also including some more 
changes/fixes.

Thanks for the feedback, [~subru].
Regarding the _LinkedHashMap_ in the {{OpportunisticContainerContext}}, we 
actually need it to keep the ordering of the nodes. Least loaded nodes should 
come first when iterating over the hashmap, as they should be preferred when 
placing opportunistic containers.
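
For context, a tiny self-contained demonstration of the iteration-order 
property we rely on (the node names and loads are made up):
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// LinkedHashMap iterates in insertion order, so if nodes are inserted
// least-loaded first, iteration visits them in that order.
public class NodeOrderingDemo {
  public static void main(String[] args) {
    Map<String, Integer> nodesByQueueLength = new LinkedHashMap<>();
    nodesByQueueLength.put("node-a", 1); // least loaded, inserted first
    nodesByQueueLength.put("node-b", 3);
    nodesByQueueLength.put("node-c", 7);
    // Placement preference follows insertion order: node-a is tried first.
    nodesByQueueLength.forEach((node, queued) ->
        System.out.println(node + " -> " + queued + " queued containers"));
  }
}
{code}
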
I added checks for the ContainerTypes in 
{{TestOpportunisticContainersAllocation}}, as you suggested. 
I suggest keeping the existing sleep logic for now, if that's OK, since the 
alternative does not seem to make the code much cleaner in the particular cases 
where it is used (I also chatted with [~chris.douglas] about this).

> Update OpportunisticContainerAllocatorAMService::allocate method to handle 
> OPPORTUNISTIC container requests
> ---
>
> Key: YARN-5486
> URL: https://issues.apache.org/jira/browse/YARN-5486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
> Attachments: YARN-5486.001.patch, YARN-5486.002.patch
>
>
> YARN-5457 refactors the Distributed Scheduling framework to move the 
> container allocator to yarn-server-common.
> This JIRA proposes to update the allocate method in the new AM service to use 
> the OpportunisticContainerAllocator to allocate opportunistic containers.






[jira] [Created] (YARN-5646) Documentation for scheduling of OPPORTUNISTIC containers

2016-09-13 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5646:


 Summary: Documentation for scheduling of OPPORTUNISTIC containers
 Key: YARN-5646
 URL: https://issues.apache.org/jira/browse/YARN-5646
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


This is for adding documentation regarding the scheduling of OPPORTUNISTIC 
containers.
It includes both the centralized (YARN-5220) and the distributed (YARN-2877) 
scheduling.






[jira] [Updated] (YARN-5542) Scheduling of opportunistic containers

2016-08-19 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5542:
-
Description: 
This JIRA groups all efforts related to the scheduling of opportunistic 
containers. 
It includes the scheduling of opportunistic container through the central RM 
(YARN-5220), through distributed scheduling (YARN-2877), as well as the 
scheduling of containers based on actual node utilization (YARN-1011) and the 
container promotion/demotion (YARN-5085).

  was:
This JIRA groups all efforts related to the scheduling of opportunistic 
containers. 
It includes the scheduling of opportunistic container through the central RM 
(YARN-5220), through distributed scheduling (YARN-2877), as well as the 
scheduling of containers based on actual node utilization (YARN-1011).


> Scheduling of opportunistic containers
> --
>
> Key: YARN-5542
> URL: https://issues.apache.org/jira/browse/YARN-5542
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Konstantinos Karanasos
>
> This JIRA groups all efforts related to the scheduling of opportunistic 
> containers. 
> It includes the scheduling of opportunistic container through the central RM 
> (YARN-5220), through distributed scheduling (YARN-2877), as well as the 
> scheduling of containers based on actual node utilization (YARN-1011) and the 
> container promotion/demotion (YARN-5085).






[jira] [Commented] (YARN-5542) Scheduling of opportunistic containers

2016-08-19 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428894#comment-15428894
 ] 

Konstantinos Karanasos commented on YARN-5542:
--

We had some initial discussions with [~kasha] and [~elgoiri]. We will upload a 
design document, summarizing the whole effort.

> Scheduling of opportunistic containers
> --
>
> Key: YARN-5542
> URL: https://issues.apache.org/jira/browse/YARN-5542
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Konstantinos Karanasos
>
> This JIRA groups all efforts related to the scheduling of opportunistic 
> containers. 
> It includes the scheduling of opportunistic container through the central RM 
> (YARN-5220), through distributed scheduling (YARN-2877), as well as the 
> scheduling of containers based on actual node utilization (YARN-1011).






[jira] [Created] (YARN-5542) Scheduling of opportunistic containers

2016-08-19 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5542:


 Summary: Scheduling of opportunistic containers
 Key: YARN-5542
 URL: https://issues.apache.org/jira/browse/YARN-5542
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Konstantinos Karanasos


This JIRA groups all efforts related to the scheduling of opportunistic 
containers. 
It includes the scheduling of opportunistic container through the central RM 
(YARN-5220), through distributed scheduling (YARN-2877), as well as the 
scheduling of containers based on actual node utilization (YARN-1011).






[jira] [Created] (YARN-5541) Handling of opportunistic containers in the NM

2016-08-19 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5541:


 Summary: Handling of opportunistic containers in the NM
 Key: YARN-5541
 URL: https://issues.apache.org/jira/browse/YARN-5541
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Konstantinos Karanasos


I am creating this JIRA in order to group all tasks related to the management 
of opportunistic containers in the NMs, such as the queuing of containers, the 
pausing of containers and the prioritization of queued containers.






[jira] [Commented] (YARN-5457) Refactor DistributedScheduling framework to pull out common functionality

2016-08-08 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412841#comment-15412841
 ] 

Konstantinos Karanasos commented on YARN-5457:
--

Looks good to me too, thanks [~asuresh].
(Very minor suggestion: it might look better to rename 
{{OpportunisticContainersAllocatingAMService}} to 
{{OpportunisticContainersAllocatorAMService}} or simply 
{{OpportunisticContainersAMService}}).

> Refactor DistributedScheduling framework to pull out common functionality
> -
>
> Key: YARN-5457
> URL: https://issues.apache.org/jira/browse/YARN-5457
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5457.001.patch, YARN-5457.002.patch, 
> YARN-5457.003.patch
>
>
> Opening this JIRA to track some refactoring missed in YARN-5113:






[jira] [Commented] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-05 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410310#comment-15410310
 ] 

Konstantinos Karanasos commented on YARN-4902:
--

[~leftnoteasy]:

bq. I can understand your proposal may look different from my guess above, we 
can discuss more once you have a more concrete design for that.
Yes, let's discuss service planning once we add more details to the design 
document -- it will be easier for other people to get involved in the 
discussion too.

bq. I'm not care too much about if we should support cardinality via GUTS API 
or support anti-affinity via cardinality syntaxes. We should choose a more 
generic/extensible API which can support both.
Sounds good, we can continue the discussion in YARN-5478.

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf, LRA-scheduling-design.v0.pdf, YARN-5468.prototype.patch
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.






[jira] [Commented] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-05 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410270#comment-15410270
 ] 

Konstantinos Karanasos commented on YARN-4902:
--

bq. LRA planning looks like an implementation.
LRA planning is much more than an implementation. Think of it as planning 
multiple applications at once. This is something that the scheduler cannot do, 
no matter what its implementation is.
Please take a look at YARN-1051 to see a similar use case for 
planning/admission control, but in a constraint-free context.
I can give more details as I update the document. In any case, that does not 
block any of the changes that are required in the scheduler per se to support 
constraints.

bq. For cardinality, could you share a more detailed use case for that?
As you mention, an example would be to limit the number of hbase-masters on a 
node/rack, or even the number of AMs on a node.
You could do it with resource isolation, but network isolation in particular is 
really hard to get right, so until we reach that point, I think it would be 
great for applications to be able to express such constraints.

bq. It seems to me that cardinality is a special case of anti-affinity.
I would say that it is the other way around: affinity and anti-affinity are 
special cases of cardinality. If you set a cardinality of 1 for a node, you 
have anti-affinity on that node.
I agree that you can currently express it with your proposal; we are just 
suggesting a more succinct alternative that needs only a single type of 
constraint instead of several.
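
To illustrate, a self-contained sketch of how min/max cardinality subsumes both 
cases; the {{Cardinality}} shape and the tags below are hypothetical, not an 
actual YARN API:
{code:java}
// Hypothetical constraint shape, for illustration only.
public class CardinalitySketch {
  record Cardinality(String scope, String tag, int min, int max) {}

  public static void main(String[] args) {
    // Anti-affinity: at most one "hbase-master" container per node.
    Cardinality antiAffinity = new Cardinality("node", "hbase-master", 0, 1);
    // Affinity: at least one peer "storm-worker" container on the node.
    Cardinality affinity =
        new Cardinality("node", "storm-worker", 1, Integer.MAX_VALUE);
    // General case: at most 5 hbase servers per rack, across applications.
    Cardinality rackCap = new Cardinality("rack", "hbase", 0, 5);
    System.out.println(antiAffinity + "\n" + affinity + "\n" + rackCap);
  }
}
{code}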

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf, LRA-scheduling-design.v0.pdf, YARN-5468.prototype.patch
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.






[jira] [Commented] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-05 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410209#comment-15410209
 ] 

Konstantinos Karanasos commented on YARN-4902:
--

Thanks for checking the design doc and the patch, and for the feedback, 
[~leftnoteasy].
Please find below some thoughts regarding the points your raised and some 
additional information.

bq. From the requirement's perspective, I didn't see new things, please remind 
me if I missed anything
Agreed that our basic requirements are similar, which is good because it means 
we are aligned. Some of the notions we are using might coincide with yours but 
have a different name (e.g., dynamic vs. allocation tags, although the scope of 
our dynamic tags is global and not application specific like yours), by virtue 
of the fact that we were designing things at the same time. We can agree on a 
common naming, not a problem.
What I would like to stress as being different is mainly the LRA planning, 
some extensions to the constraints (along with a more succinct way of 
expressing them), and the ease of expressing inter-application 
constraints -- more details below.

*Constraints*
bq. The cardinality constraints is placement_set with maximum_concurrency 
constraint: see (4.3.3) Placement Strategy in my design doc.
If I am not wrong, the maximum_concurrency in your document corresponds to a 
single allocation/resource-request. Our min and max cardinalities apply across 
applications -- for instance, to say "don't put more than 5 hbase servers (from 
any possible application) in a rack".

In general, as we showed in our design doc, you can use max and min 
cardinalities to also express affinity and anti-affinity constraints. This way 
we need only a single type of constraint. What do you think?

bq. Will this patch support anti-affinity / affinity between apps? I uploaded 
my latest POC patch to YARN-1042, it supports affinity/anti-affinity for 
inter/intra apps. We can easily extend it to support intra/inter resource 
request within the app.
Yes, this is a major use case for us. The current patch can already support it. 
And this is why we want to make more use of the tags and of planning, since 
they would allow us to specify inter-app constraints without needing to know 
the app ID of the other job.

bq. Major logic of this patch depends on node label manager dynamic tag 
changes. First of all, I'm not sure if NLM works efficiently when node label 
changes rapidly (we could update label on node when allocate / release every 
container). And I'm not sure how you plan to avoid malicious application add 
labels. For example if a distributed shell application claims it is a "hbase 
master" just for fun, how to enforce cardinality logics like "only put 10 HBase 
masters in the rack"?
Good points.
Regarding scalability, we have not seen any problems so far (we update tags at 
allocate/release time), but we have not run very large-scale experiments -- I 
will update you on that.
Regarding the malicious AM, I am not sure the application would benefit from 
lying. But even if it does, we can use cluster-wide constraints to limit such 
AMs. Still, I agree more thought has to be given to this matter -- it's good 
you brought it up.

*Scheduling*
bq. It might be better to implement complex scheduling logics like 
affinity-between-apps and cardinality in a global scheduling way. (YARN-5139)
We will be more than happy to use any advancement in the scheduler that is 
available!
I truly believe that global scheduling (i.e., application-centric rather than 
node-centric scheduling) is more appropriate and will give better results. We 
did not use it in our first patch, as it was not available at the time, but we 
are happy to try it out.

*Planning*
bq. I'm not sure how LRA planner will look like, should it be a separate 
scheduler running in parallel? I didn't see your patch uses that approach.
The idea here is to be able to make more holistic placement decisions across 
applications. What if you place your HBase service in a way that does not let a 
subsequent Heron app be placed in the cluster at all?
We envision it to be outside of the scheduler, similar to the reservation 
system (YARN-1051).
Applications will also be able to submit multiple applications at once, and 
specify constraints among them.
It is not in the initial version of the patch.
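
To give a flavor of the intended interface, here is a sketch of such a group 
submission (all names are hypothetical and none of this exists in the current 
patch):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Sketch only (hypothetical names; none of this exists in the current
// patch): an LRA "plan" bundles several applications together with
// inter-application constraints, so the planner can place them
// holistically instead of one at a time.
public class LraPlanSketch {

  static final class InterAppConstraint {
    final String sourceTag;   // e.g. "hbase"
    final String targetTag;   // e.g. "heron"
    final String scope;       // e.g. "node" or "rack"
    final int maxCardinality; // 0 at node scope means anti-affinity

    InterAppConstraint(String sourceTag, String targetTag,
                       String scope, int maxCardinality) {
      this.sourceTag = sourceTag;
      this.targetTag = targetTag;
      this.scope = scope;
      this.maxCardinality = maxCardinality;
    }
  }

  static final class ApplicationGroup {
    final List<String> appTags = new ArrayList<>();
    final List<InterAppConstraint> constraints = new ArrayList<>();
  }

  public static void main(String[] args) {
    ApplicationGroup group = new ApplicationGroup();
    group.appTags.add("hbase");
    group.appTags.add("heron");
    // Node-level anti-affinity between the two apps: the planner must place
    // HBase so that Heron still fits, since it sees both up front.
    group.constraints.add(new InterAppConstraint("hbase", "heron", "node", 0));
    System.out.println(group.appTags + ", "
        + group.constraints.size() + " inter-app constraint(s)");
  }
}
{code}

The point is that the planner sees all applications of the group, plus their 
inter-application constraints, before committing to any placement.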

*Suggestions*
bq. Could you take a look at global scheduling patch which I attached to 
YARN-5139 to see if it is possible to build new features added in your patch on 
top of the global scheduling framework? And also please share your thoughts 
about what's your overall feedbacks to the global scheduling framework like 
efficiency, extensibility, etc.
I will check the global scheduler, and as I said above, I'd be happy to use it.

bq. It will be better to design Java API for this ticket, both of our poc 
patches (this one and the 

[jira] [Comment Edited] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-04 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408657#comment-15408657
 ] 

Konstantinos Karanasos edited comment on YARN-4902 at 8/4/16 11:57 PM:
---

I am uploading a design document that describes our vision for scheduling 
long-running applications (LRA).
It is an early version, but I am sharing it to help drive the discussion.
It overlaps with this JIRA (after all, it targets the same problem up to a 
point), but it also brings clearly new points, especially when it comes to 
LRA planning.

As I explained to [~leftnoteasy] offline during the Hadoop Summit, our focus 
is not on scheduling given affinity/anti-affinity constraints, but on LRA 
*planning*.
We did a first implementation of affinity, anti-affinity, and *cardinality* 
constraints because we needed them to proceed with the LRA planning and 
nothing was available at the time.
[That said, we have already added support for cardinality, and I believe our 
support for tags differs (I need to take a closer look at YARN-1042) -- let's 
continue that discussion in that JIRA.]

Given that Wangda marked YARN-5468 as a duplicate, do you believe that the LRA 
planning belongs to this or another existing JIRA?
As far as I can tell, it does not.
Let me know what you think, so that we can use the proper JIRAs and avoid 
duplicate effort going forward.

Thanks.



was (Author: kkaranasos):
I am uploading a design document that describes our vision for scheduling 
long-running applications (LRA).
It is an early version, but I am sharing it to help drive the discussion.
It overlaps with this JIRA (after all, it targets the same problem up to a 
point), but it also brings clearly new points, especially when it comes to 
LRA planning.

As I explained to [~leftnoteasy] offline during the Hadoop Summit, our focus 
is not on scheduling given affinity/anti-affinity constraints, but on LRA 
*planning*.
We did a first implementation of affinity, anti-affinity, and *cardinality* 
constraints because we needed them to proceed with the LRA planning and 
nothing was available at the time.
[That said, we have already added support for cardinality, and I believe our 
support for tags differs (I need to take a closer look at YARN-1042) -- let's 
continue that discussion in that JIRA.]

Given that Wangda marked YARN-5048 as a duplicate, do you believe that the LRA 
planning belongs to this or another existing JIRA?
As far as I can tell, it does not.
Let me know what you think, so that we can use the proper JIRAs and avoid 
duplicate effort going forward.

Thanks.


> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf, LRA-scheduling-design.v0.pdf, YARN-5468.prototype.patch
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.






[jira] [Updated] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-04 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4902:
-
Attachment: LRA-scheduling-design.v0.pdf

I am uploading a design document that describes our vision for scheduling 
long-running applications (LRA).
It is an early version, but I am sharing it to help drive the discussion.
It overlaps with this JIRA (after all, it targets the same problem up to a 
point), but it also brings clearly new points, especially when it comes to 
LRA planning.

As I explained to [~leftnoteasy] offline during the Hadoop Summit, our focus 
is not on scheduling given affinity/anti-affinity constraints, but on LRA 
*planning*.
We did a first implementation of affinity, anti-affinity, and *cardinality* 
constraints because we needed them to proceed with the LRA planning and 
nothing was available at the time.
[That said, we have already added support for cardinality, and I believe our 
support for tags differs (I need to take a closer look at YARN-1042) -- let's 
continue that discussion in that JIRA.]
different support for tags (but I need to take a closer look on YARN-1042) -- 
let's continue the discussion at that JIRA.]

Given that Wangda marked YARN-5048 as a duplicate, do you believe that the LRA 
planning belongs to this or another existing JIRA?
As far as I can tell, it does not.
Let me know what you think, so that we can use the proper JIRAs and avoid 
duplicate effort going forward.

Thanks.


> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf, LRA-scheduling-design.v0.pdf, YARN-5468.prototype.patch
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.






[jira] [Updated] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-08-04 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4902:
-
Attachment: YARN-5468.prototype.patch

We have been working on a first prototype for handling constraints in 
scheduling.
Following [~leftnoteasy]'s recommendation, I am uploading it to this JIRA, as 
it seems the most related one.
The patch is by [~pgaref].
This version of the prototype does not include our proposal about *planning* 
(rather than scheduling) of applications; we plan to update the patch with 
that proposal.

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf, YARN-5468.prototype.patch
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.





