[jira] [Issue Comment Deleted] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf

2020-05-05 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10229:
-
Comment: was deleted

(was: Hi [~bibinchundatt] 

I feel handling it in ContainerManagerImpl#startContainers is a good idea

{code:java}
  // Initialize the AMRMProxy service instance only if the container is of
  // type AM and if the AMRMProxy service is enabled
  if (amrmProxyEnabled && containerTokenIdentifier.getContainerType()
      .equals(ContainerType.APPLICATION_MASTER)) {
    ApplicationId applicationID = containerId.getApplicationAttemptId()
        .getApplicationId();
    try {
      // Only applications known to the federation state store go through the
      // AMRMProxy; a lookup failure means the app was submitted to the RM directly.
      this.getAMRMProxyService().getFederationStateStoreFacade()
          .getApplicationHomeSubCluster(applicationID);
      this.getAMRMProxyService()
          .processApplicationStartRequest(request);
    } catch (YarnException ex) {
      LOG.info("AM start request is sent to RM");
    }
  }
{code}

Something like this. In this case there is no need to send an initialize request to 
the AMRMProxy.
)

> [Federation] Client should be able to submit application to RM directly using 
> normal client conf
> 
>
> Key: YARN-10229
> URL: https://issues.apache.org/jira/browse/YARN-10229
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: amrmproxy, federation
>Affects Versions: 3.1.1
>Reporter: JohnsonGuo
>Assignee: Bilwa S T
>Priority: Major
>
> Scenario: When the YARN federation feature is enabled across multiple YARN clusters, 
> one can submit a job to the yarn-router by *modifying* the client 
> configuration to use the YARN router address.
> But if one still wants to submit jobs via the original client configuration (from 
> before federation was enabled) to the RM directly, the submission hits an AMRMToken 
> exception. That means that once federation is enabled, anyone who wants to submit a 
> job has to modify the client conf.
>  
> One possible solution for this scenario is:
> In the NodeManager, when the client ApplicationMaster request comes in (a rough 
> sketch follows after this description):
>  * get the client job.xml from HDFS "".
>  * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml
>  * if the value of the parameter is "localhost:8049" (the AMRMProxy address), then run 
> the AMRMToken validation process
>  * if the value of the parameter is "rm:port" (the RM address), then skip the 
> AMRMToken validation process
>  
>  
>  
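A rough sketch of the proposal above, under stated assumptions: the class and method
names, the hard-coded AMRMProxy address, and the way job.xml is located and parsed are
all illustrative, not an actual patch.

{code:java}
import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SchedulerAddressCheck {
  // Assumed AMRMProxy address from the description; a real change would read it
  // from the NodeManager's AMRMProxy configuration rather than hard-coding it.
  private static final String AMRM_PROXY_ADDRESS = "localhost:8049";

  /**
   * Returns true when the submitted job.xml points the AM at the AMRMProxy address,
   * i.e. when AMRMToken validation should be performed; a direct RM address means
   * the validation can be skipped, as proposed above.
   */
  static boolean shouldValidateAmrmToken(Configuration nmConf, Path jobXmlOnHdfs)
      throws IOException {
    FileSystem fs = FileSystem.get(nmConf);
    Configuration jobConf = new Configuration(false);
    try (InputStream in = fs.open(jobXmlOnHdfs)) {
      jobConf.addResource(in);
      // get() forces the resource to be parsed while the stream is still open.
      String schedulerAddress =
          jobConf.get("yarn.resourcemanager.scheduler.address", "");
      return AMRM_PROXY_ADDRESS.equals(schedulerAddress);
    }
  }
}
{code}

In a real change the job.xml location and the proxy address would of course come from
the application submission context and the NodeManager configuration rather than being
passed in or hard-coded as in this sketch.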



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10201) Make AMRMProxyPolicy aware of SC load

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100448#comment-17100448
 ] 

Hadoop QA commented on YARN-10201:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 27m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} prototool {color} | {color:blue}  0m  
0s{color} | {color:blue} prototool was not available. {color} |
| {color:blue}0{color} | {color:blue} markdownlint {color} | {color:blue}  0m  
0s{color} | {color:blue} markdownlint was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
25m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
28s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
16s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} 
branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site no findbugs output file 
(findbugsXml.xml) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
31s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in 
trunk has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} 
|
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
21s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  1m 21s{color} | 
{color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 21s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 31s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 13 new + 241 unchanged - 0 fixed = 254 total (was 241) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} 
|
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m 
15s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
10s{color} | {color:red} hadoop

[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition

2020-05-05 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100369#comment-17100369
 ] 

Jonathan Hung edited comment on YARN-6492 at 5/6/20, 1:15 AM:
--

OK thanks [~maniraj...@gmail.com] for the explanation. Sorry for the long 
delay, took some time to grok the latest 007 patch.
* Can we rename getPartitionQueueMetrics to something different? My initial 
confusion was that getPartitionQueueMetrics for QueueMetrics and 
PartitionQueueMetrics serve different purposes...the former for queue*partition 
and the latter for partition only. It's especially confusing in the case of 
PartitionQueueMetrics#getPartitionQueueMetrics, since this has nothing to do 
with queues. We can update the comment for 
PartitionQueueMetrics#getPartitionQueueMetrics as well, it also says Partition 
* Queue.
* Mentioned this earlier, can we remove the {noformat}   if (parent != null) {
  parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}
check in QueueMetrics#setAvailableResourcesToUser?  I think it should be 
addressed here rather than YARN-9767.
* I don't think the asserts in TestNodeLabelContainerAllocation should change. 
leafQueue.getMetrics should return metrics for default partition. I think we 
still need to check in QueueMetrics#setAvailableResourcesToUser and 
QueueMetrics#setAvailableResourcesToQueue whether partition is null or empty 
string. (This will break updating partition queue metrics, so we need to find a 
way to distinguish whether we're updating default partition queue metrics or 
partitioned queue metrics within the 
setAvailableResourcesToUser/setAvailableResourcesToQueue function.)
* Mentioned before, can we update everywhere we're creating a new metricName 
for partition/user/queue metrics to use a delimiter? e.g. {noformat}String 
metricName = partition + this.queueName + userName;{noformat}. Otherwise 
there's a chance that these metric names could collide.
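
A self-contained sketch of the collision the last point warns about, with a
delimiter-based fix; the DELIMITER value and the partition/queue/user names below are
illustrative assumptions, not taken from the patch:

{code:java}
public class MetricNameCollisionDemo {
  // Assumed separator for illustration only; any character that cannot appear in
  // partition/queue/user names would do.
  private static final String DELIMITER = "|";

  // Plain concatenation, as in the current snippet: distinct triples can collide.
  static String withoutDelimiter(String partition, String queue, String user) {
    return partition + queue + user;
  }

  // Delimited concatenation keeps the three components distinguishable.
  static String withDelimiter(String partition, String queue, String user) {
    return partition + DELIMITER + queue + DELIMITER + user;
  }

  public static void main(String[] args) {
    // ("x", "queueA", "bob") and ("xqueue", "A", "bob") map to the same undelimited name...
    System.out.println(withoutDelimiter("x", "queueA", "bob")
        .equals(withoutDelimiter("xqueue", "A", "bob")));  // true
    // ...but stay distinct once a delimiter is used.
    System.out.println(withDelimiter("x", "queueA", "bob")
        .equals(withDelimiter("xqueue", "A", "bob")));     // false
  }
}
{code}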


was (Author: jhung):
OK thanks [~maniraj...@gmail.com] for the explanation. Sorry for the long 
delay, took some time to grok the latest 007 patch.
* Can we rename getPartitionQueueMetrics to something different? My initial 
confusion was that getPartitionQueueMetrics for QueueMetrics and 
PartitionQueueMetrics serve different purposes...the former for queue*partition 
and the latter for partition only. It's especially confusing in the case of 
PartitionQueueMetrics#getPartitionQueueMetrics, since this has nothing to do 
with queues. We can update the comment for 
PartitionQueueMetrics#getPartitionQueueMetrics as well, it also says Partition 
* Queue.
* Mentioned this earlier, can we remove the {noformat}   if (parent != null) {
  parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}
check in QueueMetrics#setAvailableResourcesToUser?  I think it should be 
addressed here rather than YARN-9767.
* I don't think the asserts in TestNodeLabelContainerAllocation should change. 
leafQueue.getMetrics should return metrics for default partition. I think we 
still need to check in QueueMetrics#setAvailableResourcesToUser and 
QueueMetrics#setAvailableResourcesToQueue whether partition is null or empty 
string. (This will break updating partition queue metrics, so we need to find a 
way to distinguish whether we're updating default partition queue metrics or 
partitioned queue metrics.)
* Mentioned before, can we update everywhere we're creating a new metricName 
for partition/user/queue metrics to use a delimiter? e.g. {noformat}String 
metricName = partition + this.queueName + userName;{noformat}. Otherwise 
there's a chance that these metric names could collide.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition

2020-05-05 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100369#comment-17100369
 ] 

Jonathan Hung edited comment on YARN-6492 at 5/6/20, 1:14 AM:
--

OK thanks [~maniraj...@gmail.com] for the explanation. Sorry for the long 
delay, took some time to grok the latest 007 patch.
* Can we rename getPartitionQueueMetrics to something different? My initial 
confusion was that getPartitionQueueMetrics for QueueMetrics and 
PartitionQueueMetrics serve different purposes...the former for queue*partition 
and the latter for partition only. It's especially confusing in the case of 
PartitionQueueMetrics#getPartitionQueueMetrics, since this has nothing to do 
with queues. We can update the comment for 
PartitionQueueMetrics#getPartitionQueueMetrics as well, it also says Partition 
* Queue.
* Mentioned this earlier, can we remove the {noformat}   if (parent != null) {
  parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}
check in QueueMetrics#setAvailableResourcesToUser?  I think it should be 
addressed here rather than YARN-9767.
* I don't think the asserts in TestNodeLabelContainerAllocation should change. 
leafQueue.getMetrics should return metrics for default partition. I think we 
still need to check in QueueMetrics#setAvailableResourcesToUser and 
QueueMetrics#setAvailableResourcesToQueue whether partition is null or empty 
string. (This will break updating partition queue metrics, so we need to find a 
way to distinguish whether we're updating default partition queue metrics or 
partitioned queue metrics.)
* Mentioned before, can we update everywhere we're creating a new metricName 
for partition/user/queue metrics to use a delimiter? e.g. {noformat}String 
metricName = partition + this.queueName + userName;{noformat}. Otherwise 
there's a chance that these metric names could collide.


was (Author: jhung):
OK thanks [~maniraj...@gmail.com] for the explanation. Sorry for the long 
delay, took some time to grok the latest 007 patch.
* Can we rename getPartitionQueueMetrics to something different? My initial 
confusion was that getPartitionQueueMetrics for QueueMetrics and 
PartitionQueueMetrics serve different purposes...the former for queue*partition 
and the latter for partition only. It's especially confusing in the case of 
PartitionQueueMetrics#getPartitionQueueMetrics, since this has nothing to do 
with queues. We can update the comment for 
PartitionQueueMetrics#getPartitionQueueMetrics as well, it also says Partition 
* Queue.
* Mentioned this earlier, can we remove the {noformat}   if (parent != null) {
  parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}
check in QueueMetrics#setAvailableResourcesToUser?  I think it should be 
addressed here rather than YARN-9767.
* I don't think the asserts in TestNodeLabelContainerAllocation should change. 
leafQueue.getMetrics should return metrics for default partition. I think we 
still need to check in QueueMetrics#setAvailableResourcesToUser and 
QueueMetrics#setAvailableResourcesToQueue whether partition is null or empty 
string. (This will break updating partition queue metrics, so we need to find a 
way to distinguish whether we're updating default partition queue metrics or 
partitioned queue metrics.)
* Mentioned before, can we update everywhere we're creating a new metricName 
for partition/user/queue metrics to use a delimiter? e.g. {noformat}String 
metricName = partition + this.queueName + userName;{noformat}. Otherwise 
there's a chance that these metric names could collide.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2020-05-05 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100369#comment-17100369
 ] 

Jonathan Hung commented on YARN-6492:
-

OK thanks [~maniraj...@gmail.com] for the explanation. Sorry for the long 
delay, took some time to grok the latest 007 patch.
* Can we rename getPartitionQueueMetrics to something different? My initial 
confusion was that getPartitionQueueMetrics for QueueMetrics and 
PartitionQueueMetrics serve different purposes...the former for queue*partition 
and the latter for partition only. It's especially confusing in the case of 
PartitionQueueMetrics#getPartitionQueueMetrics, since this has nothing to do 
with queues. We can update the comment for 
PartitionQueueMetrics#getPartitionQueueMetrics as well, it also says Partition 
* Queue.
* Mentioned this earlier, can we remove the {noformat}   if (parent != null) {
  parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}
check in QueueMetrics#setAvailableResourcesToUser?  I think it should be 
addressed here rather than YARN-9767.
* I don't think the asserts in TestNodeLabelContainerAllocation should change. 
leafQueue.getMetrics should return metrics for default partition. I think we 
still need to check in QueueMetrics#setAvailableResourcesToUser and 
QueueMetrics#setAvailableResourcesToQueue whether partition is null or empty 
string. (This will break updating partition queue metrics, so we need to find a 
way to distinguish whether we're updating default partition queue metrics or 
partitioned queue metrics.)
* Mentioned before, can we update everywhere we're creating a new metricName 
for partition/user/queue metrics to use a delimiter? e.g. {noformat}String 
metricName = partition + this.queueName + userName;{noformat}. Otherwise 
there's a chance that these metric names could collide.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6492) Generate queue metrics for each partition

2020-05-05 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-6492:
---

Assignee: Manikandan R  (was: Jonathan Hung)

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6492) Generate queue metrics for each partition

2020-05-05 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-6492:
---

Assignee: Jonathan Hung  (was: Manikandan R)

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10201) Make AMRMProxyPolicy aware of SC load

2020-05-05 Thread Young Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen updated YARN-10201:
--
Attachment: YARN-10201.v7.patch

> Make AMRMProxyPolicy aware of SC load
> -
>
> Key: YARN-10201
> URL: https://issues.apache.org/jira/browse/YARN-10201
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy
>Reporter: Young Chen
>Assignee: Young Chen
>Priority: Major
> Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch, 
> YARN-10201.v2.patch, YARN-10201.v3.patch, YARN-10201.v4.patch, 
> YARN-10201.v5.patch, YARN-10201.v6.patch, YARN-10201.v7.patch
>
>
> LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when 
> splitting resource requests. We propose changes to the policy so that it 
> receives feedback from SCs and can load balance requests across the federated 
> cluster.
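
Purely as an illustration of the idea above, here is a hypothetical sketch of
load-aware splitting; the class name, the "pending containers" load metric, and the
inverse-load weighting are assumptions made for this sketch, not the actual YARN-10201
design:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class LoadAwareSplitSketch {
  /**
   * Split numContainers across subclusters, giving each a share inversely
   * proportional to its reported load (here, pending containers). Purely
   * illustrative; not the LocalityMulticastAMRMProxyPolicy implementation.
   */
  static Map<String, Integer> split(Map<String, Integer> pendingBySubCluster,
      int numContainers) {
    Map<String, Double> weights = new LinkedHashMap<>();
    double total = 0.0;
    for (Map.Entry<String, Integer> e : pendingBySubCluster.entrySet()) {
      double w = 1.0 / (1 + e.getValue());   // lighter load => larger weight
      weights.put(e.getKey(), w);
      total += w;
    }
    Map<String, Integer> allocation = new LinkedHashMap<>();
    int assigned = 0;
    for (Map.Entry<String, Double> e : weights.entrySet()) {
      int share = (int) Math.floor(numContainers * e.getValue() / total);
      allocation.put(e.getKey(), share);
      assigned += share;
    }
    // Hand any rounding remainder to the least loaded subcluster.
    if (assigned < numContainers && !allocation.isEmpty()) {
      String least = weights.entrySet().stream()
          .max(Map.Entry.comparingByValue()).get().getKey();
      allocation.merge(least, numContainers - assigned, Integer::sum);
    }
    return allocation;
  }

  public static void main(String[] args) {
    Map<String, Integer> pending = new LinkedHashMap<>();
    pending.put("sc1", 0);    // idle
    pending.put("sc2", 100);  // heavily loaded
    System.out.println(split(pending, 10)); // most of the 10 containers go to sc1
  }
}
{code}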



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10252) Allow adjusting vCore weight in CPU cgroup strict mode

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100320#comment-17100320
 ] 

Hadoop QA commented on YARN-10252:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 
59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
28s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 217 unchanged - 0 fixed = 218 total (was 217) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 14m  
9s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
5s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
12s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 
44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  3m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/25988/artifact/

[jira] [Commented] (YARN-10252) Allow adjusting vCore weight in CPU cgroup strict mode

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100182#comment-17100182
 ] 

Hadoop QA commented on YARN-10252:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
31s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
58s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
11s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 216 unchanged - 0 fixed = 218 total (was 216) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 16m 
39s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
2s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 23m 
47s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
51s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 Server

[jira] [Commented] (YARN-8959) TestContainerResizing fails randomly

2020-05-05 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100138#comment-17100138
 ] 

Jonathan Turner Eagles commented on YARN-8959:
--

[~ahussein], thanks for taking up this issue. It's always nice to improve the 
stability of tests. I see you created a specialized waitFor condition to 
handle this scenario. Your description and analysis are very helpful in 
understanding this test failure. I can see that the dispatcher will wait if it 
tries to take from the queue (a LinkedBlockingQueue in this case) while it is 
empty, possibly returning early. However, in addition to the waitForThreadToWait 
condition, there is also DrainDispatcher.await(). This wait condition 
synchronously handles events, and await() will return when the event has 1) been 
taken from the queue and 2) been handled. This wait condition is the most 
popular in the code base and seems sufficient to handle this case.

Could waitForThreadToWait() be switched to await()? I think it will be the easiest 
to understand, as there is already a large precedent and readers of the code 
will be familiar with it. It will also be less specialized code to maintain in the 
code base.

Perhaps I have missed something and await() is insufficient. Let me know.
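
A minimal sketch of the await() pattern described above, assuming standard
DrainDispatcher service wiring (handler registration and the events under test are
elided; none of this is taken from the patch):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.event.DrainDispatcher;

public class DrainDispatcherAwaitSketch {
  public static void main(String[] args) {
    DrainDispatcher dispatcher = new DrainDispatcher();
    dispatcher.init(new Configuration());
    dispatcher.start();

    // ... register event handlers and dispatch the events under test here ...

    // Unlike only waiting for the dispatcher thread to block, await() returns once
    // every queued event has been taken from the queue and handled, so assertions
    // made after this point see the fully processed state.
    dispatcher.await();

    // ... assertions on the resulting state go here ...

    dispatcher.stop();
  }
}
{code}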

> TestContainerResizing fails randomly
> 
>
> Key: YARN-8959
> URL: https://issues.apache.org/jira/browse/YARN-8959
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: YARN-8959-branch-2.10.002.patch, 
> YARN-8959-branch-2.10.003.patch, YARN-8959.001.patch, YARN-8959.002.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer
> {code}
> testSimpleDecreaseContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.348 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<3072>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testSimpleDecreaseContainer(TestContainerResizing.java:210)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted
> {code}
> testIncreaseContainerUnreservedWhenContainerCompleted(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.445 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<7168>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1011)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenContainerCompleted(TestContainerResizing.java:729)
> {code}
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer
> {code}
> testExcessiveReservationWhenDecreaseSameContainer(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing)
>   Time elapsed: 0.321 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<2048>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.checkUsedResource(TestContainerResizing.java:1015)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testExcessiveReservationWhenDecreaseSameContainer(TestContainerResizing.java:623)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.Fram

[jira] [Commented] (YARN-10252) Allow adjusting vCore weight in CPU cgroup strict mode

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099963#comment-17099963
 ] 

Hadoop QA commented on YARN-10252:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
25m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m  
5s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  5s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 17 new + 216 unchanged - 0 fixed = 233 total (was 216) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 19m 
18s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
3s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
21s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 Serve

[jira] [Commented] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099929#comment-17099929
 ] 

Hudson commented on YARN-10257:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18218 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18218/])
YARN-10257. FS-CS converter: skip increment properties for mem/vcores (snemeth: 
rev cb6399c1095af52112cbf4356572d99923d694ae)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/converter/TestFSYarnSiteConverter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/converter/TestFSConfigToCSConfigConverterMain.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/converter/FSConfigToCSConfigConverter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/converter/FSYarnSiteConverter.java


> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
>     FSQueue rootQueue = fs.getQueueManager().getRootQueue();
>     AllocationConfiguration allocConf = fs.getAllocationConfiguration();
>     String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
>     if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>       return true;
>     } else {
>       return isDrfUsedOnQueueLevel(rootQueue);
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10257:
--
Fix Version/s: 3.3.1

> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
>     FSQueue rootQueue = fs.getQueueManager().getRootQueue();
>     AllocationConfiguration allocConf = fs.getAllocationConfiguration();
>     String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
>     if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>       return true;
>     } else {
>       return isDrfUsedOnQueueLevel(rootQueue);
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099915#comment-17099915
 ] 

Szilard Nemeth commented on YARN-10257:
---

Thanks [~pbacsko],
Committed this to trunk and branch-3.3.0.
Thanks [~adam.antal] for the review.
Resolving jira.

> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
>     FSQueue rootQueue = fs.getQueueManager().getRootQueue();
>     AllocationConfiguration allocConf = fs.getAllocationConfiguration();
>     String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
>     if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>       return true;
>     } else {
>       return isDrfUsedOnQueueLevel(rootQueue);
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10257:
--
Fix Version/s: 3.4.0

> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
>     FSQueue rootQueue = fs.getQueueManager().getRootQueue();
>     AllocationConfiguration allocConf = fs.getAllocationConfiguration();
>     String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
>     if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>       return true;
>     } else {
>       return isDrfUsedOnQueueLevel(rootQueue);
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-05-05 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099881#comment-17099881
 ] 

Prabhu Joseph commented on YARN-10259:
--

[~Tao Yang] Can you validate this issue and give some pointers on how to fix it? 
Thanks.

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement
> ---
>
> Key: YARN-10259
> URL: https://issues.apache.org/jira/browse/YARN-10259
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: REPRO_TEST.patch
>
>
> Reserved Containers are not allocated from the available space of other nodes 
> in CandidateNodeSet in MultiNodePlacement. 
> *Repro:*
> 1. MultiNode Placement Enabled.
> 2. Two nodes h1 and h2 with 8GB
> 3. Submit app1 AM (5GB) which gets placed in h1 and app2 AM (5GB) which gets 
> placed in h2.
> 4. Submit app3 AM which is reserved in h1
> 5. Kill app2 which frees space in h2.
> 6. app3 AM never gets ALLOCATED
> RM logs show the YARN-8127 fix rejecting the allocation proposal for app3's AM on 
> h2, as it expects the assignment to be on the same node where the reservation 
> happened.
> {code}
> 2020-05-05 18:49:37,264 DEBUG [AsyncDispatcher event handler] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:commonReserve(573)) - Application attempt 
> appattempt_1588684773609_0003_01 reserved container 
> container_1588684773609_0003_01_01 on node host: h1:1234 #containers=1 
> available= used=. This attempt 
> currently has 1 reserved containers at priority 0; currentReservation 
> 
> 2020-05-05 18:49:37,264 INFO  [AsyncDispatcher event handler] 
> fica.FiCaSchedulerApp (FiCaSchedulerApp.java:apply(670)) - Reserved 
> container=container_1588684773609_0003_01_01, on node=host: h1:1234 
> #containers=1 available= used= 
> with resource=
>RESERVED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h1:1234; Resource=)]
>
> 2020-05-05 18:49:38,283 DEBUG [Time-limited test] 
> allocator.RegularContainerAllocator 
> (RegularContainerAllocator.java:assignContainer(514)) - assignContainers: 
> node=h2 application=application_1588684773609_0003 priority=0 
> pendingAsk=,repeat=1> 
> type=OFF_SWITCH
> 2020-05-05 18:49:38,285 DEBUG [Time-limited test] fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:commonCheckContainerAllocation(371)) - Try to allocate 
> from reserved container container_1588684773609_0003_01_01, but node is 
> not reserved
>ALLOCATED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h2:1234; Resource=)]
> {code}
> After reverting the YARN-8127 fix, it works. Attached a testcase which 
> reproduces the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-05-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10259:
-
Attachment: REPRO_TEST.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement
> ---
>
> Key: YARN-10259
> URL: https://issues.apache.org/jira/browse/YARN-10259
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: REPRO_TEST.patch
>
>
> Reserved Containers are not allocated from the available space of other nodes 
> in CandidateNodeSet in MultiNodePlacement. 
> *Repro:*
> 1. MultiNode Placement Enabled.
> 2. Two nodes h1 and h2 with 8GB
> 3. Submit app1 AM (5GB) which gets placed in h1 and app2 AM (5GB) which gets 
> placed in h2.
> 4. Submit app3 AM which is reserved in h1
> 5. Kill app2 which frees space in h2.
> 6. app3 AM never gets ALLOCATED
> RM logs show the YARN-8127 fix rejecting the allocation proposal for app3 AM on 
> h2, as it expects the assignment to be on the same node where the reservation 
> happened.
> {code}
> 2020-05-05 18:49:37,264 DEBUG [AsyncDispatcher event handler] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:commonReserve(573)) - Application attempt 
> appattempt_1588684773609_0003_01 reserved container 
> container_1588684773609_0003_01_01 on node host: h1:1234 #containers=1 
> available= used=. This attempt 
> currently has 1 reserved containers at priority 0; currentReservation 
> 
> 2020-05-05 18:49:37,264 INFO  [AsyncDispatcher event handler] 
> fica.FiCaSchedulerApp (FiCaSchedulerApp.java:apply(670)) - Reserved 
> container=container_1588684773609_0003_01_01, on node=host: h1:1234 
> #containers=1 available= used= 
> with resource=
>RESERVED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h1:1234; Resource=)]
>
> 2020-05-05 18:49:38,283 DEBUG [Time-limited test] 
> allocator.RegularContainerAllocator 
> (RegularContainerAllocator.java:assignContainer(514)) - assignContainers: 
> node=h2 application=application_1588684773609_0003 priority=0 
> pendingAsk=,repeat=1> 
> type=OFF_SWITCH
> 2020-05-05 18:49:38,285 DEBUG [Time-limited test] fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:commonCheckContainerAllocation(371)) - Try to allocate 
> from reserved container container_1588684773609_0003_01_01, but node is 
> not reserved
>ALLOCATED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h2:1234; Resource=)]
> {code}
> After reverting the fix of YARN-8127, it works. Attached a testcase which 
> reproduces the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-05-05 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10259:


 Summary: Reserved Containers not allocated from available space of 
other nodes in CandidateNodeSet in MultiNodePlacement
 Key: YARN-10259
 URL: https://issues.apache.org/jira/browse/YARN-10259
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.2.0, 3.3.0
Reporter: Prabhu Joseph


Reserved Containers are not allocated from the available space of other nodes 
in CandidateNodeSet in MultiNodePlacement. 

*Repro:*

1. MultiNode Placement Enabled.
2. Two nodes h1 and h2 with 8GB
3. Submit app1 AM (5GB) which gets placed in h1 and app2 AM (5GB) which gets 
placed in h2.
4. Submit app3 AM which is reserved in h1
5. Kill app2 which frees space in h2.
6. app3 AM never gets ALLOCATED

RM logs show the YARN-8127 fix rejecting the allocation proposal for app3 AM on h2, 
as it expects the assignment to be on the same node where the reservation happened.

{code}
2020-05-05 18:49:37,264 DEBUG [AsyncDispatcher event handler] 
scheduler.SchedulerApplicationAttempt 
(SchedulerApplicationAttempt.java:commonReserve(573)) - Application attempt 
appattempt_1588684773609_0003_01 reserved container 
container_1588684773609_0003_01_01 on node host: h1:1234 #containers=1 
available= used=. This attempt 
currently has 1 reserved containers at priority 0; currentReservation 


2020-05-05 18:49:37,264 INFO  [AsyncDispatcher event handler] 
fica.FiCaSchedulerApp (FiCaSchedulerApp.java:apply(670)) - Reserved 
container=container_1588684773609_0003_01_01, on node=host: h1:1234 
#containers=1 available= used= 
with resource=
 RESERVED=[(Application=appattempt_1588684773609_0003_01; 
Node=h1:1234; Resource=)]
 
2020-05-05 18:49:38,283 DEBUG [Time-limited test] 
allocator.RegularContainerAllocator 
(RegularContainerAllocator.java:assignContainer(514)) - assignContainers: 
node=h2 application=application_1588684773609_0003 priority=0 
pendingAsk=,repeat=1> 
type=OFF_SWITCH

2020-05-05 18:49:38,285 DEBUG [Time-limited test] fica.FiCaSchedulerApp 
(FiCaSchedulerApp.java:commonCheckContainerAllocation(371)) - Try to allocate 
from reserved container container_1588684773609_0003_01_01, but node is not 
reserved
 ALLOCATED=[(Application=appattempt_1588684773609_0003_01; 
Node=h2:1234; Resource=)]
{code}

After reverting the fix of YARN-8127, it works. Attached a testcase which 
reproduces the issue.
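
To illustrate the behaviour described above, here is a minimal, self-contained sketch 
of the kind of node check that produces the "but node is not reserved" rejection. The 
class and method names are hypothetical; this is not the actual 
FiCaSchedulerApp#commonCheckContainerAllocation code.

{code:java}
// Illustrative sketch only (hypothetical names), not actual CapacityScheduler code.
public class ReservationNodeCheck {

  // With multi-node placement, the allocation proposal for a reserved
  // container can target a node (h2) other than the one the reservation
  // was made on (h1). A strict node-equality check like this rejects that
  // proposal, matching the "but node is not reserved" log line above, so
  // the reserved AM container is never allocated even though h2 has space.
  static boolean canAllocateFromReservation(String proposedNode,
      String reservedNode) {
    return proposedNode.equals(reservedNode);
  }

  public static void main(String[] args) {
    // Repro from the description: reservation on h1, free space on h2.
    System.out.println(canAllocateFromReservation("h2", "h1")); // false
  }
}
{code}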



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099854#comment-17099854
 ] 

Hadoop QA commented on YARN-10257:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 32s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
41s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 27s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/25985/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10257 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13002063/YARN-10257-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient xml findbugs checkstyle |
| uname | Linux d4c6461b53b6 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven

[jira] [Updated] (YARN-10252) Allow adjusting vCore weight in CPU cgroup strict mode

2020-05-05 Thread Zbigniew Baranowski (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Baranowski updated YARN-10252:
---
Attachment: (was: YARN.patch)

> Allow adjusting vCore weight in CPU cgroup strict mode
> --
>
> Key: YARN-10252
> URL: https://issues.apache.org/jira/browse/YARN-10252
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.2.1
>Reporter: Zbigniew Baranowski
>Priority: Major
> Attachments: YARN.patch
>
>
> Currently, with CPU cgroup strict mode enabled on the NodeManager, when CPU 
> resources are overcommitted (e.g. 8 vCores on a 4-core machine), the amount 
> of CPU time that a container gets for each requested vCore is automatically 
> downscaled with the formula: vCoreCPUTime = totalPhysicalCoresOnNM / 
> coresConfiguredForNM. So containers are throttled on CPU even if there are 
> spare cores available on the NM (e.g. with 8 vCores configured on a 4-core 
> machine, a container that asked for 2 cores is effectively allowed to use 
> only one physical core). The same happens if the CPU resource cap is enabled 
> (via yarn.nodemanager.resource.percentage-physical-cpu-limit); in this case 
> totalCoresOnNode (= coresOnNode * percentage-physical-cpu-limit) is scaled 
> down by the cap. So for example, if the cap is 80%, a container that asked 
> for 2 cores is allowed to use at most the equivalent of 1.6 physical cores, 
> regardless of the current NM load.
> Both situations may lead to underuse of available resources. In some cases 
> an administrator may want to overcommit resources because applications are 
> statically over-allocating resources without fully using them; slowing down 
> all containers is not the intention there.
> Therefore it would be very useful if administrators had control over how 
> vCores are mapped to CPU time on NodeManagers in strict mode when CPU 
> resources are overcommitted and/or physical-cpu-limit is enabled.
> This could potentially be done with a parameter like 
> yarn.nodemanager.resource.strict-vcore-weight that controls the vCore to 
> physical-core time mapping. E.g. a value of 1 means a one-to-one mapping, 
> 1.2 means that a single vCore can use up to 120% of a physical core (handy 
> for PySpark users), and -1 (the default) disables the feature and keeps the 
> auto-scaling.
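
To make the proposal concrete, below is a rough sketch of how such a weight could enter 
the strict-mode CPU-time computation. The property name 
yarn.nodemanager.resource.strict-vcore-weight is only the one suggested in the 
description above, and the method names and formula are illustrative, not existing 
NodeManager code.

{code:java}
// Illustrative sketch of the proposed behaviour, not actual NodeManager code.
public class StrictVcoreQuota {

  // Current strict-mode behaviour (simplified): each vCore is scaled down to
  // physicalCores / configuredVcores of a physical core, e.g. 4 physical
  // cores with 8 configured vCores -> 0.5 physical core per vCore.
  static double autoScaledCoresPerVcore(int physicalCores, int configuredVcores) {
    return (double) physicalCores / configuredVcores;
  }

  // Proposed: let the administrator pin the mapping with a weight; -1 keeps
  // the auto-scaling above, 1.0 means one physical core per vCore, and 1.2
  // would let a single vCore use up to 120% of a physical core.
  static double coresPerVcore(double strictVcoreWeight, int physicalCores,
      int configuredVcores) {
    if (strictVcoreWeight < 0) {
      return autoScaledCoresPerVcore(physicalCores, configuredVcores);
    }
    return strictVcoreWeight;
  }

  public static void main(String[] args) {
    // A 2-vCore container on a 4-core NM configured with 8 vCores:
    System.out.println(2 * coresPerVcore(-1, 4, 8));  // 1.0 core (today)
    System.out.println(2 * coresPerVcore(1.0, 4, 8)); // 2.0 cores (weight = 1)
  }
}
{code}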



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099802#comment-17099802
 ] 

Hadoop QA commented on YARN-10246:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
24s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
22s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in 
trunk has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 41s{color} | {color:orange} root: The patch generated 2 new + 75 unchanged - 
0 fixed = 77 total (was 75) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 35s{color} 
| {color:red} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
57s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
55s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |
|   | hadoop.io.compress.TestCompressorDecompressor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https:/

[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-05-05 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099803#comment-17099803
 ] 

Hudson commented on YARN-10160:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18217 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18217/])
YARN-10160. Add auto queue creation related configs to (snemeth: rev 
0debe55d6cf36b358c86c27d43991aa44baef4f2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/WebServicesTestUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerQueueInfo.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/LeafQueueTemplateInfo.java


> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch, 
> YARN-10160-007.patch, YARN-10160-008.patch, YARN-10160-009.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity..auto-create-child-queue.enabled
> yarn.scheduler.capacity..leaf-queue-template.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-05-05 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10160:
--
Fix Version/s: 3.4.0

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch, 
> YARN-10160-007.patch, YARN-10160-008.patch, YARN-10160-009.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity..auto-create-child-queue.enabled
> yarn.scheduler.capacity..leaf-queue-template.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099794#comment-17099794
 ] 

Szilard Nemeth commented on YARN-10160:
---

[~prabhujoseph],
While cherry-picking to branch-3.3.0, I faced some conflicts.
Could you please check those and upload a patch to that branch?

Thanks.

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch, 
> YARN-10160-007.patch, YARN-10160-008.patch, YARN-10160-009.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity..auto-create-child-queue.enabled
> yarn.scheduler.capacity..leaf-queue-template.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-05-05 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099792#comment-17099792
 ] 

Prabhu Joseph commented on YARN-10160:
--

Thanks [~snemeth].

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch, 
> YARN-10160-007.patch, YARN-10160-008.patch, YARN-10160-009.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity..auto-create-child-queue.enabled
> yarn.scheduler.capacity..leaf-queue-template.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099786#comment-17099786
 ] 

Szilard Nemeth commented on YARN-10160:
---

Hi [~prabhujoseph],
Latest patch LGTM, committed to trunk.
Thanks [~adam.antal] for the review.

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch, 
> YARN-10160-007.patch, YARN-10160-008.patch, YARN-10160-009.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity..auto-create-child-queue.enabled
> yarn.scheduler.capacity..leaf-queue-template.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099779#comment-17099779
 ] 

Szilard Nemeth edited comment on YARN-9460 at 5/5/20, 11:07 AM:


Hi [~BilwaST],
Oh, I see what your question is now.
I think we can initialize the appropriate type of QueueACLsManager based on 
instanceof checks.
Perhaps a better way to do it is to introduce a SchedulerType enum with 3 
values: fair, capacity, fifo (if we don't have a similar thing in the codebase 
currently). 
Anyway, this could complicate the code a bit, but at least it's a clean way to 
solve the problem, so I'll let you decide which is the better way.


was (Author: snemeth):
Hi [~BilwaST],
Oh, I see what your question is now.
I think we can initialize the appropriate type of QueueACLsManager based on 
instanceof checks.
Perhaps a better way to do it is to introduce a SchedulerType enum with 3 
values: fair, capacity, fifo (if we don't have a similar thing in the codebase 
currently). 
Anyway, this could complicate the code a bit more, so I'll let you decide which 
is the better way.

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099779#comment-17099779
 ] 

Szilard Nemeth commented on YARN-9460:
--

Hi [~BilwaST],
Oh, I see what your question is now.
I think we can initialize the appropriate type of QueueACLsManager based on 
instanceof checks.
Perhaps a better way to do it is to introduce a SchedulerType enum with 3 
values: fair, capacity, fifo (if we don't have a similar thing in the codebase 
currently). 
Anyway, this could complicate the code a bit more, so I'll let you decide which 
is the better way.
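
A minimal sketch of the enum idea, assuming nothing like it exists in the codebase yet 
(the enum name and factory method below are hypothetical):

{code:java}
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler;

// Hypothetical sketch, not existing code.
public enum SchedulerType {
  FAIR, CAPACITY, FIFO;

  // Keeps the instanceof checks in a single place; callers such as the
  // ACL managers can then switch on the enum value instead.
  public static SchedulerType of(ResourceScheduler scheduler) {
    if (scheduler instanceof CapacityScheduler) {
      return CAPACITY;
    } else if (scheduler instanceof FairScheduler) {
      return FAIR;
    }
    return FIFO;
  }
}
{code}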

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099776#comment-17099776
 ] 

Szilard Nemeth commented on YARN-10257:
---

Hi [~pbacsko],
LGTM, one nit only:
In 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.TestFSConfigToCSConfigConverterMain#testConvertFSConfigurationWithLongSwitches
There's this commented line: 

{code:java}
//verifyExitCode();
{code}

No need to upload a new patch just for this; I can remove this line while 
committing, if you agree.
Waiting for a green Jenkins run first.


> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
> FSQueue rootQueue = fs.getQueueManager().getRootQueue();
> AllocationConfiguration allocConf = fs.getAllocationConfiguration();
> String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
> if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>   return true;
> } else {
>   return isDrfUsedOnQueueLevel(rootQueue);
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-05-05 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099773#comment-17099773
 ] 

Bilwa S T commented on YARN-9460:
-

Hi [~snemeth] 

Thanks for the clarification. This is clear from the Jira description itself. My 
doubt was about initializing the QueueACLsManager. Do we initialize it based on 
the scheduler type, like below?
{code:java}
public static String getDefaultReservationSystem(
  ResourceScheduler scheduler) {
if (scheduler instanceof CapacityScheduler) {
  return CapacityReservationSystem.class.getName();
} else if (scheduler instanceof FairScheduler) {
  return FairReservationSystem.class.getName();
}
return null;
  }
{code}

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099767#comment-17099767
 ] 

Peter Bacsko commented on YARN-10257:
-

[~adam.antal]

1. "Do you plan to do something with regards to the first item from the 
description? (I mean the increment-allocation properties)."

Well, those properties are not needed during the conversion - so I simply 
removed all references to them (the original FS-CS document made me believe 
that those are necessary to convert, but in fact, they're not).

2. system-rules artifact --> thanks, it looks like a better solution, I 
switched to this.

> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
> FSQueue rootQueue = fs.getQueueManager().getRootQueue();
> AllocationConfiguration allocConf = fs.getAllocationConfiguration();
> String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
> if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>   return true;
> } else {
>   return isDrfUsedOnQueueLevel(rootQueue);
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099760#comment-17099760
 ] 

Szilard Nemeth commented on YARN-9460:
--

Hi [~BilwaST],
One example of using instanceof is here: 
https://github.com/apache/hadoop/blob/9c8236d04dfc3d4cefe7a00b63625f60ee232cfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java#L61
Sometimes, instanceof checks might suggest a bad code design.
My idea is to create 2 versions of QueueACLsManager: one for FairScheduler 
and one for CapacityScheduler.
This way, the checkAccess methods can be implemented differently, so the 
instanceof checks could be replaced with a proper object-oriented design.
Hope this makes sense.
If not, please feel free to ask follow-up questions.

Thanks.
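
A rough sketch of that shape, with hypothetical class names and a simplified 
checkAccess signature (the real method takes more arguments):

{code:java}
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

// Hypothetical sketch of the proposed design, not the actual classes.
public abstract class QueueACLsManager {

  // The single remaining instanceof check lives at creation time;
  // the per-call checks disappear from checkAccess itself.
  public static QueueACLsManager create(ResourceScheduler scheduler) {
    if (scheduler instanceof CapacityScheduler) {
      return new CapacityQueueACLsManager();
    }
    return new FairQueueACLsManager();
  }

  public abstract boolean checkAccess(String user, String queueName);
}

class CapacityQueueACLsManager extends QueueACLsManager {
  @Override
  public boolean checkAccess(String user, String queueName) {
    // CapacityScheduler-specific ACL evaluation would go here.
    return true;
  }
}

class FairQueueACLsManager extends QueueACLsManager {
  @Override
  public boolean checkAccess(String user, String queueName) {
    // FairScheduler-specific ACL evaluation would go here.
    return true;
  }
}
{code}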

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8631) YARN RM fails to add the application to the delegation token renewer on recovery

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099756#comment-17099756
 ] 

Szilard Nemeth edited comment on YARN-8631 at 5/5/20, 10:24 AM:


Hi [~umittal],
I can't see the junit test attached, just the log and the 001 patch that 
contains the production changes.
It's okay to attach the unit tests as part of YARN-8631.002.patch file.


was (Author: snemeth):
Hi [~umittal],
I can't see the junit test attached, just the log and the 001 patch that 
contains the production changes.

> YARN RM fails to add the application to the delegation token renewer on 
> recovery
> 
>
> Key: YARN-8631
> URL: https://issues.apache.org/jira/browse/YARN-8631
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Sanjay Divgi
>Assignee: Umesh Mittal
>Priority: Blocker
> Attachments: YARN-8631.001.patch, 
> hadoop-yarn-resourcemanager-ctr-e138-1518143905142-429059-01-04.log
>
>
> On an HA cluster we have observed that the YARN ResourceManager fails to add 
> the application to the delegation token renewer on recovery.
> Below is the error:
> {code:java}
> 2018-08-07 08:41:23,850 INFO security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:renewToken(635)) - Renewed delegation-token= 
> [Kind: TIMELINE_DELEGATION_TOKEN, Service: 172.27.84.192:8188, Ident: 
> (TIMELINE_DELEGATION_TOKEN owner=hrt_qa_hive_spark, renewer=yarn, realUser=, 
> issueDate=1533624642302, maxDate=1534229442302, sequenceNumber=18, 
> masterKeyId=4);exp=1533717683478; apps=[application_1533623972681_0001]]
> 2018-08-07 08:41:23,855 WARN security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppRecoverEvent(955)) - Unable to 
> add the application to the delegation token renewer on recovery.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:522)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleDTRenewerAppRecoverEvent(DelegationTokenRenewer.java:953)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:79)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:912)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8631) YARN RM fails to add the application to the delegation token renewer on recovery

2020-05-05 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099756#comment-17099756
 ] 

Szilard Nemeth commented on YARN-8631:
--

Hi [~umittal],
I can't see the junit test attached, just the log and the 001 patch that 
contains the production changes.

> YARN RM fails to add the application to the delegation token renewer on 
> recovery
> 
>
> Key: YARN-8631
> URL: https://issues.apache.org/jira/browse/YARN-8631
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Sanjay Divgi
>Assignee: Umesh Mittal
>Priority: Blocker
> Attachments: YARN-8631.001.patch, 
> hadoop-yarn-resourcemanager-ctr-e138-1518143905142-429059-01-04.log
>
>
> On an HA cluster we have observed that the YARN ResourceManager fails to add 
> the application to the delegation token renewer on recovery.
> Below is the error:
> {code:java}
> 2018-08-07 08:41:23,850 INFO security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:renewToken(635)) - Renewed delegation-token= 
> [Kind: TIMELINE_DELEGATION_TOKEN, Service: 172.27.84.192:8188, Ident: 
> (TIMELINE_DELEGATION_TOKEN owner=hrt_qa_hive_spark, renewer=yarn, realUser=, 
> issueDate=1533624642302, maxDate=1534229442302, sequenceNumber=18, 
> masterKeyId=4);exp=1533717683478; apps=[application_1533623972681_0001]]
> 2018-08-07 08:41:23,855 WARN security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppRecoverEvent(955)) - Unable to 
> add the application to the delegation token renewer on recovery.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:522)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleDTRenewerAppRecoverEvent(DelegationTokenRenewer.java:953)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:79)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:912)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10257) FS-CS converter: skip increment properties for mem/vcores and fix DRF check

2020-05-05 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10257:

Attachment: YARN-10257-002.patch

> FS-CS converter: skip increment properties for mem/vcores and fix DRF check
> ---
>
> Key: YARN-10257
> URL: https://issues.apache.org/jira/browse/YARN-10257
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10257-001.patch, YARN-10257-002.patch
>
>
> Two issues have been discovered during fs2cs testing:
> 1. The conversion of the allocation increment properties is not needed:
> {{yarn.scheduler.increment-allocation-mb}}
> {{yarn.scheduler.increment-allocation-vcores}}
> {{yarn.resource-types.memory-mb.increment-allocation}}
> {{yarn.resource-types.vcores.increment-allocation}}
> 2. The following piece of code is incorrect - the default scheduling policy 
> can be different from DRF, which is a problem if DRF is used everywhere else:
> {code}
>   private boolean isDrfUsed(FairScheduler fs) {
> FSQueue rootQueue = fs.getQueueManager().getRootQueue();
> AllocationConfiguration allocConf = fs.getAllocationConfiguration();
> String defaultPolicy = allocConf.getDefaultSchedulingPolicy().getName();
> if (DominantResourceFairnessPolicy.NAME.equals(defaultPolicy)) {
>   return true;
> } else {
>   return isDrfUsedOnQueueLevel(rootQueue);
> }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099735#comment-17099735
 ] 

Bibin Chundatt edited comment on YARN-10246 at 5/5/20, 9:48 AM:


[~dmmkr] 

In the case of a non-secure cluster this could work with different property 
names, but I am not sure how this would work in a secure cluster.
Does Curator support multiple Kerberos configurations in the same process (the 
RM is the process here)? The RM has to connect to the Federation state store 
and also to the RM state store.

IIRC the Curator version we use doesn't support that. 


was (Author: bibinchundatt):
[~dmmkr] 

In the case of a non-secure cluster this could work with a different 
configuration file, but I am not sure how this would work in a secure cluster.
Does Curator support multiple Kerberos configurations in the same process (the 
RM is the process here)? The RM has to connect to the Federation state store 
and also to the RM state store.

IIRC the Curator version we use doesn't support that. 

> Enable YARN Router to have a dedicated Zookeeper
> 
>
> Key: YARN-10246
> URL: https://issues.apache.org/jira/browse/YARN-10246
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10246.001.patch, YARN-10246.002.patch
>
>
> Currently, we have a single parameter, hadoop.zk.address, for the Router and 
> the ResourceManager. Due to this we need to have the FederationStateStore and 
> the RMStateStore on the same ZooKeeper instance. 
> With the above topology there can be a heavy load on ZooKeeper, since all 
> subcluster RMs will write to a single ZooKeeper.
> So, if we introduce a new configuration such as hadoop.federation.zk.address, 
> we can have the FederationStateStore on a dedicated ZooKeeper.
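
For illustration, the proposed setup would look something like this in the Router's 
configuration; hadoop.federation.zk.address is the new property suggested above and 
the ensemble host names are placeholders:

{code}
<!-- Proposed: dedicated ZooKeeper ensemble for the FederationStateStore -->
<property>
  <name>hadoop.federation.zk.address</name>
  <value>fed-zk1:2181,fed-zk2:2181,fed-zk3:2181</value>
</property>

<!-- Existing property, still used for the RMStateStore -->
<property>
  <name>hadoop.zk.address</name>
  <value>rm-zk1:2181,rm-zk2:2181,rm-zk3:2181</value>
</property>
{code}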



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099735#comment-17099735
 ] 

Bibin Chundatt commented on YARN-10246:
---

[~dmmkr] 

In the case of a non-secure cluster this could work with a different 
configuration file, but I am not sure how this would work in a secure cluster.
Does Curator support multiple Kerberos configurations in the same process (the 
RM is the process here)? The RM has to connect to the Federation state store 
and also to the RM state store.

IIRC the Curator version we use doesn't support that. 

> Enable YARN Router to have a dedicated Zookeeper
> 
>
> Key: YARN-10246
> URL: https://issues.apache.org/jira/browse/YARN-10246
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10246.001.patch, YARN-10246.002.patch
>
>
> Currently, we have a single parameter, hadoop.zk.address, for the Router and 
> the ResourceManager. Due to this we need to have the FederationStateStore and 
> the RMStateStore on the same ZooKeeper instance. 
> With the above topology there can be a heavy load on ZooKeeper, since all 
> subcluster RMs will write to a single ZooKeeper.
> So, if we introduce a new configuration such as hadoop.federation.zk.address, 
> we can have the FederationStateStore on a dedicated ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099701#comment-17099701
 ] 

Hadoop QA commented on YARN-10246:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 39m  
2s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
56s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
51s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in 
trunk has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 43s{color} | {color:orange} root: The patch generated 2 new + 75 unchanged - 
0 fixed = 77 total (was 75) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 15s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
57s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
52s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}197m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.ftp.TestFTPFileSystem |
|   | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |
|   | hadoop.io.compress.TestCompressorDecompressor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Clien

[jira] [Updated] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread D M Murali Krishna Reddy (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

D M Murali Krishna Reddy updated YARN-10246:

Attachment: YARN-10246.002.patch

> Enable YARN Router to have a dedicated Zookeeper
> 
>
> Key: YARN-10246
> URL: https://issues.apache.org/jira/browse/YARN-10246
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10246.001.patch, YARN-10246.002.patch
>
>
> Currently, we have a single parameter, hadoop.zk.address, for the Router and 
> the ResourceManager. Due to this we need to have the FederationStateStore and 
> the RMStateStore on the same ZooKeeper instance. 
> With the above topology there can be a heavy load on ZooKeeper, since all 
> subcluster RMs will write to a single ZooKeeper.
> So, if we introduce a new configuration such as hadoop.federation.zk.address, 
> we can have the FederationStateStore on a dedicated ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6526) Refactoring SQLFederationStateStore by avoiding to recreate a connection at every call

2020-05-05 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099662#comment-17099662
 ] 

Bilwa S T commented on YARN-6526:
-

cc [~inigoiri] [~brahmareddy]

> Refactoring SQLFederationStateStore by avoiding to recreate a connection at 
> every call
> --
>
> Key: YARN-6526
> URL: https://issues.apache.org/jira/browse/YARN-6526
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Giovanni Matteo Fumarola
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-6526.001.patch, YARN-6526.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10246) Enable Yarn Router to have a dedicated Zookeeper

2020-05-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099589#comment-17099589
 ] 

Íñigo Goiri commented on YARN-10246:


Keep in mind in the past we did HADOOP-14741 to unify the settings for ZK.

> Enable Yarn Router to have a dedicated Zookeeper
> 
>
> Key: YARN-10246
> URL: https://issues.apache.org/jira/browse/YARN-10246
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10246.001.patch
>
>
> Currently, we have a single parameter, hadoop.zk.address, for the Router and 
> the ResourceManager. Due to this we need to have the FederationStateStore and 
> the RMStateStore on the same ZooKeeper instance. 
> With the above topology there can be a heavy load on ZooKeeper, since all 
> subcluster RMs will write to a single ZooKeeper.
> So, if we introduce a new configuration such as hadoop.federation.zk.address, 
> we can have the FederationStateStore on a dedicated ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10246) Enable YARN Router to have a dedicated Zookeeper

2020-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-10246:
---
Summary: Enable YARN Router to have a dedicated Zookeeper  (was: Enable 
Yarn Router to have a dedicated Zookeeper)

> Enable YARN Router to have a dedicated Zookeeper
> 
>
> Key: YARN-10246
> URL: https://issues.apache.org/jira/browse/YARN-10246
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10246.001.patch
>
>
> Currently, we have a single parameter, hadoop.zk.address, for the Router and 
> the ResourceManager. Due to this we need to have the FederationStateStore and 
> the RMStateStore on the same ZooKeeper instance. 
> With the above topology there can be a heavy load on ZooKeeper, since all 
> subcluster RMs will write to a single ZooKeeper.
> So, if we introduce a new configuration such as hadoop.federation.zk.address, 
> we can have the FederationStateStore on a dedicated ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org