date:20171018

[jira] [Comment Edited] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-10-18 Thread Varun Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210659#comment-16210659
 ] 

Varun Saxena edited comment on YARN-7346 at 10/19/17 6:44 AM:
--

I have created the YARN-7055 branch for the ATSv2 beta effort.
I will fork a branch off branch-2 once branch-2.9 has been forked off branch-2.

cc [~haibo.chen]


was (Author: varun_saxena):
I have created the YARN-7055 branch for the ATSv2 beta effort.
I will fork a branch off branch-2 once branch-2.9 has been forked off branch-2.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-10-18 Thread Varun Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210659#comment-16210659
 ] 

Varun Saxena commented on YARN-7346:


I have created the YARN-7055 branch for the ATSv2 beta effort.
I will fork a branch off branch-2 once branch-2.9 has been forked off branch-2.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.




[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations

2017-10-18 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210615#comment-16210615
 ] 

Haibo Chen commented on YARN-4511:
--

It looks like the build has been failing due to a 404 when downloading the Oracle JDK.

> Common scheduler changes supporting scheduler-specific implementations
> --
>
> Key: YARN-4511
> URL: https://issues.apache.org/jira/browse/YARN-4511
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Haibo Chen
> Attachments: YARN-4511-YARN-1011.00.patch, 
> YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, 
> YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch
>
>





[jira] [Commented] (YARN-7353) Docker permitted volumes don't properly check for directories

2017-10-18 Thread Varun Vasudev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210587#comment-16210587
 ] 

Varun Vasudev commented on YARN-7353:
-

Thanks for the patch [~ebadger]. A couple of tests still fail on CentOS 7 
because /bin is symlinked to /usr/bin: test_normalize_mounts and 
test_add_rw_mounts.
Here are the changes I made for YARN-7344:
{noformat}
   TEST_F(TestDockerUtil, test_normalize_mounts) {
 const int entries = 4;
-const char *permitted_mounts[] = {"/home", "/usr", "/bin/ls", NULL};
-const char *expected[] = {"/home/", "/usr/", "/bin/ls", NULL};
+const char *permitted_mounts[] = {"/home", "/usr", "/usr/bin/yes", NULL};
+const char *expected[] = {"/home/", "/usr/", "/usr/bin/yes", NULL};
 char **ptr = static_cast<char **>(malloc(entries * sizeof(char *)));
 for (int i = 0; i < entries; ++i) {
   if (permitted_mounts[i] != NULL) {
@@ -659,22 +659,22 @@ namespace ContainerExecutor {
 const int buff_len = 1024;
 char buff[buff_len];
 int ret = 0;
-std::string container_executor_cfg_contents = "[docker]\n  docker.allowed.rw-mounts=/usr,/var,/bin/ls,..\n  "
-  "docker.allowed.ro-mounts=/bin/cat";
+std::string container_executor_cfg_contents = "[docker]\n  docker.allowed.rw-mounts=/opt,/var,/usr/bin/yes,..\n  "
+  "docker.allowed.ro-mounts=/usr/bin/cut";
 std::vector<std::pair<std::string, std::string> > file_cmd_vec;
 file_cmd_vec.push_back(std::make_pair(
 "[docker-command-execution]\n  docker-command=run\n  rw-mounts=/var:/var", "-v '/var:/var' "));
 file_cmd_vec.push_back(std::make_pair(
 "[docker-command-execution]\n  docker-command=run\n  rw-mounts=/var/:/var/", "-v '/var/:/var/' "));
 file_cmd_vec.push_back(std::make_pair(
-"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/usr:/usr", "-v '/usr:/usr' "));
+"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/opt:/opt", "-v '/opt:/opt' "));
 file_cmd_vec.push_back(std::make_pair(
-"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/usr/:/usr", "-v '/usr/:/usr' "));
+"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/opt/:/opt", "-v '/opt/:/opt' "));
 file_cmd_vec.push_back(std::make_pair(
-"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/bin/ls:/bin/ls", "-v '/bin/ls:/bin/ls' "));
+"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/usr/bin/yes:/usr/bin/yes", "-v '/usr/bin/yes:/usr/bin/yes' "));
 file_cmd_vec.push_back(std::make_pair(
-"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/usr/bin:/mydisk1,/var/log/:/mydisk2",
-"-v '/usr/bin:/mydisk1' -v '/var/log/:/mydisk2' "));
+"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/opt:/mydisk1,/var/log/:/mydisk2",
+"-v '/opt:/mydisk1' -v '/var/log/:/mydisk2' "));
 file_cmd_vec.push_back(std::make_pair(
 "[docker-command-execution]\n  docker-command=run\n", ""));
 write_container_executor_cfg(container_executor_cfg_contents);
@@ -708,7 +708,7 @@ namespace ContainerExecutor {
 "[docker-command-execution]\n  docker-command=run\n  rw-mounts=/home:/home",
 static_cast<int>(INVALID_DOCKER_RW_MOUNT)));
 bad_file_cmds_vec.push_back(std::make_pair(
-"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/bin/cat:/bin/cat",
+"[docker-command-execution]\n  docker-command=run\n  rw-mounts=/usr/bin/cut:/usr/bin/cut",
 static_cast<int>(INVALID_DOCKER_RW_MOUNT)));
 bad_file_cmds_vec.push_back(std::make_pair(
 "[docker-command-execution]\n  docker-command=run\n  rw-mounts=/blah:/blah",
{noformat}

Can you incorporate them into your patch? Thanks!
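
For context, the issue itself concerns a permitted mount such as "/home" not being treated as a directory unless it is written as "/home/". The sketch below shows one way such a prefix check could behave, written in Java purely for illustration (the real check lives in docker-util.c). The class and method names are hypothetical, and the sketch assumes every permitted entry may act as a directory; it does not stat the filesystem or resolve symlinks.

```java
final class MountCheckSketch {
  // Returns true if normalizedPath equals a permitted mount or lies beneath it.
  // Assumption: each permitted entry is treated as a potential directory,
  // with or without a trailing slash.
  static boolean isPermitted(String normalizedPath, String[] permittedMounts) {
    for (String mount : permittedMounts) {
      if (mount.isEmpty()) {
        continue;
      }
      if (normalizedPath.equals(mount)) {
        return true;
      }
      // Append a slash so that "/home" does not match "/homework".
      String dirPrefix = mount.endsWith("/") ? mount : mount + "/";
      if (normalizedPath.startsWith(dirPrefix)) {
        return true;
      }
    }
    return false;
  }
}
```

With this shape, "/home" and "/home/" both permit "/home/user/data", while "/homework" is still rejected.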

> Docker permitted volumes don't properly check for directories
> -
>
> Key: YARN-7353
> URL: https://issues.apache.org/jira/browse/YARN-7353
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: YARN-7353.001.patch, YARN-7353.002.patch
>
>
> {noformat:title=docker-util.c:check_mount_permitted()}
> // directory check
> permitted_mount_len = strlen(permitted_mounts[i]);
> if (permitted_mount_len > 0
> && permitted_mounts[i][permitted_mount_len - 1] == '/') {
>   if (strncmp(normalized_path, permitted_mounts[i], permitted_mount_len) 
> == 0) {
> ret = 1;
> break;
>   }
> }
> {noformat}
> This code will treat "/home/" as a directory, but not "/home"
> {noformat}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
>

[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210576#comment-16210576
 ] 

Hadoop QA commented on YARN-4511:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 10s{color} | {color:red} Docker failed to build yetus/hadoop:71bbb86. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-4511 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892474/YARN-4511-YARN-1011.04.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18023/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Common scheduler changes supporting scheduler-specific implementations
> --
>
> Key: YARN-4511
> URL: https://issues.apache.org/jira/browse/YARN-4511
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Haibo Chen
> Attachments: YARN-4511-YARN-1011.00.patch, 
> YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, 
> YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch
>
>





[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations

2017-10-18 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210571#comment-16210571
 ] 

Haibo Chen commented on YARN-4511:
--

The FindBugs warnings are bogus: all increments/decrements of the volatile 
variable are protected by synchronized. I am not sure what is going on with 
the build error, which I could not reproduce locally. I will retrigger the job 
manually.
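
To illustrate the pattern under discussion, here is a minimal sketch. Static analysis tends to flag `count++` on a volatile field because the increment is a non-atomic read-modify-write; when every mutation happens inside a synchronized method, the update is still safe. The class and method names are hypothetical and are not taken from the patch.

```java
final class GuardedCounter {
  // volatile gives unsynchronized readers a fresh value;
  // all writes go through synchronized methods.
  private volatile int count;

  synchronized void increment() { count++; }

  synchronized void decrement() { count--; }

  int get() { return count; }

  // Helper for demonstration: increments from several threads and
  // returns the final value once all of them have finished.
  static int runConcurrently(int threads, int perThread) {
    GuardedCounter counter = new GuardedCounter();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          counter.increment();
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counter.get();
  }
}
```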

> Common scheduler changes supporting scheduler-specific implementations
> --
>
> Key: YARN-4511
> URL: https://issues.apache.org/jira/browse/YARN-4511
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Haibo Chen
> Attachments: YARN-4511-YARN-1011.00.patch, 
> YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, 
> YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch
>
>





[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210537#comment-16210537
 ] 

Rohith Sharma K S commented on YARN-7289:
-

Thanks [~miklos.szeg...@cloudera.com] for investigating this! I agree with your 
root cause for the FS failure and with the approach taken for the test case fix. 
However, the code was explicitly set to use CS. Did you manually change it to FS 
in the test before running?
A couple of comments on the tests:
# The *setUp* method explicitly sets the CS queue. This can be removed. 
# The *setUpCSQueue()* method is invoked for both CS and FS. It is not required 
for FS and can be skipped there.

I do see the reason for increasing the timeout to 120 seconds, since the same 
test case runs for both schedulers. It makes sense to me!

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch, 
> YARN-7289.002.patch
>
>





[jira] [Created] (YARN-7363) ContainerLocalizer don't have a valid log4j config in case of Linux container executor

2017-10-18 Thread Yufei Gu (JIRA)

Yufei Gu created YARN-7363:
--

 Summary: ContainerLocalizer don't have a valid log4j config in 
case of Linux container executor
 Key: YARN-7363
 URL: https://issues.apache.org/jira/browse/YARN-7363
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.1.0
Reporter: Yufei Gu
Assignee: Yufei Gu


With the Linux container executor, ContainerLocalizer runs as a separate 
process. It cannot access a valid log4j.properties when the application user 
is not in the "hadoop" group. The node manager's log4j.properties is on its 
classpath, but for security reasons it is not readable by users outside the 
hadoop group. In that case, ContainerLocalizer has no valid log4j 
configuration and normally produces no log output.




[jira] [Commented] (YARN-7326) Some issues in RegistryDNS

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210429#comment-16210429
 ] 

Hadoop QA commented on YARN-7326:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7326 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892961/YARN-7326.yarn-native-services.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18022/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Some issues in RegistryDNS
> --
>
> Key: YARN-7326
> URL: https://issues.apache.org/jira/browse/YARN-7326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Eric Yang
> Attachments: YARN-7326.yarn-native-services.001.patch
>
>
> [~aw] helped to identify these issues: 
> Now some general bad news, not related to this patch:
> Ran a few queries, but this one is a bit concerning:
> {code}
> root@ubuntu:/hadoop/logs# dig @localhost -p 54 .
> ;; Warning: query response not set
> ; <<>> DiG 9.10.3-P4-Ubuntu <<>> @localhost -p 54 .
> ; (2 servers found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOTAUTH, id: 47794
> ;; flags: rd ad; QUERY: 0, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
> ;; WARNING: recursion requested but not available
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.1#54(127.0.0.1)
> ;; WHEN: Thu Oct 12 16:04:54 PDT 2017
> ;; MSG SIZE  rcvd: 12
> root@ubuntu:/hadoop/logs# dig @localhost -p 54 axfr .
> ;; Connection to ::1#54(::1) for . failed: connection refused.
> ;; communications error to 127.0.0.1#54: end of file
> root@ubuntu:/hadoop/logs# 
> {code}
> It looks like it effectively fails when asked about a root zone, which is bad.
> It's also kind of interesting in what it does and doesn't log. Probably 
> should be configured to rotate logs based on size not date.
> The real showstopper though: RegistryDNS basically eats a core. It is running 
> with 100% cpu utilization with and without jsvc. On my laptop, this is 
> triggering my fan.




[jira] [Updated] (YARN-7326) Some issues in RegistryDNS

2017-10-18 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7326:

Attachment: YARN-7326.yarn-native-services.001.patch

- Added recursion support.
- Configured the DNS server to look up upstream.

> Some issues in RegistryDNS
> --
>
> Key: YARN-7326
> URL: https://issues.apache.org/jira/browse/YARN-7326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Eric Yang
> Attachments: YARN-7326.yarn-native-services.001.patch
>
>




[jira] [Assigned] (YARN-7326) Some issues in RegistryDNS

2017-10-18 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-7326:
---

Assignee: Eric Yang  (was: Jian He)

> Some issues in RegistryDNS
> --
>
> Key: YARN-7326
> URL: https://issues.apache.org/jira/browse/YARN-7326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Eric Yang
>




[jira] [Commented] (YARN-7290) canContainerBePreempted can return true when it shouldn't

2017-10-18 Thread Steven Rand (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210418#comment-16210418
 ] 

Steven Rand commented on YARN-7290:
---

Thanks [~templedf]. For what it's worth, I was able to repro this on a live 
cluster as well as in the test. I let one spark-shell use the entire cluster, 
and then started a second spark-shell. The second spark-shell was able to 
preempt all of the first one's containers, including the Application Master. 
After I applied the patch, the second spark-shell was only able to preempt half 
of the cluster's resources away from the first one.

> canContainerBePreempted can return true when it shouldn't
> -
>
> Key: YARN-7290
> URL: https://issues.apache.org/jira/browse/YARN-7290
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Steven Rand
>Assignee: Steven Rand
> Attachments: YARN-7290-failing-test.patch, YARN-7290.001.patch, 
> YARN-7290.002.patch
>
>
> In FSAppAttempt#canContainerBePreempted, we make sure that preempting the 
> given container would not put the app below its fair share:
> {code}
> // Check if the app's allocation will be over its fairshare even
> // after preempting this container
> Resource usageAfterPreemption = Resources.clone(getResourceUsage());
> // Subtract resources of containers already queued for preemption
> synchronized (preemptionVariablesLock) {
>   Resources.subtractFrom(usageAfterPreemption, resourcesToBePreempted);
> }
> // Subtract this container's allocation to compute usage after preemption
> Resources.subtractFrom(
> usageAfterPreemption, container.getAllocatedResource());
> return !isUsageBelowShare(usageAfterPreemption, getFairShare());
> {code}
> However, this only considers one container in isolation, and fails to 
> consider containers for the same app that we already added to 
> {{preemptableContainers}} in 
> FSPreemptionThread#identifyContainersToPreemptOnNode. Therefore we can have a 
> case where we preempt multiple containers from the same app, none of which by 
> itself puts the app below fair share, but which cumulatively do so.
> I've attached a patch with a test to show this behavior. The flow is:
> 1. Initially greedyApp runs in {{root.preemptable.child-1}} and is allocated 
> all the resources (8g and 8vcores)
> 2. Then starvingApp runs in {{root.preemptable.child-2}} and requests 2 
> containers, each of which is 3g and 3vcores in size. At this point both 
> greedyApp and starvingApp have a fair share of 4g (with DRF not in use).
> 3. For the first container requested by starvingApp, we (correctly) preempt 3 
> containers from greedyApp, each of which is 1g and 1vcore.
> 4. For the second container requested by starvingApp, we again (this time 
> incorrectly) preempt 3 containers from greedyApp. This puts greedyApp below 
> its fair share, but happens anyway because all six times that we call 
> {{return !isUsageBelowShare(usageAfterPreemption, getFairShare());}}, the 
> value of {{usageAfterPreemption}} is 7g and 7vcores (confirmed using a 
> debugger).
> So in addition to accounting for {{resourcesToBePreempted}}, we also need to 
> account for containers that we're already planning on preempting in 
> FSPreemptionThread#identifyContainersToPreemptOnNode. 
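
The difference between checking each container in isolation and accounting for containers already selected can be sketched as follows. This is a simplified illustration, not the scheduler code: it tracks memory only (in GB), and the class and method names are hypothetical.

```java
final class PreemptionSketch {
  // Mirrors the behavior described above: every container is checked
  // against the same starting usage, so earlier picks are ignored.
  static int pickInIsolation(int usageGb, int fairShareGb,
                             int[] containersGb, int wanted) {
    int picked = 0;
    for (int sizeGb : containersGb) {
      if (picked == wanted) {
        break;
      }
      int usageAfter = usageGb - sizeGb;
      if (usageAfter >= fairShareGb) { // i.e. !isUsageBelowShare
        picked++;
      }
    }
    return picked;
  }

  // Also subtracts containers already planned for preemption in this pass.
  static int pickCumulatively(int usageGb, int fairShareGb,
                              int[] containersGb, int wanted) {
    int picked = 0;
    int plannedGb = 0;
    for (int sizeGb : containersGb) {
      if (picked == wanted) {
        break;
      }
      int usageAfter = usageGb - plannedGb - sizeGb;
      if (usageAfter >= fairShareGb) {
        plannedGb += sizeGb;
        picked++;
      }
    }
    return picked;
  }
}
```

With eight 1g containers, a usage of 8g, and a fair share of 4g, the isolated check lets all six requested containers be preempted, while the cumulative check stops once usage would drop below the fair share.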




[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-10-18 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210347#comment-16210347
 ] 

Jian He commented on YARN-7217:
---

Does this patch disallow flex while the app is running?

> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch, 
> YARN-7217.yarn-native-services.002.patch, 
> YARN-7217.yarn-native-services.003.patch, 
> YARN-7217.yarn-native-services.004.patch
>
>
> The PUT method for updateService API provides multiple functions:
> # Stopping a service.
> # Start a service.
> # Increase or decrease number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves Service object from getService call, and the Service object 
> contains state: STARTED.  The user would like to increase number of 
> containers for the deployed service.  The JSON has been updated to increase 
> container count.  The PUT method does not actually increase container count.
> Scenario 2
> A user retrieves Service object from getService call, and the Service object 
> contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This could be addressed by rearranging the logic of START/STOP after the 
> configuration update.  However, there are other potential combinations that 
> can break the PUT method.  For example, a user may want to make configuration 
> changes but not restart the service until a later time.
> The alternative is to separate the PUT method into one PUT method for 
> configuration and one for status.  This increases the number of actions that 
> can be performed.  The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
>   "name":"[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}
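
As a rough illustration of how a client might address the two proposed endpoints separately, the sketch below builds (but does not send) the PUT requests with the JDK's `java.net.http` API (Java 11+). The host and port in `BASE`, the class name, and the method names are assumptions made for the example; only the paths and JSON fields come from the proposal above, which is itself a suggestion rather than a shipped API.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ServiceApiSketch {
  // Hypothetical base URL; only the /ws/v1/services/ path is from the proposal.
  static final String BASE = "http://rm.example.com:8088/ws/v1/services/";

  // PUT .../config: changes configuration without touching the state.
  static HttpRequest configUpdate(String serviceName, int containers) {
    String body = String.format(
        "{\"name\":\"%s\",\"number_of_containers\":%d}", serviceName, containers);
    return put(BASE + serviceName + "/config", body);
  }

  // PUT .../state: starts or stops the service without touching configuration.
  static HttpRequest stateChange(String serviceName, String state) {
    String body = String.format(
        "{\"name\":\"%s\",\"state\":\"%s\"}", serviceName, state);
    return put(BASE + serviceName + "/state", body);
  }

  private static HttpRequest put(String url, String json) {
    return HttpRequest.newBuilder(URI.create(url))
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(json))
        .build();
  }
}
```

Keeping the two request types apart means a configuration change can be submitted while the service stays STOPPED, which is the combination the single overloaded PUT handles poorly.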




[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-10-18 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210340#comment-16210340
 ] 

Jian He commented on YARN-7217:
---

[~eyang],
Some assumptions in the JIRA are no longer applicable, since the flex 
functionality has been removed from the "/ws/v1/services/[service_name]" 
endpoint. Can you update the JIRA description accordingly and also describe 
what the patch does?


> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch, 
> YARN-7217.yarn-native-services.002.patch, 
> YARN-7217.yarn-native-services.003.patch, 
> YARN-7217.yarn-native-services.004.patch
>
>




[jira] [Commented] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210338#comment-16210338
 ] 

Hadoop QA commented on YARN-7262:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 10s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7262 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892940/YARN-7262.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18021/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add a hierarchy into the ZKRMStateStore for delegation token znodes to 
> prevent jute buffer overflow
> ---
>
> Key: YARN-7262
> URL: https://issues.apache.org/jira/browse/YARN-7262
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-7262.001.patch, YARN-7262.002.patch, 
> YARN-7262.003.patch
>
>
> We've seen users who are running into a problem where the RM is storing so 
> many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those 
> znodes is larger than the jute buffer. This is fine during normal operation, 
> but becomes a problem on a failover because the RM will try to read in all of 
> the token znodes (i.e. call {{getChildren}} on the parent znode).  This is 
> particularly bad because everything appears to be okay, but then if a 
> failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in 
> YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull 
> subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the 
> delegation token znodes.
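
One way such a hierarchy could work is to split each znode name so that its last few characters become a child under a shared parent, which keeps any single {{getChildren}} listing small. The sketch below only illustrates that idea; the class name, method name, root path, znode name format, and split index are all hypothetical and not taken from the patch or from YARN-2962.

```java
final class ZnodeHierarchySketch {
  // Splits nodeName so that its last splitIndex characters form a child
  // znode under a parent named by the remaining prefix.
  // A splitIndex of 0 (or one that would leave no prefix) keeps the flat layout.
  static String hierarchicalPath(String root, String nodeName, int splitIndex) {
    if (splitIndex <= 0 || splitIndex >= nodeName.length()) {
      return root + "/" + nodeName;
    }
    int cut = nodeName.length() - splitIndex;
    return root + "/" + nodeName.substring(0, cut) + "/" + nodeName.substring(cut);
  }
}
```

With a split index of 2, up to 100 tokens whose names share a prefix land under the same parent, so the number of top-level children shrinks by roughly that factor.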




[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-10-18 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210281#comment-16210281
 ] 

Haibo Chen commented on YARN-7346:
--

[~rohithsharma], do we have a branch yet for the hbase 2.0 work? If not, I will 
be happy to create one if you can provide me with some guidance.
I am not sure what the Apache process is.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.




[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210295#comment-16210295
 ] 

Hadoop QA commented on YARN-7289:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m  1s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 50s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 50s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
|   | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0de40f0 |
| JIRA Issue | YARN-7289 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892908/YARN-7289.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ad33f3813d92 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2523e1c |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/18016/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18016/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server

[jira] [Updated] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow

2017-10-18 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-7262:

Attachment: YARN-7262.003.patch

{quote}I kinda want to suggest that you also use the HIERARCHIES directory just 
for consistency.{quote}
My main concern is that the znode path is already pretty long and this is going 
to make it even longer (and is also not necessary).  If I could, I'd get rid of 
the other one, but it's already been in a release so that would make things 
_really_ complicated.  I'm not sure we should repeat a decision with downsides 
and no clear upsides just for (internal) consistency; if it were visible to the 
user, maybe.


The 003 patch:
- Added Javadoc for the new property in {{YarnConfiguration}}
- All new and updated assert statements now have messages
- Cleaned up {{loadRMDelegationTokenState}}
-- Renamed incorrectly named variable
-- Misc simplification
-- Unknown node check now only allows for "1", "2", "3", and "4" under the 
token root node instead of anywhere
- Made {{TestZKRMStateStore#getDelegationTokenNode}} and other methods in 
{{TestZKRMStateStore}} private or package private where possible
- Removed unnecessary boxing for {{renewDate}}





{quote}And if most of the other properties jumped off a bridge, would you do 
it, too? {quote}
I'd definitely consider it - there's probably a good reason why they're all 
jumping off a bridge.  Maybe the bridge is collapsing or something?

> Add a hierarchy into the ZKRMStateStore for delegation token znodes to 
> prevent jute buffer overflow
> ---
>
> Key: YARN-7262
> URL: https://issues.apache.org/jira/browse/YARN-7262
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-7262.001.patch, YARN-7262.002.patch, 
> YARN-7262.003.patch
>
>
> We've seen users run into a problem where the RM stores so many delegation 
> tokens in the {{ZKRMStateStore}} that the _listing_ of those znodes is larger 
> than the jute buffer. This is fine during normal operation, but becomes a 
> problem on a failover because the RM will try to read in all of the token 
> znodes (i.e. call {{getChildren}} on the parent znode).  This is 
> particularly bad because everything appears to be okay, but then if a 
> failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in 
> YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull 
> subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the 
> delegation token znodes.
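
For readers unfamiliar with the YARN-2962 style of hierarchy, the idea can be 
sketched roughly as below. This is only an illustration, not the code in the 
attached patch: the class name, the method names, the "/tokens" root and the 
"token_12345" node name are all made up here, and the layout (a directory named 
after the split index, with the last N characters of the node name moved into a 
child znode) is an assumption based on the comment above about "1", "2", "3", 
and "4" being the only allowed nodes under the token root.

```java
// Hypothetical sketch of a split-index znode layout; not taken from the patch.
final class TokenZnodePathSketch {
    private TokenZnodePathSketch() {}

    /**
     * Returns the znode path for a token node name.
     * splitIndex == 0 keeps the flat layout (every token directly under root).
     * splitIndex == n moves the last n characters of the name into a child
     * znode, grouped under a directory named after n.
     */
    static String tokenPath(String root, String nodeName, int splitIndex) {
        if (splitIndex < 0) {
            throw new IllegalArgumentException("splitIndex must be >= 0");
        }
        // Names that are too short to split stay in the flat layout.
        if (splitIndex == 0 || nodeName.length() <= splitIndex) {
            return root + "/" + nodeName;
        }
        int cut = nodeName.length() - splitIndex;
        return root + "/" + splitIndex
            + "/" + nodeName.substring(0, cut)
            + "/" + nodeName.substring(cut);
    }

    /** Assumed form of the "unknown node" check: only "1".."4" are split dirs. */
    static boolean isSplitDir(String childName) {
        return childName.matches("[1-4]");
    }

    public static void main(String[] args) {
        System.out.println(tokenPath("/tokens", "token_12345", 0));
        System.out.println(tokenPath("/tokens", "token_12345", 2));
    }
}
```

With a split index of 2, up to 100 tokens share one parent znode, so each 
{{getChildren}} call returns a much shorter listing than the flat layout would.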




[jira] [Commented] (YARN-7351) High CPU usage issue in RegistryDNS

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210306#comment-16210306
 ] 

Hadoop QA commented on YARN-7351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7351 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892911/YARN-7351.yarn-native-services.03.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18020/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> High CPU usage issue in RegistryDNS
> ---
>
> Key: YARN-7351
> URL: https://issues.apache.org/jira/browse/YARN-7351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7351.yarn-native-services.01.patch, 
> YARN-7351.yarn-native-services.02.patch, 
> YARN-7351.yarn-native-services.03.patch
>
>
> Thanks [~aw] for finding this issue.
> The current RegistryDNS implementation is always running on high CPU and 
> pretty much eats one core. 




[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210290#comment-16210290
 ] 

Ted Yu commented on YARN-7346:
--

bq. few bugs causing ATSv2 unit tests failure

Please surface the bug(s) if 2.0.0-alpha4-SNAPSHOT still has them.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.




[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-10-18 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210277#comment-16210277
 ] 

Haibo Chen commented on YARN-7346:
--

The patch I attached in YARN-7213 uses hbase2.0-alpha3, which has a few 
bugs that cause ATSv2 unit test failures.
We'd need to wait for an hbase2.0-alpha4 release before we can work on an 
official patch.

In the meantime, I'll continue working on the in-progress patch against my 
local hbase 2.0-alpha4 build until the release comes out officially.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.




[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210289#comment-16210289
 ] 

Hadoop QA commented on YARN-7102:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m  5s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 45s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 53s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  6s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  1s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 45s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} root: The patch generated 0 new + 105 unchanged - 7 fixed = 105 total (was 112) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 58s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 41s{color} | {color:red} hadoop-sls in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 18s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.sls.TestSLSRunner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-7102 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892885/YARN-7102-branch-2.8.v9.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 286c4968d1fe 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / 4ea3ae3 |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/18011/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |

[jira] [Commented] (YARN-7339) LocalityMulticastAMRMProxyPolicy should handle cancel request properly

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210247#comment-16210247
 ] 

Hadoop QA commented on YARN-7339:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 10s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7339 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892928/YARN-7339-v4.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18019/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> LocalityMulticastAMRMProxyPolicy should handle cancel request properly
> --
>
> Key: YARN-7339
> URL: https://issues.apache.org/jira/browse/YARN-7339
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-7339-v1.patch, YARN-7339-v2.patch, 
> YARN-7339-v3.patch, YARN-7339-v4.patch
>
>
> Currently inside AMRMProxy, LocalityMulticastAMRMProxyPolicy does not handle 
> and split cancel requests from the AM properly: 
> # A node cancel request should not be treated as a localized resource 
> request. Otherwise it can lead to an all-zero-weight issue when computing 
> localized resource weights. 
> # An ANY cancel should be broadcast to all known subclusters, not just the 
> ones associated with localized resources. 




[jira] [Resolved] (YARN-7362) Set assumption of capacity scheduler for TestClientRMService.testUpdateApplicationPriorityRequest() and testUpdatePriorityAndKillAppWithZeroClusterResource()

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved YARN-7362.
--
Resolution: Duplicate

> Set assumption of capacity scheduler for 
> TestClientRMService.testUpdateApplicationPriorityRequest() and 
> testUpdatePriorityAndKillAppWithZeroClusterResource()
> -
>
> Key: YARN-7362
> URL: https://issues.apache.org/jira/browse/YARN-7362
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>





[jira] [Updated] (YARN-7362) Set assumption of capacity scheduler for TestClientRMService.testUpdateApplicationPriorityRequest() and testUpdatePriorityAndKillAppWithZeroClusterResource()

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7362:
-
Summary: Set assumption of capacity scheduler for 
TestClientRMService.testUpdateApplicationPriorityRequest() and 
testUpdatePriorityAndKillAppWithZeroClusterResource()  (was: Explicitly set 
assumption of capacity scheduler for TestClientRMService. 
testUpdateApplicationPriorityRequest() and  
testUpdatePriorityAndKillAppWithZeroClusterResource())

> Set assumption of capacity scheduler for 
> TestClientRMService.testUpdateApplicationPriorityRequest() and 
> testUpdatePriorityAndKillAppWithZeroClusterResource()
> -
>
> Key: YARN-7362
> URL: https://issues.apache.org/jira/browse/YARN-7362
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>





[jira] [Updated] (YARN-7362) Set assumption of capacity scheduler for TestClientRMService.testUpdateApplicationPriorityRequest() and testUpdatePriorityAndKillAppWithZeroClusterResource()

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7362:
-
Affects Version/s: 3.0.0-alpha3

> Set assumption of capacity scheduler for 
> TestClientRMService.testUpdateApplicationPriorityRequest() and 
> testUpdatePriorityAndKillAppWithZeroClusterResource()
> -
>
> Key: YARN-7362
> URL: https://issues.apache.org/jira/browse/YARN-7362
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>





[jira] [Updated] (YARN-7362) Set assumption of capacity scheduler for TestClientRMService.testUpdateApplicationPriorityRequest() and testUpdatePriorityAndKillAppWithZeroClusterResource()

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7362:
-
Target Version/s: 3.0.0

> Set assumption of capacity scheduler for 
> TestClientRMService.testUpdateApplicationPriorityRequest() and 
> testUpdatePriorityAndKillAppWithZeroClusterResource()
> -
>
> Key: YARN-7362
> URL: https://issues.apache.org/jira/browse/YARN-7362
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>





[jira] [Created] (YARN-7362) Explicitly set assumption of capacity scheduler for TestClientRMService. testUpdateApplicationPriorityRequest() and testUpdatePriorityAndKillAppWithZeroClusterResource()

2017-10-18 Thread Haibo Chen (JIRA)

Haibo Chen created YARN-7362:


 Summary: Explicitly set assumption of capacity scheduler for 
TestClientRMService. testUpdateApplicationPriorityRequest() and  
testUpdatePriorityAndKillAppWithZeroClusterResource()
 Key: YARN-7362
 URL: https://issues.apache.org/jira/browse/YARN-7362
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Haibo Chen
Assignee: Haibo Chen
Priority: Minor







[jira] [Updated] (YARN-7339) LocalityMulticastAMRMProxyPolicy should handle cancel request properly

2017-10-18 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7339:
---
Attachment: YARN-7339-v4.patch

Retry v3 as v4...

> LocalityMulticastAMRMProxyPolicy should handle cancel request properly
> --
>
> Key: YARN-7339
> URL: https://issues.apache.org/jira/browse/YARN-7339
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-7339-v1.patch, YARN-7339-v2.patch, 
> YARN-7339-v3.patch, YARN-7339-v4.patch
>
>
> Currently inside AMRMProxy, LocalityMulticastAMRMProxyPolicy does not handle 
> and split cancel requests from the AM properly: 
> # A node cancel request should not be treated as a localized resource 
> request. Otherwise it can lead to an all-zero-weight issue when computing 
> localized resource weights. 
> # An ANY cancel should be broadcast to all known subclusters, not just the 
> ones associated with localized resources. 
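
The two rules in the description can be sketched roughly as follows. This is a 
simplified illustration and not the code in the attached patches: the class, 
the Ask type, and both method names are hypothetical, a cancel is assumed to be 
a request with zero containers, and "*" is used as the ANY resource name. The 
real policy also splits non-cancel ANY requests by weight, which is left out 
here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of cancel handling in a locality-based split policy.
final class CancelRoutingSketch {
    static final String ANY = "*";

    /** Minimal stand-in for a resource request (illustrative only). */
    static final class Ask {
        final String resourceName;
        final int numContainers;

        Ask(String resourceName, int numContainers) {
            this.resourceName = resourceName;
            this.numContainers = numContainers;
        }

        // Assumption: zero containers means "cancel the earlier ask".
        boolean isCancel() {
            return numContainers == 0;
        }
    }

    /** Rule 1: node-level cancels do not contribute to localized weights. */
    static Map<String, Integer> localizedWeights(
            List<Ask> asks, Map<String, String> nodeToSubcluster) {
        Map<String, Integer> weights = new HashMap<>();
        for (Ask ask : asks) {
            String subcluster = nodeToSubcluster.get(ask.resourceName);
            if (subcluster == null || ask.isCancel()) {
                continue; // not a known node, or a cancel: no weight
            }
            weights.merge(subcluster, ask.numContainers, Integer::sum);
        }
        return weights;
    }

    /** Rule 2: an ANY cancel is broadcast to every known subcluster. */
    static Set<String> cancelTargets(
            Ask cancel, Map<String, String> nodeToSubcluster,
            Set<String> knownSubclusters) {
        Set<String> targets = new TreeSet<>();
        if (ANY.equals(cancel.resourceName)) {
            targets.addAll(knownSubclusters);
        } else {
            String subcluster = nodeToSubcluster.get(cancel.resourceName);
            if (subcluster != null) {
                targets.add(subcluster); // node cancel goes to its home only
            }
        }
        return targets;
    }
}
```

Keeping cancels out of the weight map means a batch that contains only cancels 
yields an empty map rather than a map of zeros, so the caller can tell the two 
cases apart.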




[jira] [Commented] (YARN-7230) Document DockerContainerRuntime for branch-2.8 with proper scope and claim as an experimental feature

2017-10-18 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210222#comment-16210222
 ] 

Shane Kumpf commented on YARN-7230:
---

Thanks [~djp], [~templedf], and [~ebadger]! I've opened YARN-7361 to fix up the 
missing property and improve the warning for other branches.

> Document DockerContainerRuntime for branch-2.8 with proper scope and claim as 
> an experimental feature
> -
>
> Key: YARN-7230
> URL: https://issues.apache.org/jira/browse/YARN-7230
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.8.1
>Reporter: Junping Du
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: ready-to-commit
> Fix For: 2.8.2
>
> Attachments: YARN-7230.branch-2.8.001.patch, 
> YARN-7230.branch-2.8.002.patch, YARN-7230.branch-2.8.003.patch
>
>
> YARN-5258 documents the new docker container runtime feature, which has 
> already been checked in to trunk/branch-2. We need a similar one for 
> branch-2.8. However, given that we missed several patches, we need to define 
> a narrowed scope for these features/improvements that matches the existing 
> patches landed in 2.8. Also, like YARN-6622, we should document it as 
> experimental.




[jira] [Updated] (YARN-7361) Improve the docker container runtime documentation

2017-10-18 Thread Shane Kumpf (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-7361:
--
Summary: Improve the docker container runtime documentation  (was: Improve 
the docker container related documentation)

> Improve the docker container runtime documentation
> --
>
> Key: YARN-7361
> URL: https://issues.apache.org/jira/browse/YARN-7361
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>
> During review of YARN-7230, it was found that 
> yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker 
> containers documentation in most of the active branches. We can also improve 
> the warning that was introduced in YARN-6622.




[jira] [Assigned] (YARN-7361) Improve the docker container related documentation

2017-10-18 Thread Shane Kumpf (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reassigned YARN-7361:
-

Assignee: Shane Kumpf

> Improve the docker container related documentation
> --
>
> Key: YARN-7361
> URL: https://issues.apache.org/jira/browse/YARN-7361
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>
> During review of YARN-7230, it was found that 
> yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker 
> containers documentation in most of the active branches. We can also improve 
> the warning that was introduced in YARN-6622.




[jira] [Commented] (YARN-7361) Improve the docker container related documentation

2017-10-18 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210217#comment-16210217
 ] 

Shane Kumpf commented on YARN-7361:
---

I can take this up as a follow-on to YARN-7230.

> Improve the docker container related documentation
> --
>
> Key: YARN-7361
> URL: https://issues.apache.org/jira/browse/YARN-7361
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Shane Kumpf
>
> During review of YARN-7230, it was found that 
> yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker 
> containers documentation in most of the active branches. We can also improve 
> the warning that was introduced in YARN-6622.




[jira] [Created] (YARN-7361) Improve the docker container related documentation

2017-10-18 Thread Shane Kumpf (JIRA)

Shane Kumpf created YARN-7361:
-

 Summary: Improve the docker container related documentation
 Key: YARN-7361
 URL: https://issues.apache.org/jira/browse/YARN-7361
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Shane Kumpf


During review of YARN-7230, it was found that 
yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker 
containers documentation in most of the active branches. We can also improve 
the warning that was introduced in YARN-6622.




[jira] [Updated] (YARN-7360) TestRM.testNMTokenSentForNormalContainer() fails with Fair Scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7360:
-
Attachment: YARN-7360.00.patch

> TestRM.testNMTokenSentForNormalContainer() fails with Fair Scheduler
> 
>
> Key: YARN-7360
> URL: https://issues.apache.org/jira/browse/YARN-7360
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-7360.00.patch
>
>





[jira] [Created] (YARN-7360) TestRM.testNMTokenSentForNormalContainer() fails with Fair Scheduler

2017-10-18 Thread Haibo Chen (JIRA)

Haibo Chen created YARN-7360:


 Summary: TestRM.testNMTokenSentForNormalContainer() fails with 
Fair Scheduler
 Key: YARN-7360
 URL: https://issues.apache.org/jira/browse/YARN-7360
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0-alpha3
Reporter: Haibo Chen
Assignee: Haibo Chen







[jira] [Commented] (YARN-7224) Support GPU isolation for docker container

2017-10-18 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210206#comment-16210206
 ] 

Wangda Tan commented on YARN-7224:
--

I just updated the JIRA description with the up-to-date approach used in my patch.

Overall workflow:

1) GpuDockerCommandPlugin (init method) talks to the nvidia-docker-plugin REST 
endpoint to get the options needed to launch the docker command.

2) From the docker command, create docker volumes when needed 
(DockerLinuxContainerRuntime#prepareContainer).

3) When launching the docker container, GpuDockerCommandPlugin injects the 
allowed GPU devices and mounts volumes when a GPU is required.

[~sunilg]/[~ebadger]/[~vvasudev]/[~shaneku...@gmail.com], could you take a look 
at the patch when you get a chance?

Thanks.
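
The workflow in the comment above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the function names are hypothetical, the REST call is left out, and SAMPLE_PLUGIN_OUTPUT is an assumed example of the kind of option string the plugin might return.

```python
import shlex

# Assumed, illustrative plugin response; the real output depends on the host.
SAMPLE_PLUGIN_OUTPUT = (
    "--volume-driver=nvidia-docker "
    "--volume=nvidia_driver_375.66:/usr/local/nvidia:ro "
    "--device=/dev/nvidiactl --device=/dev/nvidia-uvm "
    "--device=/dev/nvidia0 --device=/dev/nvidia1"
)


def parse_plugin_options(cli_text):
    """Step 1: split a docker CLI option string into its parts."""
    parsed = {"volume_driver": None, "volumes": [], "devices": []}
    for token in shlex.split(cli_text):
        key, _, value = token.partition("=")
        if key == "--volume-driver":
            parsed["volume_driver"] = value
        elif key == "--volume":
            parsed["volumes"].append(value)
        elif key == "--device":
            parsed["devices"].append(value)
    return parsed


def is_numbered_gpu(device):
    """True for per-GPU device files such as /dev/nvidia0 (hypothetical rule)."""
    prefix = "/dev/nvidia"
    return device.startswith(prefix) and device[len(prefix):].isdigit()


def build_gpu_args(parsed, allowed_gpu_devices):
    """Step 3: keep shared control devices, but only the allowed GPUs."""
    args = []
    for device in parsed["devices"]:
        if not is_numbered_gpu(device) or device in allowed_gpu_devices:
            args.append("--device=" + device)
    for volume in parsed["volumes"]:
        args.append("--volume=" + volume)
    return args
```

For example, with only /dev/nvidia1 allocated to the container, build_gpu_args(parse_plugin_options(SAMPLE_PLUGIN_OUTPUT), {"/dev/nvidia1"}) keeps the control devices and the driver volume but drops /dev/nvidia0.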

> Support GPU isolation for docker container
> --
>
> Key: YARN-7224
> URL: https://issues.apache.org/jira/browse/YARN-7224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-7224.001.patch, YARN-7224.002-wip.patch, 
> YARN-7224.003.patch, YARN-7224.004.patch, YARN-7224.005.patch
>
>
> This patch addresses the following issues when docker containers are used:
> 1. GPU driver and nvidia libraries: If GPU drivers and NV libraries are 
> pre-packaged inside the docker image, they could conflict with the driver and 
> nvidia libraries installed on the host OS. An alternative solution is to detect 
> the host OS's installed drivers and devices and mount them when launching the 
> docker container. Please refer to \[1\] for more details. 
> 2. Image detection: 
> From \[2\], the challenge is: 
> bq. Mounting user-level driver libraries and device files clobbers the 
> environment of the container, it should be done only when the container is 
> running a GPU application. The challenge here is to determine if a given 
> image will be using the GPU or not. We should also prevent launching 
> containers based on a Docker image that is incompatible with the host NVIDIA 
> driver version, you can find more details on this wiki page.
> 3. GPU isolation.
> *Proposed solution*:
> a. Use nvidia-docker-plugin \[3\] to address issue #1; this is the same 
> solution used by K8S \[4\]. Issue #2 could be addressed in a separate JIRA.
> We won't ship nvidia-docker-plugin with our releases, and we require the cluster 
> admin to preinstall nvidia-docker-plugin to use GPU+docker support on YARN. 
> "nvidia-docker" is a wrapper around the docker binary which can address #3 as 
> well; however, "nvidia-docker" doesn't provide the same semantics as docker, and 
> it needs additional environment setup such as PATH/LD_LIBRARY_PATH to use 
> it. To avoid introducing additional issues, we plan to use the 
> nvidia-docker-plugin + docker binary approach.
> b. To address GPU driver and nvidia libraries, we use nvidia-docker-plugin 
> \[3\] to create a volume which includes GPU-related libraries and mount it 
> when the docker container is launched. Changes include: 
> - Instead of using {{volume-driver}}, this patch adds a {{docker volume 
> create}} command to c-e and the NM Java side. The reason is that 
> {{volume-driver}} can only use a single volume driver for each launched docker 
> container.
> - Updated {{c-e}} and the Java side: if a mounted volume is a named volume in 
> docker, skip checking file existence. (Named volumes still need to be added to 
> the permitted list in container-executor.cfg.)
> c. To address isolation issue:
> We found that cgroup + docker doesn't work under newer docker versions, which 
> use {{runc}} as the default runtime. Setting {{--cgroup-parent}} to a cgroup 
> which includes any {{devices.deny}} prevents the docker container from being 
> launched.
> Instead, this patch passes the allowed GPU devices via {{--device}} to the 
> docker launch command.
> References:
> \[1\] https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
> \[2\] https://github.com/NVIDIA/nvidia-docker/wiki/Image-inspection
> \[3\] https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker-plugin
> \[4\] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/




[jira] [Commented] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210203#comment-16210203
 ] 

Hadoop QA commented on YARN-7359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7359 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892924/YARN-7359.00.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18018/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
> Attachments: YARN-7359.00.patch
>
>





[jira] [Updated] (YARN-7224) Support GPU isolation for docker container

2017-10-18 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7224:
-
Description: 
This patch addresses the following issues when docker containers are used:
1. GPU driver and nvidia libraries: If GPU drivers and NV libraries are 
pre-packaged inside the docker image, they could conflict with the driver and 
nvidia libraries installed on the host OS. An alternative solution is to detect 
the host OS's installed drivers and devices and mount them when launching the 
docker container. Please refer to \[1\] for more details. 

2. Image detection: 
From \[2\], the challenge is: 
bq. Mounting user-level driver libraries and device files clobbers the 
environment of the container, it should be done only when the container is 
running a GPU application. The challenge here is to determine if a given image 
will be using the GPU or not. We should also prevent launching containers based 
on a Docker image that is incompatible with the host NVIDIA driver version, you 
can find more details on this wiki page.

3. GPU isolation.

*Proposed solution*:

a. Use nvidia-docker-plugin \[3\] to address issue #1; this is the same 
solution used by K8S \[4\]. Issue #2 could be addressed in a separate JIRA.

We won't ship nvidia-docker-plugin with our releases, and we require the cluster 
admin to preinstall nvidia-docker-plugin to use GPU+docker support on YARN. 
"nvidia-docker" is a wrapper around the docker binary which can address #3 as 
well; however, "nvidia-docker" doesn't provide the same semantics as docker, and 
it needs additional environment setup such as PATH/LD_LIBRARY_PATH to use it. To 
avoid introducing additional issues, we plan to use the nvidia-docker-plugin + 
docker binary approach.

b. To address GPU driver and nvidia libraries, we use nvidia-docker-plugin 
\[3\] to create a volume which includes GPU-related libraries and mount it when 
the docker container is launched. Changes include: 

- Instead of using {{volume-driver}}, this patch adds a {{docker volume create}} 
command to c-e and the NM Java side. The reason is that {{volume-driver}} can 
only use a single volume driver for each launched docker container.
- Updated {{c-e}} and the Java side: if a mounted volume is a named volume in 
docker, skip checking file existence. (Named volumes still need to be added to 
the permitted list in container-executor.cfg.)
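
As a rough illustration of the named-volume rule in the second bullet, here is a minimal sketch. It is not the c-e or NM code: the function names are hypothetical, and the permitted list is modelled as a plain set with exact matching, which is an assumption made for brevity. It relies on the Docker convention that a mount source which is not an absolute path is interpreted as a volume name.

```python
import os


def is_named_volume(source):
    """A mount source that is not an absolute path is treated as a volume name."""
    return not source.startswith("/")


def validate_mount(source, permitted_sources, path_exists=os.path.exists):
    """Return True when the mount source may be passed to docker.

    permitted_sources stands in for the permitted list in
    container-executor.cfg (exact-match set, an assumption).
    """
    if source not in permitted_sources:
        # Named volumes and host paths alike must be permitted.
        return False
    if is_named_volume(source):
        # Named volumes live inside docker, so there is no host file to check.
        return True
    return path_exists(source)
```

The path_exists parameter exists only so the check can be exercised without touching the real file system.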

c. To address isolation issue:

We found that cgroup + docker doesn't work under newer docker versions, which 
use {{runc}} as the default runtime. Setting {{--cgroup-parent}} to a cgroup 
which includes any {{devices.deny}} prevents the docker container from being 
launched.

Instead, this patch passes the allowed GPU devices via {{--device}} to the 
docker launch command.

References:

\[1\] https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver
\[2\] https://github.com/NVIDIA/nvidia-docker/wiki/Image-inspection
\[3\] https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker-plugin
\[4\] https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/

  was:YARN-6620 added support of GPU isolation in NM side, which only supports 
non-docker containers. We need to add support to help docker containers 
launched by YARN can utilize GPUs.


> Support GPU isolation for docker container
> --
>
> Key: YARN-7224
> URL: https://issues.apache.org/jira/browse/YARN-7224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-7224.001.patch, YARN-7224.002-wip.patch, 
> YARN-7224.003.patch, YARN-7224.004.patch, YARN-7224.005.patch
>
>
> This patch addresses the following issues when docker containers are used:
> 1. GPU driver and nvidia libraries: If GPU drivers and NV libraries are 
> pre-packaged inside the docker image, they could conflict with the driver and 
> nvidia libraries installed on the host OS. An alternative solution is to detect 
> the host OS's installed drivers and devices and mount them when launching the 
> docker container. Please refer to \[1\] for more details. 
> 2. Image detection: 
> From \[2\], the challenge is: 
> bq. Mounting user-level driver libraries and device files clobbers the 
> environment of the container, it should be done only when the container is 
> running a GPU application. The challenge here is to determine if a given 
> image will be using the GPU or not. We should also prevent launching 
> containers based on a Docker image that is incompatible with the host NVIDIA 
> driver version, you can find more details on this wiki page.
> 3. GPU isolation.
> *Proposed solution*:
> a. Use nvidia-docker-plugin \[3\] to address issue #1; this is the same 
> solution used by K8S \[4\]. Issue #2 could be addressed in a separate JIRA.
> We won't ship nvidia-docker-plugin with our releases, and we require the cluster 
> admin to preinstall nvidia-docker-plugin to use GPU+docker support on YARN. 
> "nvidia-docker" is

[jira] [Updated] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7359:
-
Attachment: YARN-7359.00.patch

> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
> Attachments: YARN-7359.00.patch
>
>





[jira] [Updated] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7359:
-
Issue Type: Improvement  (was: Bug)

> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Haibo Chen
>





[jira] [Commented] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

2017-10-18 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210188#comment-16210188
 ] 

Wangda Tan commented on YARN-6623:
--

Attached patch LGTM, thanks [~vvasudev].

> Add support to turn off launching privileged containers in the 
> container-executor
> -
>
> Key: YARN-6623
> URL: https://issues.apache.org/jira/browse/YARN-6623
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: YARN-6623-branch-2.013.patch, 
> YARN-6623-branch-2.014.patch, YARN-6623-branch-2.015.patch, 
> YARN-6623.001.patch, YARN-6623.002.patch, YARN-6623.003.patch, 
> YARN-6623.004.patch, YARN-6623.005.patch, YARN-6623.006.patch, 
> YARN-6623.007.patch, YARN-6623.008.patch, YARN-6623.009.patch, 
> YARN-6623.010.patch, YARN-6623.011.patch, YARN-6623.012.patch, 
> YARN-6623.013.patch, cetest.stderr, cetest.stdout
>
>
> Currently, launching privileged containers is controlled by the NM. We should 
> add a flag to the container-executor.cfg allowing admins to disable launching 
> privileged containers at the container-executor level.




[jira] [Updated] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7359:
-
Priority: Minor  (was: Major)

> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Haibo Chen
>Priority: Minor
>





[jira] [Updated] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7359:
-
Summary: TestAppManager.testQueueSubmitWithNoPermission() should be 
scheduler agnostic  (was: TestAppManager.testQueueSubmitWithNoPermission())

> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Haibo Chen
>





[jira] [Assigned] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned YARN-7359:


Assignee: Haibo Chen

> TestAppManager.testQueueSubmitWithNoPermission() should be scheduler agnostic
> -
>
> Key: YARN-7359
> URL: https://issues.apache.org/jira/browse/YARN-7359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>





[jira] [Created] (YARN-7359) TestAppManager.testQueueSubmitWithNoPermission()

2017-10-18 Thread Haibo Chen (JIRA)

Haibo Chen created YARN-7359:


 Summary: TestAppManager.testQueueSubmitWithNoPermission()
 Key: YARN-7359
 URL: https://issues.apache.org/jira/browse/YARN-7359
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haibo Chen







[jira] [Commented] (YARN-6747) TestFSAppStarvation.testPreemptionEnable fails intermittently

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210133#comment-16210133
 ] 

Hadoop QA commented on YARN-6747:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6747 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892915/YARN-6747.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18017/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestFSAppStarvation.testPreemptionEnable fails intermittently
> -
>
> Key: YARN-6747
> URL: https://issues.apache.org/jira/browse/YARN-6747
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Miklos Szegedi
> Attachments: YARN-6747.000.patch, YARN-6747.001.patch
>
>
> *Error Message*
> Apps re-added even before starvation delay passed expected:<4> but was:<3>
> *Stacktrace*
> {code}
> java.lang.AssertionError: Apps re-added even before starvation delay passed expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:117)
> {code}




[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Attachment: YARN-7358.00.patch

> TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly 
> set capacity scheduler
> ---
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
> Attachments: YARN-7358.00.patch
>
>
> TestZKConfigurationStore and TestLeveldbConfigurationStore both assume 
> capacity scheduler in RM. They should set it explicitly in configuration.




[jira] [Commented] (YARN-7339) LocalityMulticastAMRMProxyPolicy should handle cancel request properly

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210130#comment-16210130
 ] 

Hadoop QA commented on YARN-7339:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 3m 14s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7339 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892909/YARN-7339-v3.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18013/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> LocalityMulticastAMRMProxyPolicy should handle cancel request properly
> --
>
> Key: YARN-7339
> URL: https://issues.apache.org/jira/browse/YARN-7339
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-7339-v1.patch, YARN-7339-v2.patch, 
> YARN-7339-v3.patch
>
>
> Currently inside AMRMProxy, LocalityMulticastAMRMProxyPolicy is not handling 
> and splitting cancel requests from the AM properly: 
> # For a node cancel request, we should not treat it as a localized resource 
> request. Otherwise it can lead to an all-weights-zero issue when computing 
> the localized resource weights. 
> # For an ANY cancel, we should broadcast to all known subclusters, not just the 
> ones associated with localized resources. 
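
The two rules in the description above can be sketched as follows. This is a simplified illustration rather than the LocalityMulticastAMRMProxyPolicy code: the Request type, the function names, and the node-to-subcluster mapping are hypothetical, and it assumes that a cancel is a request whose container count is zero and that a node cancel is forwarded only to the subcluster owning that node.

```python
from dataclasses import dataclass

ANY = "*"  # wildcard resource name


@dataclass(frozen=True)
class Request:
    resource_name: str   # node host name or ANY
    num_containers: int  # assumed: 0 means "cancel"


def is_cancel(req):
    return req.num_containers == 0


def localized_requests(requests):
    """Rule 1: node cancels must not feed the locality weight computation."""
    return [r for r in requests
            if r.resource_name != ANY and not is_cancel(r)]


def cancel_targets(req, node_to_subcluster, all_subclusters):
    """Rule 2: an ANY cancel is broadcast to every known subcluster."""
    if not is_cancel(req):
        raise ValueError("not a cancel request")
    if req.resource_name == ANY:
        return set(all_subclusters)
    home = node_to_subcluster.get(req.resource_name)
    # Unknown nodes resolve to no target in this sketch (an assumption).
    return {home} if home is not None else set()
```

Keeping the weight input and the routing decision in separate functions makes it harder for a cancel to influence the weights by accident.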




[jira] [Commented] (YARN-7355) TestDistributedShell should be scheduler agnostic

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210129#comment-16210129
 ] 

Hadoop QA commented on YARN-7355:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 3m 13s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7355 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892910/YARN-7355.00.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18014/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestDistributedShell should be scheduler agnostic 
> --
>
> Key: YARN-7355
> URL: https://issues.apache.org/jira/browse/YARN-7355
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-7355.00.patch
>
>





[jira] [Commented] (YARN-6747) TestFSAppStarvation.testPreemptionEnable fails intermittently

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210128#comment-16210128
 ] 

Daniel Templeton commented on YARN-6747:


"above" has no context when it's in a Jenkins report.  Just sayin'.  Otherwise, 
the patch looks good to me.

> TestFSAppStarvation.testPreemptionEnable fails intermittently
> -
>
> Key: YARN-6747
> URL: https://issues.apache.org/jira/browse/YARN-6747
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Miklos Szegedi
> Attachments: YARN-6747.000.patch, YARN-6747.001.patch
>
>
> *Error Message*
> Apps re-added even before starvation delay passed expected:<4> but was:<3>
> *Stacktrace*
> {code}
> java.lang.AssertionError: Apps re-added even before starvation delay passed expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:117)
> {code}




[jira] [Commented] (YARN-7351) High CPU usage issue in RegistryDNS

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210124#comment-16210124
 ] 

Hadoop QA commented on YARN-7351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 33s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7351 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892911/YARN-7351.yarn-native-services.03.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18015/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> High CPU usage issue in RegistryDNS
> ---
>
> Key: YARN-7351
> URL: https://issues.apache.org/jira/browse/YARN-7351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7351.yarn-native-services.01.patch, 
> YARN-7351.yarn-native-services.02.patch, 
> YARN-7351.yarn-native-services.03.patch
>
>
> Thanks [~aw] for finding this issue.
> The current RegistryDNS implementation is always running on high CPU and 
> pretty much eats one core. 




[jira] [Commented] (YARN-6927) Add support for individual resource types requests in MapReduce

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210123#comment-16210123
 ] 

Daniel Templeton commented on YARN-6927:


# In {{getCustomResourceTypePrefix()}}, the {{Optional}} is overkill.  You 
could just return null.
# In {{TaskAttemptImpl()}}, {{resourceInformation}} isn't used, which means the 
resource types are ignored, and we only use the old style properties.
# It looks like we're now just ignoring the am_prefix.mb and 
am_prefix.cpu-vcores properties.  We should only ignore them if there's a 
conflict.
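
One possible reading of the third point, sketched under stated assumptions: the old-style property is still honoured and is only ignored when a new-style value is also set. The function name and the new-style key used in the example are hypothetical; only am_prefix.mb comes from the comment above, and the configuration is modelled as a plain dictionary.

```python
def resolve_resource(conf, new_key, legacy_key, default):
    """Return (value, conflict) for a single resource setting.

    The new-style value wins when both are present; the legacy value is
    used as a fallback instead of being dropped unconditionally.
    """
    new_value = conf.get(new_key)
    legacy_value = conf.get(legacy_key)
    if new_value is not None:
        # Conflict only when both styles are set and they disagree.
        conflict = legacy_value is not None and legacy_value != new_value
        return new_value, conflict
    if legacy_value is not None:
        return legacy_value, False
    return default, False
```

Returning the conflict flag lets the caller log a warning rather than silently discarding the old-style value.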

> Add support for individual resource types requests in MapReduce
> ---
>
> Key: YARN-6927
> URL: https://issues.apache.org/jira/browse/YARN-6927
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Daniel Templeton
>Assignee: Gergo Repas
> Attachments: YARN-6927.000.patch, YARN-6927.001.patch, 
> YARN-6927.002.patch
>
>
> YARN-6504 adds support for resource profiles in MapReduce jobs, but resource 
> profiles don't give users much flexibility in their resource requests.  To 
> satisfy users' needs, MapReduce should also allow users to specify arbitrary 
> resource requests.




[jira] [Updated] (YARN-6747) TestFSAppStarvation.testPreemptionEnable fails intermittently

2017-10-18 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-6747:
-
Attachment: YARN-6747.001.patch

Thank you, [~templedf]. I tried to be even more specific and updated the jira 
with a new patch.

> TestFSAppStarvation.testPreemptionEnable fails intermittently
> -
>
> Key: YARN-6747
> URL: https://issues.apache.org/jira/browse/YARN-6747
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Miklos Szegedi
> Attachments: YARN-6747.000.patch, YARN-6747.001.patch
>
>
> *Error Message*
> Apps re-added even before starvation delay passed expected:<4> but was:<3>
> *Stacktrace*
> {code}
> java.lang.AssertionError: Apps re-added even before starvation delay passed expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:117)
> {code}




[jira] [Updated] (YARN-7230) Document DockerContainerRuntime for branch-2.8 with proper scope and claim as an experimental feature

2017-10-18 Thread Junping Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-7230:
-
Fix Version/s: 2.8.2

> Document DockerContainerRuntime for branch-2.8 with proper scope and claim as 
> an experimental feature
> -
>
> Key: YARN-7230
> URL: https://issues.apache.org/jira/browse/YARN-7230
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.8.1
>Reporter: Junping Du
>Assignee: Shane Kumpf
>Priority: Blocker
>  Labels: ready-to-commit
> Fix For: 2.8.2
>
> Attachments: YARN-7230.branch-2.8.001.patch, 
> YARN-7230.branch-2.8.002.patch, YARN-7230.branch-2.8.003.patch
>
>
> YARN-5258 documents the new docker container runtime feature, which is 
> already checked in to trunk/branch-2. We need a similar document for 
> branch-2.8. However, given that we missed several patches, we need to narrow 
> the scope to the features/improvements that match the patches that landed in 
> 2.8. Also, as in YARN-6622, we should document it as experimental.




[jira] [Updated] (YARN-7351) High CPU usage issue in RegistryDNS

2017-10-18 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7351:
--
Attachment: YARN-7351.yarn-native-services.03.patch

Uploaded a new patch

> High CPU usage issue in RegistryDNS
> ---
>
> Key: YARN-7351
> URL: https://issues.apache.org/jira/browse/YARN-7351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7351.yarn-native-services.01.patch, 
> YARN-7351.yarn-native-services.02.patch, 
> YARN-7351.yarn-native-services.03.patch
>
>
> Thanks [~aw] for finding this issue.
> The current RegistryDNS implementation is always running on high CPU and 
> pretty much eats one core. 




[jira] [Updated] (YARN-7355) TestDistributedShell should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7355:
-
Attachment: YARN-7355.00.patch

> TestDistributedShell should be scheduler agnostic 
> --
>
> Key: YARN-7355
> URL: https://issues.apache.org/jira/browse/YARN-7355
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: YARN-7355.00.patch
>
>





[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210097#comment-16210097
 ] 

Daniel Templeton commented on YARN-7289:


My question was how fair scheduler was getting involved at all.  In the 
original code I see lots of things hard-coded to capacity scheduler and no 
mention of fair scheduler.

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch, 
> YARN-7289.002.patch
>
>





[jira] [Updated] (YARN-7339) LocalityMulticastAMRMProxyPolicy should handle cancel request properly

2017-10-18 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7339:
---
Attachment: YARN-7339-v3.patch

> LocalityMulticastAMRMProxyPolicy should handle cancel request properly
> --
>
> Key: YARN-7339
> URL: https://issues.apache.org/jira/browse/YARN-7339
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-7339-v1.patch, YARN-7339-v2.patch, 
> YARN-7339-v3.patch
>
>
> Currently inside AMRMProxy, LocalityMulticastAMRMProxyPolicy does not handle 
> and split cancel requests from the AM properly: 
> # A node cancel request should not be treated as a localized resource 
> request. Otherwise it can lead to an all-zero-weight issue when computing 
> localized resource weights. 
> # An ANY cancel should be broadcast to all known subclusters, not just the 
> ones associated with localized resources. 
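
A minimal Java sketch of the two rules in the description, not the actual 
LocalityMulticastAMRMProxyPolicy code. The Request class, the 
node-to-subcluster map and both method names are hypothetical. The sketch 
also assumes that a cancel is a request with a container count of zero and 
that ANY is written as "*".

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class CancelSplitSketch {
  static final String ANY = "*";

  /** Hypothetical, simplified request: a resource name and a container count. */
  static final class Request {
    final String resourceName;
    final int numContainers;

    Request(String resourceName, int numContainers) {
      this.resourceName = resourceName;
      this.numContainers = numContainers;
    }

    /** Assumption: a request for zero containers is a cancel. */
    boolean isCancel() {
      return numContainers == 0;
    }
  }

  /**
   * Rule 1: only non-cancel node requests count as localized demand, so a
   * node cancel never adds a zero-weight entry for its subcluster.
   */
  static Map<String, Integer> localizedWeights(
      List<Request> asks, Map<String, String> nodeToSubcluster) {
    Map<String, Integer> weights = new LinkedHashMap<>();
    for (Request r : asks) {
      if (ANY.equals(r.resourceName) || r.isCancel()) {
        continue;
      }
      String home = nodeToSubcluster.get(r.resourceName);
      if (home != null) {
        weights.merge(home, r.numContainers, Integer::sum);
      }
    }
    return weights;
  }

  /**
   * Rule 2: an ANY cancel goes to every known subcluster; a node cancel goes
   * only to the subcluster that owns the node (or nowhere if unresolved).
   */
  static Set<String> cancelTargets(
      Request cancel, Map<String, String> nodeToSubcluster, Set<String> allSubclusters) {
    if (ANY.equals(cancel.resourceName)) {
      return new TreeSet<>(allSubclusters);
    }
    String home = nodeToSubcluster.get(cancel.resourceName);
    return home == null ? Collections.<String>emptySet() : Collections.singleton(home);
  }
}
```

Keeping the weight computation separate from the routing of cancels means a 
batch that contains only cancels yields an empty weight map rather than a map 
of zeros.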




[jira] [Updated] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-7289:
-
Attachment: YARN-7289.002.patch

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch, 
> YARN-7289.002.patch
>
>





[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210092#comment-16210092
 ] 

Daniel Templeton commented on YARN-7289:


Not a big deal, but {{testApplicationLifetimeMonitorWorker()}} would more 
usually be named {{doTestApplicationLifetimeMonitor()}} or 
{{testApplicationLifetimeMonitorInternal()}}.

Given that you're only testing two schedulers, it might be clearer in 
{{testApplicationLifetimeMonitor()}} to just call 
{{testApplicationLifetimeMonitorWorker()}} twice instead of building the list 
and iterating over it.
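
A rough Java sketch of that suggestion, with the helper renamed along the 
lines proposed. The class name, the string scheduler labels and the recording 
list are hypothetical stand-ins; the real test would configure a scheduler and 
run the monitor checks inside the helper.

```java
import java.util.ArrayList;
import java.util.List;

class SchedulerTestSketch {
  /** Records which schedulers the helper ran with (illustration only). */
  static final List<String> runs = new ArrayList<>();

  /** Hypothetical helper holding the shared test body. */
  static void doTestApplicationLifetimeMonitor(String schedulerName) {
    runs.add(schedulerName);
  }

  /** Two explicit calls instead of building a list and iterating over it. */
  static void testApplicationLifetimeMonitor() {
    doTestApplicationLifetimeMonitor("CapacityScheduler");
    doTestApplicationLifetimeMonitor("FairScheduler");
  }
}
```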

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Comment Edited] (YARN-6747) TestFSAppStarvation.testPreemptionEnable fails intermittently

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210074#comment-16210074
 ] 

Daniel Templeton edited comment on YARN-6747 at 10/18/17 9:36 PM:
--

"Each app is marked as starved exactly once" seems unclear.  How about, 
"Expected each starved app to be marked as starved exactly once during each of 
the two starvation checks, but the number of starved apps registered is not 
twice the number of starved apps." :)


was (Author: templedf):
Not a big deal, but {{testApplicationLifetimeMonitorWorker()}} would more 
usually be named {{doTestApplicationLifetimeMonitor()}} or 
{{testApplicationLifetimeMonitorInternal()}}.

"Each app is marked as starved exactly once" seems unclear.  How about, 
"Expected each starved app to be marked as starved exactly once during each of 
the two starvation checks, but the number of starved apps registered is not 
twice the number of starved apps." :)

> TestFSAppStarvation.testPreemptionEnable fails intermittently
> -
>
> Key: YARN-6747
> URL: https://issues.apache.org/jira/browse/YARN-6747
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Miklos Szegedi
> Attachments: YARN-6747.000.patch
>
>
> *Error Message*
> Apps re-added even before starvation delay passed expected:<4> but was:<3>
> *Stacktrace*
> java.lang.AssertionError: Apps re-added even before starvation delay passed 
> expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:117)
> {code}




[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210088#comment-16210088
 ] 

Miklos Szegedi commented on YARN-7289:
--

Thank you for reviewing the patch. Fair scheduler relied on the 
AbstractYarnScheduler implementation to pass on the timeout, but that 
implementation simply returned -1, ignoring the timeout. 
FairScheduler.checkAndGetApplicationLifetime corrects this. The timeout is 
doubled, since we now run once with each of the fair and capacity schedulers.
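
A small Java sketch of the behaviour described here, not the actual Hadoop 
classes. BaseScheduler and FairLikeScheduler are hypothetical stand-ins, and 
the sketch assumes the corrected override returns the requested lifetime 
unchanged, which the comment does not spell out.

```java
class LifetimeSketch {
  /** Hypothetical base scheduler: drops the requested lifetime. */
  static class BaseScheduler {
    long checkAndGetApplicationLifetime(String queueName, long lifetime) {
      return -1; // the requested lifetime is ignored
    }
  }

  /**
   * Hypothetical fair-scheduler stand-in: overrides the base method so the
   * requested lifetime is no longer lost (assumed behaviour of the fix).
   */
  static class FairLikeScheduler extends BaseScheduler {
    @Override
    long checkAndGetApplicationLifetime(String queueName, long lifetime) {
      return lifetime;
    }
  }
}
```

A scheduler that inherits the base method reports -1 for any requested 
lifetime, so an application submitted with a timeout would never be seen as 
having one.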

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Comment Edited] (YARN-7351) High CPU usage issue in RegistryDNS

2017-10-18 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210085#comment-16210085
 ] 

Jian He edited comment on YARN-7351 at 10/18/17 9:31 PM:
-

[~billie.rinaldi] and I debugged more. In fact, the tcp channel isn't 
working. Usually the DNS lookup goes to the UDP channel, hence the tcp 
channel code was not tested. 
I'll comment out the tcp channel code and open a separate jira for it. 



was (Author: jianhe):
Billie and I debugged more. In fact, the tcp channel isn't working... usually 
the DNS lookup goes to the UDP channel, hence, the tcp channel code was not 
tested. 
I'll comment out the tcp channel code out and open a separate jira for it. 


> High CPU usage issue in RegistryDNS
> ---
>
> Key: YARN-7351
> URL: https://issues.apache.org/jira/browse/YARN-7351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7351.yarn-native-services.01.patch, 
> YARN-7351.yarn-native-services.02.patch
>
>
> Thanks [~aw] for finding this issue.
> The current RegistryDNS implementation is always running on high CPU and 
> pretty much eats one core. 




[jira] [Commented] (YARN-7351) High CPU usage issue in RegistryDNS

2017-10-18 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210085#comment-16210085
 ] 

Jian He commented on YARN-7351:
---

Billie and I debugged more. In fact, the tcp channel isn't working. Usually 
the DNS lookup goes to the UDP channel, hence the tcp channel code was not 
tested. 
I'll comment out the tcp channel code and open a separate jira for it. 


> High CPU usage issue in RegistryDNS
> ---
>
> Key: YARN-7351
> URL: https://issues.apache.org/jira/browse/YARN-7351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7351.yarn-native-services.01.patch, 
> YARN-7351.yarn-native-services.02.patch
>
>
> Thanks [~aw] for finding this issue.
> The current RegistryDNS implementation is always running on high CPU and 
> pretty much eats one core. 




[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Issue Type: Improvement  (was: Bug)

> TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly 
> set capacity scheduler
> ---
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> TestZKConfigurationStore and TestLeveldbConfigurationStore both assume 
> capacity scheduler in RM. They should set it explicitly in configuration.
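
A self-contained Java sketch of pinning the scheduler in test setup instead of 
relying on the default. java.util.Properties stands in for the Hadoop 
Configuration object so the snippet runs without Hadoop on the classpath; the 
class and method names are hypothetical. The property key and the 
CapacityScheduler class name are the standard YARN ones.

```java
import java.util.Properties;

class SchedulerConfigSketch {
  /** YARN property key that selects the ResourceManager scheduler class. */
  static final String RM_SCHEDULER = "yarn.resourcemanager.scheduler.class";

  /** Fully qualified name of the capacity scheduler. */
  static final String CAPACITY_SCHEDULER =
      "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";

  /** Stand-in for a test setUp(): sets the scheduler explicitly. */
  static Properties newTestConf() {
    Properties conf = new Properties();
    conf.setProperty(RM_SCHEDULER, CAPACITY_SCHEDULER);
    return conf;
  }
}
```

In the real tests the equivalent would presumably be a call such as 
conf.setClass(YarnConfiguration.RM_SCHEDULER, CapacityScheduler.class, 
ResourceScheduler.class) on the test configuration.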




[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestZKConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Description: TestZKConfigurationStore and Test

> TestZKConfigurationStore and TestZKConfigurationStore should explicitly set 
> capacity scheduler
> --
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> TestZKConfigurationStore and Test




[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210078#comment-16210078
 ] 

Daniel Templeton commented on YARN-7289:


I'm confused.  How was the fair scheduler causing the problem?  Wasn't the test 
previously hard-coded to use capacity scheduler?  And if the problem was the 
fair scheduler, which you've now fixed, why are you doubling the timeout?

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Priority: Minor  (was: Major)

> TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly 
> set capacity scheduler
> ---
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>
> TestZKConfigurationStore and TestLeveldbConfigurationStore both assume 
> capacity scheduler in RM. They should set it explicitly in configuration.




[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Summary: TestZKConfigurationStore and TestLeveldbConfigurationStore should 
explicitly set capacity scheduler  (was: TestZKConfigurationStore and 
TestZKConfigurationStore should explicitly set capacity scheduler)

> TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly 
> set capacity scheduler
> ---
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> TestZKConfigurationStore and Test




[jira] [Updated] (YARN-7358) TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7358:
-
Description: TestZKConfigurationStore and TestLeveldbConfigurationStore 
both assume capacity scheduler in RM. They should set it explicitly in 
configuration.  (was: TestZKConfigurationStore and Test)

> TestZKConfigurationStore and TestLeveldbConfigurationStore should explicitly 
> set capacity scheduler
> ---
>
> Key: YARN-7358
> URL: https://issues.apache.org/jira/browse/YARN-7358
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> TestZKConfigurationStore and TestLeveldbConfigurationStore both assume 
> capacity scheduler in RM. They should set it explicitly in configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-7358) TestZKConfigurationStore and TestZKConfigurationStore should explicitly set capacity scheduler

2017-10-18 Thread Haibo Chen (JIRA)

Haibo Chen created YARN-7358:


 Summary: TestZKConfigurationStore and TestZKConfigurationStore 
should explicitly set capacity scheduler
 Key: YARN-7358
 URL: https://issues.apache.org/jira/browse/YARN-7358
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Haibo Chen
Assignee: Haibo Chen







[jira] [Commented] (YARN-6747) TestFSAppStarvation.testPreemptionEnable fails intermittently

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210074#comment-16210074
 ] 

Daniel Templeton commented on YARN-6747:


Not a big deal, but {{testApplicationLifetimeMonitorWorker()}} would more 
usually be named {{doTestApplicationLifetimeMonitor()}} or 
{{testApplicationLifetimeMonitorInternal()}}.

"Each app is marked as starved exactly once" seems unclear.  How about, 
"Expected each starved app to be marked as starved exactly once during each of 
the two starvation checks, but the number of starved apps registered is not 
twice the number of starved apps." :)

> TestFSAppStarvation.testPreemptionEnable fails intermittently
> -
>
> Key: YARN-6747
> URL: https://issues.apache.org/jira/browse/YARN-6747
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Miklos Szegedi
> Attachments: YARN-6747.000.patch
>
>
> *Error Message*
> Apps re-added even before starvation delay passed expected:<4> but was:<3>
> *Stacktrace*
> java.lang.AssertionError: Apps re-added even before starvation delay passed 
> expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:117)
> {code}




[jira] [Issue Comment Deleted] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-7289:
---
Comment: was deleted

(was: Not a big deal, but {{testApplicationLifetimeMonitorWorker()}} would more 
usually be named {{doTestApplicationLifetimeMonitor()}} or 
{{testApplicationLifetimeMonitorInternal()}}.

"Each app is marked as starved exactly once" seems unclear.  How about, 
"Expected each starved app to be marked as starved exactly once during each of 
the two starvation checks, but the number of starved apps registered is not 
twice the number of starved apps." :))

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210068#comment-16210068
 ] 

Daniel Templeton commented on YARN-7289:


Not a big deal, but {{testApplicationLifetimeMonitorWorker()}} would more 
usually be named {{doTestApplicationLifetimeMonitor()}} or 
{{testApplicationLifetimeMonitorInternal()}}.

"Each app is marked as starved exactly once" seems unclear.  How about, 
"Expected each starved app to be marked as starved exactly once during each of 
the two starvation checks, but the number of starved apps registered is not 
twice the number of starved apps." :)

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210057#comment-16210057
 ] 

Hadoop QA commented on YARN-7217:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7217 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892892/YARN-7217.yarn-native-services.004.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18012/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch, 
> YARN-7217.yarn-native-services.002.patch, 
> YARN-7217.yarn-native-services.003.patch, 
> YARN-7217.yarn-native-services.004.patch
>
>
> The PUT method for updateService API provides multiple functions:
> # Stopping a service.
> # Starting a service.
> # Increasing or decreasing the number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STARTED.  The user would like to increase the number 
> of containers for the deployed service.  The JSON has been updated to 
> increase the container count.  The PUT method does not actually increase the 
> container count.
> Scenario 2
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This can be addressed by rearranging the logic of START/STOP after the 
> configuration update.  However, there are other potential combinations that 
> can break the PUT method.  For example, a user may want to make configuration 
> changes but not restart the service until a later time.
> The alternative is to separate the PUT method into one PUT method for 
> configuration and one for status.  This increases the number of actions that 
> can be performed.  The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
>   "name":"[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}
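
A minimal Java sketch of why the proposed split avoids the two scenarios in 
the description. It is an in-memory illustration only: the Service class, the 
State enum and the putConfig/putState methods are hypothetical and merely 
mirror the two proposed endpoints, not the real Service API implementation.

```java
class ServiceApiSketch {
  enum State { STOPPED, STARTED }

  /** Hypothetical in-memory stand-in for a deployed service. */
  static final class Service {
    int numberOfContainers;
    State state;

    Service(int numberOfContainers, State state) {
      this.numberOfContainers = numberOfContainers;
      this.state = state;
    }
  }

  /** PUT /ws/v1/services/[service_name]/config: changes configuration only. */
  static void putConfig(Service service, int numberOfContainers) {
    if (numberOfContainers < 0) {
      throw new IllegalArgumentException("number_of_containers must be >= 0");
    }
    service.numberOfContainers = numberOfContainers;
  }

  /** PUT /ws/v1/services/[service_name]/state: changes lifecycle state only. */
  static void putState(Service service, State state) {
    service.state = state;
  }
}
```

Because each handler touches a single field, a configuration change on a 
stopped service is applied without starting it, and a state change leaves the 
container count alone.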




[jira] [Updated] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-10-18 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7217:

Attachment: YARN-7217.yarn-native-services.004.patch

Rebased patch on current HEAD of yarn-native-services branch.

> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7217.yarn-native-services.001.patch, 
> YARN-7217.yarn-native-services.002.patch, 
> YARN-7217.yarn-native-services.003.patch, 
> YARN-7217.yarn-native-services.004.patch
>
>
> The PUT method for updateService API provides multiple functions:
> # Stopping a service.
> # Starting a service.
> # Increasing or decreasing the number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STARTED.  The user would like to increase the number 
> of containers for the deployed service.  The JSON has been updated to 
> increase the container count.  The PUT method does not actually increase the 
> container count.
> Scenario 2
> A user retrieves a Service object from a getService call, and the Service 
> object contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This can be addressed by rearranging the logic of START/STOP after the 
> configuration update.  However, there are other potential combinations that 
> can break the PUT method.  For example, a user may want to make configuration 
> changes but not restart the service until a later time.
> The alternative is to separate the PUT method into one PUT method for 
> configuration and one for status.  This increases the number of actions that 
> can be performed.  The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
>   "name":"[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}




[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210034#comment-16210034
 ] 

Eric Yang commented on YARN-7344:
-

Changes will be in unit test for YARN-7353.

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>Assignee: Eric Badger
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Comment Edited] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210034#comment-16210034
 ] 

Eric Yang edited comment on YARN-7344 at 10/18/17 9:04 PM:
---

Thank you, Eric. Changes will be in unit test for YARN-7353.


was (Author: eyang):
Changes will be in unit test for YARN-7353.

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>Assignee: Eric Badger
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Updated] (YARN-7353) Docker permitted volumes don't properly check for directories

2017-10-18 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-7353:
--
Description: 
{noformat:title=docker-util.c:check_mount_permitted()}
// directory check
permitted_mount_len = strlen(permitted_mounts[i]);
if (permitted_mount_len > 0
    && permitted_mounts[i][permitted_mount_len - 1] == '/') {
  if (strncmp(normalized_path, permitted_mounts[i], permitted_mount_len) == 0) {
    ret = 1;
    break;
  }
}
{noformat}
This code will treat "/home/" as a directory, but not "/home"
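
One way to make the directory check insensitive to the trailing slash is sketched below. This is an illustration only, not the change attached to this issue: path_is_under_dir is a hypothetical helper, and it assumes both paths are already normalized and that the permitted entry is known to be a directory.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of docker-util.c.
 * Returns 1 when normalized_path equals permitted_dir or lies beneath it,
 * whether or not permitted_dir carries a trailing '/'. Returns 0 otherwise.
 * Assumes both paths are already normalized.
 */
int path_is_under_dir(const char *normalized_path, const char *permitted_dir) {
  size_t len = strlen(permitted_dir);

  /* Ignore trailing slashes so "/home/" and "/home" compare alike. */
  while (len > 1 && permitted_dir[len - 1] == '/') {
    len--;
  }
  if (len == 0) {
    return 0;
  }
  if (strncmp(normalized_path, permitted_dir, len) != 0) {
    return 0;
  }
  /* The root directory is a prefix of every absolute path. */
  if (len == 1 && permitted_dir[0] == '/') {
    return 1;
  }
  /* Require a component boundary so "/home" does not match "/homework". */
  return normalized_path[len] == '\0' || normalized_path[len] == '/';
}
```

With this approach "/home" and "/home/" both permit "/home/user", while "/homework" is rejected because the match must end on a path component boundary.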

{noformat}
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] TestDockerUtil.test_check_mount_permitted
[  FAILED  ] TestDockerUtil.test_normalize_mounts
[  FAILED  ] TestDockerUtil.test_add_rw_mounts
{noformat}
Additionally, YARN-6623 introduced new test failures in the C++ 
container-executor test "cetest"

  was:
{noformat:title=docker-util.c:check_mount_permitted()}
// directory check
permitted_mount_len = strlen(permitted_mounts[i]);
if (permitted_mount_len > 0
    && permitted_mounts[i][permitted_mount_len - 1] == '/') {
  if (strncmp(normalized_path, permitted_mounts[i], permitted_mount_len) == 0) {
    ret = 1;
    break;
  }
}
{noformat}
This code will treat "/home/" as a directory, but not "/home"


> Docker permitted volumes don't properly check for directories
> -
>
> Key: YARN-7353
> URL: https://issues.apache.org/jira/browse/YARN-7353
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: YARN-7353.001.patch, YARN-7353.002.patch
>
>
> {noformat:title=docker-util.c:check_mount_permitted()}
> // directory check
> permitted_mount_len = strlen(permitted_mounts[i]);
> if (permitted_mount_len > 0
>     && permitted_mounts[i][permitted_mount_len - 1] == '/') {
>   if (strncmp(normalized_path, permitted_mounts[i], permitted_mount_len) == 0) {
>     ret = 1;
>     break;
>   }
> }
> {noformat}
> This code will treat "/home/" as a directory, but not "/home"
> {noformat}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {noformat}
> Additionally, YARN-6623 introduced new test failures in the C++ 
> container-executor test "cetest"




[jira] [Assigned] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-7344:
---

Assignee: Eric Badger

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>Assignee: Eric Badger
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Resolved] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger resolved YARN-7344.
---
Resolution: Duplicate
  Assignee: (was: Eric Badger)

Sounds good. I'll dup this JIRA to that one and then update the summary in 
YARN-7353

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Assigned] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger reassigned YARN-7344:
-

Assignee: Eric Badger

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>Assignee: Eric Badger
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Commented] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow

2017-10-18 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210026#comment-16210026
 ] 

Daniel Templeton commented on YARN-7262:


bq. most of the other properties in YarnConfiguration don't have Javadocs

And if most of the other properties jumped off a bridge, would you do it, too?  
C'mon, give a downstream developer a break.

bq. Added messages to some assert statements

Add some more.  At a bare minimum, please make sure you have messages for all 
{{assertTrue()}} and {{assertFalse()}} calls.

I'm looking carefully at the ZK code now.  Here are my comments:

# I kinda want to suggest that you also use the HIERARCHIES directory just for 
consistency.  It would also make the _if_ that tests for bad nodes simpler.
# Should you invert the _if_ statements in {{loadRMDelegationTokenState()}}?  
Right now you're testing first if it starts with the prefix and second if it is 
split.  The net result is that you quietly ignore nodes called 1, 2, 3, and 4, 
even if they're in places where they shouldn't be.
# Why is {{TestZKRMStateStore.getDelegationTokenNode()}} public?
# In {{TestZKRMStateStore.storeUpdateAndVerifyDelegationToken()}}, 
{{renewDate}} doesn't need the explicit boxing.

> Add a hierarchy into the ZKRMStateStore for delegation token znodes to 
> prevent jute buffer overflow
> ---
>
> Key: YARN-7262
> URL: https://issues.apache.org/jira/browse/YARN-7262
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-7262.001.patch, YARN-7262.002.patch
>
>
> We've seen users who are running into a problem where the RM is storing so 
> many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those 
> znodes is larger than the jute buffer. This is fine during normal operation, 
> but becomes a problem on a failover because the RM will try to read in all of 
> the token znodes (i.e. call {{getChildren}} on the parent znode).  This is 
> particularly bad because everything appears to be okay, but then if a 
> failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in 
> YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull 
> subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the 
> delegation token znodes.
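
To illustrate the kind of hierarchy being proposed, the sketch below splits a flat znode name into a parent and a child, so that each getChildren call lists fewer names. It is only a sketch under assumptions: build_split_path, the "/tokens" root, and the example node name are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from ZKRMStateStore.
 * Writes "<root>/<head>/<tail>" into out, where tail is the last
 * split_index characters of node_name. With split_index 0, or a name too
 * short to split, the node stays flat: "<root>/<node_name>".
 * Returns 0 on success, -1 if out is too small.
 */
int build_split_path(const char *root, const char *node_name,
                     size_t split_index, char *out, size_t out_len) {
  size_t name_len = strlen(node_name);
  int written;

  if (split_index == 0 || split_index >= name_len) {
    written = snprintf(out, out_len, "%s/%s", root, node_name);
  } else {
    size_t head_len = name_len - split_index;
    /* "%.*s" prints only the first head_len characters of node_name. */
    written = snprintf(out, out_len, "%s/%.*s/%s", root,
                       (int)head_len, node_name, node_name + head_len);
  }
  return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}
```

With a split index of 2, up to 100 sibling names that differ only in their last two characters share one parent, so the listing of the top-level parent shrinks by roughly that factor.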




[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210024#comment-16210024
 ] 

Eric Yang commented on YARN-7344:
-

[~ebadger] Let's resolve this as part of YARN-7353, and update the test case in 
YARN-7353 to include the suggestion in this JIRA.

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Comment Edited] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210006#comment-16210006
 ] 

Eric Yang edited comment on YARN-7344 at 10/18/17 8:58 PM:
---

/usr/bin/touch is probably a better choice from YARN-7353.  [~ebadger] Would 
you like to update your change for YARN-7353 for this JIRA as well?  The lines 
are closely related, and it would be good to base these changes on the other 
changeset.

/etc/localtime is the best source for the symlink test.


was (Author: eyang):
/usr/bin/touch is probably a better choice from YARN-7353.  [~ebadger] Would 
you like to update your change for YARN-7353 for this JIRA as well.  The lines 
are closely related, and good to have changes based on another changeset.

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210011#comment-16210011
 ] 

Eric Badger commented on YARN-7344:
---

bq. /usr/bin/touch is probably a better choice from YARN-7353. Eric Badger 
Would you like to update your change for YARN-7353 for this JIRA as well. The 
lines are closely related, and good to have changes based on another changeset.
So you want me to dup YARN-7353 to this JIRA and make both changes here?

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210009#comment-16210009
 ] 

Eric Badger commented on YARN-7344:
---

bq. Eric Badger Linux Container Executor is Linux centric. The test case 
doesn't apply to Darwin, does it?
Ah I suppose you're right. That solves that problem then. I'll go ahead and 
assign to myself and fix up the tests using /etc and /etc/passwd

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210006#comment-16210006
 ] 

Eric Yang commented on YARN-7344:
-

/usr/bin/touch is probably a better choice from YARN-7353.  [~ebadger] Would 
you like to update your change for YARN-7353 for this JIRA as well?  The lines 
are closely related, and it would be good to base these changes on the other 
changeset.

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Created] (YARN-7357) Several methods in TestZKRMStateStore.TestZKRMStateStoreTester.TestZKRMStateStoreInternal should have @Override annotations

2017-10-18 Thread Daniel Templeton (JIRA)

Daniel Templeton created YARN-7357:
--

 Summary: Several methods in 
TestZKRMStateStore.TestZKRMStateStoreTester.TestZKRMStateStoreInternal should 
have @Override annotations
 Key: YARN-7357
 URL: https://issues.apache.org/jira/browse/YARN-7357
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 3.0.0-beta1
Reporter: Daniel Templeton
Priority: Trivial







[jira] [Created] (YARN-7356) The fair scheduler reservation threshold should be documented in the fair scheduler docs

2017-10-18 Thread Daniel Templeton (JIRA)

Daniel Templeton created YARN-7356:
--

 Summary: The fair scheduler reservation threshold should be 
documented in the fair scheduler docs
 Key: YARN-7356
 URL: https://issues.apache.org/jira/browse/YARN-7356
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 3.0.0-beta1
Reporter: Daniel Templeton


See YARN-3920.




[jira] [Commented] (YARN-7353) Docker permitted volumes don't properly check for directories

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209987#comment-16209987
 ] 

Eric Yang commented on YARN-7353:
-

+1 looks good.

> Docker permitted volumes don't properly check for directories
> -
>
> Key: YARN-7353
> URL: https://issues.apache.org/jira/browse/YARN-7353
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: YARN-7353.001.patch, YARN-7353.002.patch
>
>
> {noformat:title=docker-util.c:check_mount_permitted()}
> // directory check
> permitted_mount_len = strlen(permitted_mounts[i]);
> if (permitted_mount_len > 0
>     && permitted_mounts[i][permitted_mount_len - 1] == '/') {
>   if (strncmp(normalized_path, permitted_mounts[i], permitted_mount_len) == 0) {
>     ret = 1;
>     break;
>   }
> }
> {noformat}
> This code will treat "/home/" as a directory, but not "/home"




[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209977#comment-16209977
 ] 

Hadoop QA commented on YARN-7289:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 11s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7289 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892883/YARN-7289.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18010/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Commented] (YARN-7344) Unit test for white list mount fails on CentOS 7

2017-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209975#comment-16209975
 ] 

Eric Yang commented on YARN-7344:
-

[~ebadger] Linux Container Executor is Linux centric.  The test case doesn't 
apply to Darwin, does it?

> Unit test for white list mount fails on CentOS 7
> 
>
> Key: YARN-7344
> URL: https://issues.apache.org/jira/browse/YARN-7344
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1
> Environment: CentOS Linux release 7.4.1708 (Core)
> cmake3-3.6.3-1.el7.x86_64
> openjdk version "1.8.0_144"
> OpenJDK Runtime Environment (build 1.8.0_144-b01)
> OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)
> Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
> 2017-04-03T15:39:06-04:00)
> libprotoc 2.5.0
>Reporter: Eric Yang
>
> YARN-6623 introduced ability to turn off docker support for container 
> executor.  When running C++ unit tests, the newly introduced tests failed.
> {code}
> [  FAILED  ] 3 tests, listed below:
> [  FAILED  ] TestDockerUtil.test_check_mount_permitted
> [  FAILED  ] TestDockerUtil.test_normalize_mounts
> [  FAILED  ] TestDockerUtil.test_add_rw_mounts
> {code}




[jira] [Updated] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-18 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7102:
---
Attachment: YARN-7102-branch-2.8.v9.patch
YARN-7102-branch-2.v9.patch

Sure, patches for branch-2 and 2.8 attached. 

> NM heartbeat stuck when responseId overflows MAX_INT
> 
>
> Key: YARN-7102
> URL: https://issues.apache.org/jira/browse/YARN-7102
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Critical
> Attachments: YARN-7102-branch-2.8.v9.patch, 
> YARN-7102-branch-2.v9.patch, YARN-7102.v1.patch, YARN-7102.v2.patch, 
> YARN-7102.v3.patch, YARN-7102.v4.patch, YARN-7102.v5.patch, 
> YARN-7102.v6.patch, YARN-7102.v7.patch, YARN-7102.v8.patch, YARN-7102.v9.patch
>
>
> ResponseId overflow problem in the NM-RM heartbeat. This is the same problem 
> as in the AM-RM heartbeat in YARN-6640; please refer to YARN-6640 for details. 
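
For readers unfamiliar with this class of bug, the sketch below shows one common way to keep a heartbeat sequence id non-negative when it reaches the maximum 32-bit value. It is an assumption-laden illustration, not the attached patch: next_response_id is a hypothetical name, and the wrap-to-zero policy is one possible choice.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the YARN patches.
 * Advances a response id and wraps from INT32_MAX back to 0 instead of
 * overflowing into negative values.
 */
int32_t next_response_id(int32_t response_id) {
  /* Add in unsigned arithmetic: signed overflow is undefined in C. */
  uint32_t next = ((uint32_t)response_id + 1u) & (uint32_t)INT32_MAX;
  return (int32_t)next;
}
```

Both sides of the heartbeat would need to compute the expected next id the same way; otherwise a comparison such as "last id + 1" stops matching once the counter wraps.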




[jira] [Created] (YARN-7355) TestDistributedShell should be scheduler agnostic

2017-10-18 Thread Haibo Chen (JIRA)

Haibo Chen created YARN-7355:


 Summary: TestDistributedShell should be scheduler agnostic 
 Key: YARN-7355
 URL: https://issues.apache.org/jira/browse/YARN-7355
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha3
Reporter: Haibo Chen
Assignee: Haibo Chen







[jira] [Commented] (YARN-7127) Merge yarn-native-service branch into trunk

2017-10-18 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209964#comment-16209964
 ] 

Jian He commented on YARN-7127:
---

bq. Would a user be able to replace the bundled AM with their own and retain 
all of the functionality? If someone wanted to replicate the native services 
features, would they be able to do it using only Public APIs?
It's possible.


> Merge yarn-native-service branch into trunk
> ---
>
> Key: YARN-7127
> URL: https://issues.apache.org/jira/browse/YARN-7127
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7127.01.patch, YARN-7127.02.patch, 
> YARN-7127.03.patch, YARN-7127.04.patch
>
>





[jira] [Created] (YARN-7354) Fair scheduler should support application lifetime monitor

2017-10-18 Thread Miklos Szegedi (JIRA)

Miklos Szegedi created YARN-7354:


 Summary: Fair scheduler should support application lifetime monitor
 Key: YARN-7354
 URL: https://issues.apache.org/jira/browse/YARN-7354
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Miklos Szegedi


For details, see the fair-scheduler-specific code in 
TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor added by YARN-7289.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-7289:
-
Attachment: YARN-7289.001.patch

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch, YARN-7289.001.patch
>
>





[jira] [Commented] (YARN-7289) TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out

2017-10-18 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209960#comment-16209960
 ] 

Miklos Szegedi commented on YARN-7289:
--

Indeed, I did some more debugging. The issue was that the test ran with the 
fair scheduler, which overrides the lifetime of the application. I am changing 
the test to run with both the capacity and fair schedulers, and I am also 
adding a small change to the fair scheduler to handle the basic scenario.

> TestApplicationLifetimeMonitor.testApplicationLifetimeMonitor times out
> ---
>
> Key: YARN-7289
> URL: https://issues.apache.org/jira/browse/YARN-7289
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
> Attachments: YARN-7289.000.patch
>
>




