[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-31 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347158#comment-16347158
 ] 

Eric Badger commented on YARN-7677:
---

+1 (non-binding). Looks good to me. Thanks for fixing this [~Jim_Brennan]

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch, YARN-7677.002.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-30 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345645#comment-16345645
 ] 

Jason Lowe commented on YARN-7677:
--

Thanks for the patch!

+1 lgtm.  I will commit this tomorrow if there are no objections.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch, YARN-7677.002.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345537#comment-16345537
 ] 

genericqa commented on YARN-7677:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 36s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
28s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7677 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908365/YARN-7677.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 20d6d9ece274 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 901d15a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19527/testReport/ |
| Max. process+thread count | 441 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19527/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> HADOOP_CONF_DIR should not be automatically put 

[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-30 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345301#comment-16345301
 ] 

Jim Brennan commented on YARN-7677:
---

I believe the unit test failure (testContainerUpgradeRollbackDueToFailure) is 
unrelated to this change.

I will fix the style issues and submit a new patch.


> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345289#comment-16345289
 ] 

genericqa commented on YARN-7677:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 16s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 4 new + 157 unchanged - 0 fixed = 161 total (was 157) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 45s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 21s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 53s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7677 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908215/YARN-7677.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 31bf9018ad69 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 901d15a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19526/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/19526/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server

[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-29 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344137#comment-16344137
 ] 

Jim Brennan commented on YARN-7677:
---

Based on the discussion here and in YARN-7226, and after discussing with 
[~jlowe] and [~ebadger], I have put up a patch for this.

The change is as follows:
 # Remove the line in ContainerLaunch.sh that explicitly adds HADOOP_CONF_DIR 
to the environment (as noted above).
 # Instead of ignoring the whitelist in the case of docker, always add the 
whitelist environment variables that are not already defined in the containers 
context using theĀ {{var:-default}} variable expansion syntax.

In the docker case where a whitelist environment variable is defined in the 
image, this will prevent the launch script from overwriting it with the one 
from the Nodemanager's environment.

The non-docker case behaves the same as before, except that the whitelisted 
environment variables 
 that are not defined by the container context are set using theĀ 
{{var:-default}} syntax, but in this case the default value is always used.


> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-7677.001.patch
>
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316623#comment-16316623
 ] 

Eric Badger commented on YARN-7677:
---

bq. This change will make yarnfile content more consistent that environment 
variable and mounting directories both needs to present in yarnfile to show 
HADOOP_CONF_DIR is exposed to docker container.
Yes, that is correct. I'll go ahead an put up a patch in a little bit once I 
get a free moment.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310385#comment-16310385
 ] 

Eric Yang commented on YARN-7677:
-

[~ebadger] I think I understand the goal now.  Thank you.  In current 
implementation, partial declaration of bind-mount may allow HADOOP_CONF_DIR to 
be sourced implicitly without proper user consent.  This change will make 
yarnfile content more consistent that environment variable and mounting 
directories both needs to present in yarnfile to show {{HADOOP_CONF_DIR}} is 
exposed to docker container.  

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310327#comment-16310327
 ] 

Eric Badger commented on YARN-7677:
---

bq. Some Hadoop features will not work, i.e. short circuits read, if host and 
docker containers are not matching.
This is true. I would like to work towards a solution where we use something 
similar {{dfs.domain.socket.path}}, since it already defines the short-circuit 
socket. However, I'm not sure how to do that without copying the config, since 
this is a dfs property that will be used by the datanode (i.e. not the 
container-executor).

bq. If we handle security properly with white list mount (YARN-5534), 
container-executor validation (YARN-7590), and check sudo privileges before 
launching privileged container (YARN-7221). Any particular reason that we 
shouldn't allow read-only bind-mount HADOOP_CONF_DIR?
Nope, I don't think there is any problem with bind-mounting 
{{HADOOP_CONF_DIR}}. However, I don't think it should be a requirement. For 
example, you should be able to use an older version of hadoop as the client 
(task), while the server (NM) uses a newer version. If we pass in 
{{HADOOP_CONF_DIR}} then this is not possible. If we are constantly 
bind-mounting in hadoop to all of the containers, then we lose some of the 
wonder of docker, which is that the container stays constant and consistent 
over time. Some may choose to bind-mount hadoop, but it should be a choice, not 
a requirement

bq. White list is used by container-executor, which resides in host, and not 
docker container. How is the by pass happens?
This happens because of a call in {{ContainerLaunch.java}} that automatically 
adds {{HADOOP_CONF_DIR}} to the environment. This environment is parsed in 
{{launch_container.sh}}, which is the script that the docker container is 
started with.

{noformat:title=ContainerLaunch.sh}
1388putEnvIfAbsent(environment, Environment.HADOOP_CONF_DIR.name());
{noformat}

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310288#comment-16310288
 ] 

Eric Yang commented on YARN-7677:
-

I think I am stuck on understanding:

{quote}
It completely bypasses the whitelist and so there is no way for a task to not 
have HADOOP_CONF_DIR set. 
{quote}

White list is used by container-executor, which resides in host, and not docker 
container.  How is the by pass happens?

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310273#comment-16310273
 ] 

Eric Yang commented on YARN-7677:
-

[~ebadger] My initial reaction would be to make docker container to follow the 
host layout for single cluster setup.  Some Hadoop features will not work, i.e. 
short circuits read, if host and docker containers are not matching.  For 
docker container to regenerate fine tuned Hadoop configuration to match host, 
it could take some effort from docker container developer.  There is a high 
probability that developers end up using bind-mount to transport Hadoop 
configuration.

If we handle security properly with white list mount (YARN-5534), 
container-executor validation (YARN-7590), and check sudo privileges before 
launching privileged container (YARN-7221).  Any particular reason that we 
shouldn't allow read-only bind-mount {{HADOOP_CONF_DIR}}?  

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310247#comment-16310247
 ] 

Eric Badger commented on YARN-7677:
---

[~eyang], it doesn't necessarily need to be separate hadoop clusters. It could 
just be a node where the NM runs on the bare metal host and the tasks run in 
docker containers. In that case, they would need to know where 
{{HADOOP_CONF_DIR}} is. Since the docker image is completely separate from the 
host layout, we can't assume that hadoop is going to be put in the same place. 
{{HADOOP_CONF_DIR}} isn't getting bind-mounted into the container, so the only 
way this would even work is by a happy coincidence and/or planning the layout 
of the image to match that of the host. But that coupling is certainly not 
necessary and the docker image is the one that actually knows where 
{{HADOOP_CONF_DIR}} is located. The nodemanager knows where its 
{{HADOOP_CONF_DIR}} is located, but that is on the host, not in the docker 
container. 

And again, since {{HADOOP_CONF_DIR}} is in the default env whitelist, the 
behavior here will only change if you explicitly change the env whitelist and 
remove it. So I believe the impact here to be fairly low. Regardless, I don't 
think it's correct for the NM to be defining the layout of the docker image 
(i.e. where {{HADOOP_CONF_DIR}} has to be located). 

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309995#comment-16309995
 ] 

Eric Yang commented on YARN-7677:
-

[~ebadger] I think I understand your scenario better now.  When host and docker 
environments runs two separate Hadoop clusters, we do not want the host Hadoop 
configuration to be exposed to docker because disk settings and file system 
layout do not apply.  Other scenarios, such as HDFS is outside of docker 
container, and running Spark python application in docker container to access 
host level HDFS, Hadoop configuration should be inherited from host to make 
sure well optimized timing settings are exposed.  For huge clusters, the first 
scenario maybe used to isolate virtual clusters.  

For smaller clusters, it is most likely to run mix workload and use docker to 
isolate programming libraries.  Host level node manager white list can not get 
overwritten by container.  I think both cases can be supported, and the default 
is probably inheriting {{HADOOP_CONF_DIR}} for smaller clusters to boost 
efficient utilization of system resource.  It would be better if we build a 
switch for env_reset as part of job submission flag to disable Hadoop system 
environment variable inheritance.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-03 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309906#comment-16309906
 ] 

Eric Badger commented on YARN-7677:
---

[~eyang], the docker container file system layout doesn't have to be the same 
as the host. It's entirely possible for an image to have {{HADOOP_CONF_DIR}} 
set to something like {{/home/conf}}, while the host has it in {{/tmp/conf}}. 
In the current implementation, there is no way for {{HADOOP_CONF_DIR}} to be 
set correctly in this case without the user explicitly setting it with their 
job. Even if you set up {{HADOOP_CONF_DIR}} in the image as an environment 
variable, it will be overridden by the container startup script, since 
{{HADOOP_CONF_DIR}} is being passed in by default regardless. In the case where 
a user doesn't set {{HADOOP_CONF_DIR}} explicitly, it makes sense that the 
docker image will know where to set it (via ENV), while the NM will not, since 
their layouts are not necessarily the same.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2018-01-02 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308357#comment-16308357
 ] 

Eric Yang commented on YARN-7677:
-

[~ebadger] Happy new year.  I think it will be safer for {HADOOP_CONF_DIR} to 
be passed from host to docker image as default.  This is better for preventing 
mistakes instead of allowing override system specific settings at container 
level.  This will also ensure that when an application requires system 
settings, docker doesn't need to reconstruct the environment, but simply mount 
the {HADOOP_CONF_DIR} as source of truth.  If docker container wants to 
generate its own environment, there shouldn't be anything getting in the way 
for docker application to accomplish that.  I don't understand how is this 
paramount for docker case, could you elaborate?  Thanks

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2017-12-20 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298953#comment-16298953
 ] 

Eric Badger commented on YARN-7677:
---

My proposal would be to remove {{HADOOP_CONF_DIR}}, as well as potentially 
{{USER}}, {{LOGNAME}}, {{HOME}}, and {{PWD}} from ContainerLaunch.java and 
require them to be in the environment whitelist if they are to be taken from 
Nodemanager environment. Arguably, these all should be removed, but the 
strongest case can be made for {{HADOOP_CONF_DIR}}, since it is already in the 
default environment whitelist. So the only way this would break a use case is 
if someone was using their own whitelist and didn't include 
{{HADOOP_CONF_DIR}}. 

While this change would be incompatible, I think it makes sense for the 
non-docker case, and is paramount for the docker case.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment

2017-12-20 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298946#comment-16298946
 ] 

Eric Badger commented on YARN-7677:
---

Linking YARN-3611 since this is related to Docker development. Not putting it 
as a subtask, however, because this JIRA has impacts outside of Docker.

> HADOOP_CONF_DIR should not be automatically put in task environment
> ---
>
> Key: YARN-7677
> URL: https://issues.apache.org/jira/browse/YARN-7677
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether 
> it's set by the user or not. It completely bypasses the whitelist and so 
> there is no way for a task to not have {{HADOOP_CONF_DIR}} set. This causes 
> problems in the Docker use case where Docker containers will set up their own 
> environment and have their own {{HADOOP_CONF_DIR}} preset in the image 
> itself. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org