[jira] [Resolved] (YARN-5517) Add GPU as a resource type for scheduling

2018-06-17 Thread Jaeboo Jeong (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong resolved YARN-5517.

Resolution: Duplicate

> Add GPU as a resource type for scheduling
> -
>
> Key: YARN-5517
> URL: https://issues.apache.org/jira/browse/YARN-5517
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Jaeboo Jeong
>Priority: Major
> Attachments: RM-scheduler_metrics.jpg, YARN-5517-branch-2.7.1.patch, 
> aggregate_resource_allocation.jpg, container_example.jpg
>
>
> Currently YARN only supports scheduling based on memory and CPU.
> YARN-3926 proposes extending the YARN resource model, and YARN-4122 proposes 
> adding support for GPU as a resource using Docker.
> Neither has been released yet, so I added a GPU resource type that is 
> handled like memory and CPU.
> Unlike YARN-4122, this patch does not address GPU isolation.
> The properties for the GPU resource type are similar to those for CPU cores.
> mapred-default.xml
> mapreduce.map.gpu.cores (default 0)
> mapreduce.reduce.gpu.cores (default 0)
> yarn.app.mapreduce.am.resource.gpu-cores (default 0)
> yarn-default.xml
> yarn.scheduler.minimum-allocation-gcores (default 0)
> yarn.scheduler.maximum-allocation-gcores (default 8)
> yarn.nodemanager.resource.gcores (default 0)
> I attached the patch for branch-2.7.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5517) Add GPU as a resource type for scheduling on branch-2.7.1

2016-08-12 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-5517:
--

 Summary: Add GPU as a resource type for scheduling on branch-2.7.1
 Key: YARN-5517
 URL: https://issues.apache.org/jira/browse/YARN-5517
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jaeboo Jeong


Currently YARN only supports scheduling based on memory and CPU.
YARN-3926 proposes extending the YARN resource model, and YARN-4122 proposes 
adding support for GPU as a resource using Docker.

Neither has been released yet, so I added a GPU resource type that is handled 
like memory and CPU.
Unlike YARN-4122, this patch does not address GPU isolation.

The properties for the GPU resource type are similar to those for CPU cores.

mapred-default.xml
mapreduce.map.gpu.cores (default 0)
mapreduce.reduce.gpu.cores (default 0)
yarn.app.mapreduce.am.resource.gpu-cores (default 0)

yarn-default.xml
yarn.scheduler.minimum-allocation-gcores (default 0)
yarn.scheduler.maximum-allocation-gcores (default 8)
yarn.nodemanager.resource.gcores (default 0)

I attached the patch for branch-2.7.1.
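
As a rough illustration of what "a GPU resource type like memory and cpu" could mean for allocation, here is a minimal Python sketch. It is not taken from the attached patch: the `Resource` class and the `normalize_gcores` and `fits` helpers are hypothetical names, the clamping behaviour is an assumption, and the bounds only mirror the gcores defaults listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int
    gcores: int = 0  # GPU cores, counted the same way as vcores (assumption)


# Mirror the defaults of yarn.scheduler.{minimum,maximum}-allocation-gcores.
MIN_GCORES = 0
MAX_GCORES = 8


def normalize_gcores(requested: int) -> int:
    """Clamp a requested GPU-core count into the allowed range (assumed policy)."""
    return max(MIN_GCORES, min(requested, MAX_GCORES))


def fits(request: Resource, available: Resource) -> bool:
    """A request fits only if every dimension, GPU cores included, is available."""
    return (request.memory_mb <= available.memory_mb
            and request.vcores <= available.vcores
            and request.gcores <= available.gcores)
```

With yarn.nodemanager.resource.gcores left at its default of 0, a node in this model would advertise no GPU cores, so any request with a non-zero GPU-core count would not fit on it.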






[jira] [Updated] (YARN-5517) Add GPU as a resource type for scheduling on branch-2.7.1

2016-08-12 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-5517:
---
Attachment: RM-scheduler_metrics.jpg
aggregate_resource_allocation.jpg
container_example.jpg

> Add GPU as a resource type for scheduling on branch-2.7.1
> -
>
> Key: YARN-5517
> URL: https://issues.apache.org/jira/browse/YARN-5517
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
> Attachments: RM-scheduler_metrics.jpg, YARN-5517-branch-2.7.1.patch, 
> aggregate_resource_allocation.jpg, container_example.jpg
>
>



[jira] [Updated] (YARN-5517) Add GPU as a resource type for scheduling on branch-2.7.1

2016-08-12 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-5517:
---
Attachment: YARN-5517-branch-2.7.1.patch

> Add GPU as a resource type for scheduling on branch-2.7.1
> -
>
> Key: YARN-5517
> URL: https://issues.apache.org/jira/browse/YARN-5517
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
> Attachments: YARN-5517-branch-2.7.1.patch
>
>



[jira] [Created] (YARN-5578) New network outbound configuration is not applied when NM starts

2016-08-29 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-5578:
--

 Summary: New network outbound configuration is not applied when NM 
starts
 Key: YARN-5578
 URL: https://issues.apache.org/jira/browse/YARN-5578
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jaeboo Jeong
Priority: Minor
 Attachments: YARN-5578.patch

I'm testing outbound network bandwidth shaping by applying YARN-3366.

If yarn.nodemanager.recovery.enabled is true, a new configuration is not 
applied when the NM starts, so the outbound network bandwidth always keeps its 
initial setting.

I added code that updates the state using the list of existing container 
classes when the NM starts.
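
The behaviour described above can be modelled in a few lines of Python. This is only a sketch under assumptions, not the code in YARN-5578.patch: the function name, the `refresh_on_start` flag, and the representation of traffic classes as a dict of class id to rate in Mbit/s are all hypothetical.

```python
def rates_after_restart(recovered, configured_rate_mbit, refresh_on_start):
    """Return per-class outbound rates once the NodeManager has restarted.

    recovered: dict mapping a class id to the rate saved before the restart.
    """
    if not refresh_on_start:
        # Reported behaviour: recovered values win, the new setting is ignored.
        return dict(recovered)
    # Behaviour of the fix as described: walk the existing classes and
    # update each one to the currently configured value.
    return {class_id: configured_rate_mbit for class_id in recovered}
```

For example, with two recovered classes at 100 Mbit/s and a new setting of 500, the first branch returns the old rates unchanged, while the second returns 500 for both classes.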






[jira] [Updated] (YARN-5578) New network outbound configuration is not applied when NM starts

2016-08-29 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-5578:
---
Attachment: YARN-5578.patch

> New network outbound configuration is not applied when NM starts
> 
>
> Key: YARN-5578
> URL: https://issues.apache.org/jira/browse/YARN-5578
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jaeboo Jeong
>Priority: Minor
> Attachments: YARN-5578.patch
>
>



[jira] [Updated] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-08-18 Thread Jaeboo Jeong (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6495:
---
Attachment: YARN-6495.003.patch

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jim Brennan
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch, 
> YARN-6495.003.patch
>
>
> If I execute a simple command like date in a Docker container, the 
> application fails to complete successfully.
> For example:
> {code}
> $ yarn  jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -num_containers 1 -timeout 360
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to 
> complete successfully
> {code}
> The error log looks like this:
> {code}
> ...
> Failed to write pid to file 
> /cgroup_parent/cpu/hadoop-yarn/container_/tasks - No such process
> ...
> {code}
> When writing the pid to the cgroup tasks file, container-executor doesn’t 
> check the Docker container’s status.
> If the container finishes very quickly, the pid can no longer be written to 
> the cgroup tasks file, but that is not a real problem.
> So container-executor needs to check the Docker container’s exit code while 
> writing the pid to the cgroup tasks file.
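
The check proposed above can be sketched roughly as follows. This is an illustration in Python, not the C code of container-executor or the attached patches; `add_to_cgroup`, `default_writer`, the `container_exit_code` callback and the `CGROUP_WRITE_FAILED` value are all hypothetical names chosen for the sketch.

```python
import errno

CGROUP_WRITE_FAILED = -1  # illustrative status, not a real container-executor code


def default_writer(tasks_path, pid):
    # Writing a pid into a cgroup "tasks" file moves that process into the cgroup.
    with open(tasks_path, "w") as f:
        f.write(str(pid))


def add_to_cgroup(tasks_path, pid, container_exit_code, writer=default_writer):
    """Write pid to the cgroup, tolerating a container that has already exited.

    container_exit_code: callable returning the container's exit code,
    or None if the container is still running.
    """
    try:
        writer(tasks_path, pid)
        return 0
    except OSError as e:
        if e.errno == errno.ESRCH:
            # "No such process": the container may simply have finished already.
            code = container_exit_code()
            if code is not None:
                return code  # report the container's own result instead
        return CGROUP_WRITE_FAILED
```

In this model a short-lived command such as date that exits with 0 before the pid is written is reported as a success, while a write failure for a still-running container remains an error.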






[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-08-18 Thread Jaeboo Jeong (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584740#comment-16584740
 ] 

Jaeboo Jeong commented on YARN-6495:


[~Jim_Brennan] Thank you for sharing the issue. I checked it and I agree with 
your opinion.

Please close this issue.

Anyway, I just uploaded a patch that fixes the previous patch.

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jim Brennan
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch, 
> YARN-6495.003.patch
>
>



[jira] [Updated] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-08-18 Thread Jaeboo Jeong (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6495:
---
Attachment: (was: YARN-6495.003.patch)

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jim Brennan
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch
>
>



[jira] [Updated] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-08-18 Thread Jaeboo Jeong (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6495:
---
Attachment: YARN-6495.003.patch

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jim Brennan
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch, 
> YARN-6495.003.patch
>
>



[jira] [Commented] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-07-31 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107424#comment-16107424
 ] 

Jaeboo Jeong commented on YARN-6494:


Thank you for the feedback.

[~dan...@cloudera.com] The current patch follows the other test cases in 
TestDockerContainerRuntime.
DockerLinuxContainerRuntime writes the docker command to a temporary file, so I 
think it's reasonable to compare command strings with the contents of that 
temporary file. Of course, it is true that it is hard to remember the order of 
the commands and to make sure that they are correct.

[~shaneku...@gmail.com] I agree with using a whitelist for safe volume mounts.
However, I think this case is different from ordinary file mounts. The 
short-circuit path is part of the cluster configuration, and cluster users 
shouldn't need to think about it. I see it as just part of the cluster setup.

> add mounting of HDFS Short-Circuit path for docker containers
> -
>
> Key: YARN-6494
> URL: https://issues.apache.org/jira/browse/YARN-6494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
> Attachments: YARN-6494.001.patch, YARN-6494.002.patch
>
>
> Currently there is an error message about HDFS short-circuit when a Docker 
> container starts.
> {code}
> WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: error 
> creating DomainSocket
> java.net.ConnectException: connect(2) error: No such file or directory when 
> trying to connect to ‘xxx’
> at org.apache.hadoop.net.unix.DomainSocket.connect0(Native Method)
> at org.apache.hadoop.net.unix.DomainSocket.connect(DomainSocket.java:250)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.createSocket(DomainSocketFactory.java:164)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.nextDomainPeer(BlockReaderFactory.java:752)
> ...
> {code}
> If dfs.client.read.shortcircuit is true and dfs.domain.socket.path is not 
> an empty string (“”), we need to mount a volume for the short-circuit path.
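
The condition above could look roughly like this in Python. It is a sketch under assumptions rather than the logic of the attached patches: `short_circuit_mount` is a hypothetical helper, the configuration is modelled as a plain dict, and mounting the socket's parent directory at the same path inside the container is an assumed choice. The only real interface used is docker's `-v host:container` volume option.

```python
import os


def short_circuit_mount(conf):
    """Return docker volume arguments for the short-circuit socket, or None."""
    enabled = conf.get("dfs.client.read.shortcircuit", "false").lower() == "true"
    socket_path = conf.get("dfs.domain.socket.path", "")
    if not enabled or not socket_path:
        return None
    # Assumption: expose the directory holding the domain socket at the same
    # path inside the container, so the client finds it where it expects.
    socket_dir = os.path.dirname(socket_path)
    return ["-v", f"{socket_dir}:{socket_dir}"]
```

When either property is missing or empty, the helper returns None and no extra mount would be added to the docker command.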






[jira] [Commented] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-08-02 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112098#comment-16112098
 ] 

Jaeboo Jeong commented on YARN-6494:


If we use a whitelist for all kinds of mounts, the cluster administrator has 
to manage additional settings for the Docker environment, and there is no way 
to know which configuration is missing.
I'm not sure that is fine, because once the cluster setup is finished, we 
expect all containers in the cluster to run in the same environment.

However, I think it is safer for the cluster administrator to be more careful, 
because the administrator does not know which applications are running. I 
think it would be better to document this later.
So I agree to close this issue.

> add mounting of HDFS Short-Circuit path for docker containers
> -
>
> Key: YARN-6494
> URL: https://issues.apache.org/jira/browse/YARN-6494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
> Attachments: YARN-6494.001.patch, YARN-6494.002.patch
>
>



[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-03-23 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412333#comment-16412333
 ] 

Jaeboo Jeong commented on YARN-6495:


The problem seems to arise because the docker command execution and the cgroup 
write are performed independently. However, since the two tasks actually depend 
on each other, I think it would be better to check the command's exit code 
while writing to the cgroup.

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6495.001.patch
>
>



[jira] [Updated] (YARN-6434) When setting environment variables, can't use comma for a list of value in key = value pairs.

2018-03-23 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6434:
---
Attachment: YARN-6434-trunk.001.patch

> When setting environment variables, can't use comma for a list of value in 
> key = value pairs.
> -
>
> Key: YARN-6434
> URL: https://issues.apache.org/jira/browse/YARN-6434
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6434-trunk.001.patch, YARN-6434.001.patch
>
>
> We can set environment variables using yarn.app.mapreduce.am.env, 
> mapreduce.map.env, and mapreduce.reduce.env.
> There is no problem if we use key=value pairs like X=Y, X=$Y.
> However, if we want a key whose value is a comma-separated list 
> (e.g. X=Y,Z), we can’t.
> This is related to YARN-4595.
> The attached patch is based on YARN-3768.
> With the patch, we can set environment variables like below.
> {code}
> mapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker,YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS=\"/dir1:/targetdir1,/dir2:/targetdir2\""
> {code}
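
To make the quoting idea concrete, here is a small Python sketch of a parser that splits on commas only outside double quotes. It is not the implementation in the attached patch or in YARN-3768; `parse_env` is a hypothetical helper written just to show the technique on the example value above.

```python
def parse_env(spec):
    """Split a comma-separated KEY=VALUE list, keeping commas inside double quotes."""
    env, current, in_quotes = {}, [], False

    def flush():
        item = "".join(current).strip()
        if item:
            # Split on the first "=" only, so values may contain "=" themselves.
            key, _, value = item.partition("=")
            env[key.strip()] = value
        current.clear()

    for ch in spec:
        if ch == '"':
            in_quotes = not in_quotes  # toggle quoted state, drop the quote itself
        elif ch == "," and not in_quotes:
            flush()  # an unquoted comma ends the current KEY=VALUE pair
        else:
            current.append(ch)
    flush()
    return env
```

Applied to the example, the mounts variable keeps its full value `/dir1:/targetdir1,/dir2:/targetdir2` instead of being cut at the inner comma.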






[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-03-29 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419118#comment-16419118
 ] 

Jaeboo Jeong commented on YARN-6495:


In write_pid_to_cgroup_as_root(), the various error cases are already checked, 
except for the docker command's status.

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6495.001.patch
>
>



[jira] [Commented] (YARN-6434) When setting environment variables, can't use comma for a list of value in key = value pairs.

2018-04-12 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436615#comment-16436615
 ] 

Jaeboo Jeong commented on YARN-6434:


Jim Brennan, thank you for letting me know. The variables are clearly visible, 
so I think it is good for readability.

> When setting environment variables, can't use comma for a list of value in 
> key = value pairs.
> -
>
> Key: YARN-6434
> URL: https://issues.apache.org/jira/browse/YARN-6434
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6434-trunk.001.patch, YARN-6434.001.patch
>
>



[jira] [Updated] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-04-13 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6495:
---
Attachment: YARN-6495.002.patch

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch
>
>



[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-04-13 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437403#comment-16437403
 ] 

Jaeboo Jeong commented on YARN-6495:


I separated the docker container's exit code from the cgroup failure exit code 
and uploaded the patch file.

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch
>
>
> If I execute simple command like date on docker container, the application 
> failed to complete successfully.
> for example, 
> {code}
> $ yarn  jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -num_containers 1 -timeout 360
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to 
> complete successfully
> {code}
> The error log is like below.
> {code}
> ...
> Failed to write pid to file 
> /cgroup_parent/cpu/hadoop-yarn/container_/tasks - No such process
> ...
> {code}
> When writing pid to cgroup tasks, container-executor doesn’t check docker 
> container’s status.
> If the container finished very quickly, we can’t write pid to cgroup tasks, 
> and it is not problem.
> So container-executor needs to check docker container’s exit code during 
> writing pid to cgroup tasks.






[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2018-04-21 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447062#comment-16447062
 ] 

Jaeboo Jeong commented on YARN-6495:


[~ebadger], oops, the 002 patch is wrong. I will fix it on trunk.

Also, what do you think about handling the two topics (this issue and the 
improvement of write_pid_to_cgroup_as_root()) separately?
As you mentioned, write_pid_to_cgroup_as_root() doesn't differentiate exit codes, 
but that is not the cause of this issue.

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6495.001.patch, YARN-6495.002.patch
>
>
> If I execute simple command like date on docker container, the application 
> failed to complete successfully.
> for example, 
> {code}
> $ yarn  jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -num_containers 1 -timeout 360
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to 
> complete successfully
> {code}
> The error log is like below.
> {code}
> ...
> Failed to write pid to file 
> /cgroup_parent/cpu/hadoop-yarn/container_/tasks - No such process
> ...
> {code}
> When writing pid to cgroup tasks, container-executor doesn’t check docker 
> container’s status.
> If the container finished very quickly, we can’t write pid to cgroup tasks, 
> and it is not problem.
> So container-executor needs to check docker container’s exit code during 
> writing pid to cgroup tasks.






[jira] [Commented] (YARN-5517) Add GPU as a resource type for scheduling

2017-02-26 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884747#comment-15884747
 ] 

Jaeboo Jeong commented on YARN-5517:


This patch is intended only for use with Hadoop 2.7.
I agree with Vinod's comment.

> Add GPU as a resource type for scheduling
> -
>
> Key: YARN-5517
> URL: https://issues.apache.org/jira/browse/YARN-5517
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Jaeboo Jeong
> Attachments: aggregate_resource_allocation.jpg, 
> container_example.jpg, RM-scheduler_metrics.jpg, YARN-5517-branch-2.7.1.patch
>
>
> Currently YARN only support scheduling based on memory and cpu.
> There is the issue(YARN-3926) which proposed to extend the YARN resource 
> model.
> And there is the issue(YARN-4122) to add support for GPU as a resource  using 
> docker.
> But these issues didn’t release yet so I just added GPU resource type like 
> memory and cpu.
> I don’t consider GPU isolation like YARN-4122.
> The properties for GPU resource type is similar to cpu core.
> mapred-default.xml
> mapreduce.map.gpu.cores (default 0)
> mapreduce.reduce.gpu.cores(default 0)
> yarn.app.mapreduce.am.resource.gpu-cores (default 0)
> yarn-default.xml
> yarn.scheduler.minimum-allocation-gcores (default 0)  
> yarn.scheduler.maximum-allocation-gcores (default 8)
> yarn.nodemanager.resource.gcores (default 0)
> I attached the patch for branch-2.7.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4595) Add support for configurable read-only mounts when launching Docker containers

2017-03-27 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942914#comment-15942914
 ] 

Jaeboo Jeong commented on YARN-4595:


If I use multiple mounts (e.g. /dir1:/targetdir1,/dir2:/targetdir2), the 
variables are not assigned correctly, because the comma is used as the 
separator when environment variables are set from the input string (in 
Apps.java).

Example:
{code}mapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker,YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS=/dir1:/targetdir1,/dir2:/targetdir2"{code}
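To illustrate the problem, here is a minimal sketch of comma-based parsing. It is a simplified stand-in written for this comment, not the actual Apps.java code; the class name NaiveEnvSplit and the choice to drop fragments without '=' are assumptions, and the real code may treat such fragments differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NaiveEnvSplit {
    // Simplified stand-in for comma-separated env parsing (NOT the real Apps.java).
    static Map<String, String> parse(String envString) {
        Map<String, String> env = new LinkedHashMap<>();
        // Every comma starts a new entry, including commas inside a value.
        for (String entry : envString.split(",")) {
            int eq = entry.indexOf('=');
            if (eq < 0) {
                // Assumption: a fragment without '=' belongs to no key and is dropped.
                continue;
            }
            env.put(entry.substring(0, eq), entry.substring(eq + 1));
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> env = parse("YARN_CONTAINER_RUNTIME_TYPE=docker,"
            + "YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS="
            + "/dir1:/targetdir1,/dir2:/targetdir2");
        // Prints only the first mount: /dir1:/targetdir1
        System.out.println(env.get("YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS"));
    }
}
```

In this sketch the second mount (/dir2:/targetdir2) never reaches the mounts variable, which matches the symptom described above.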


> Add support for configurable read-only mounts when launching Docker containers
> --
>
> Key: YARN-4595
> URL: https://issues.apache.org/jira/browse/YARN-4595
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-4595.1.patch, YARN-4595.2.patch, YARN-4595.3.patch, 
> YARN-4595.4.patch, YARN-4595.5.patch
>
>
> Mounting files or directories from the host is one way of passing 
> configuration and other information into a docker container.  We could allow 
> the user to set a list of mounts in the environment of ContainerLaunchContext 
> (e.g. /dir1:/targetdir1,/dir2:/targetdir2).  These would be mounted read-only 
> to the specified target locations.
> Due to permissions and user concerns, for this ticket we will require the 
> mounts to be resources that are in the distributed cache.






[jira] [Created] (YARN-6434) When setting environment variables, can't use comma for a list of value in key = value pairs.

2017-04-03 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-6434:
--

 Summary: When setting environment variables, can't use comma for a 
list of value in key = value pairs.
 Key: YARN-6434
 URL: https://issues.apache.org/jira/browse/YARN-6434
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jaeboo Jeong


We can set environment variables using yarn.app.mapreduce.am.env, 
mapreduce.map.env, and mapreduce.reduce.env.
There is no problem with simple key=value pairs like X=Y or X=$Y.
However, we can't set a key whose value is a comma-separated list (e.g. X=Y,Z).
This is related to YARN-4595.

The attached patch is based on YARN-3768.
With it, a list value can be set by quoting it, as below.

{code}
mapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker,YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS=\"/dir1:/targetdir1,/dir2:/targetdir2\""
{code}
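As a rough illustration of the idea (not the attached patch, whose implementation is not shown in this message), a parser can treat commas inside double quotes as part of the value. The class QuotedEnvParser and its methods are hypothetical names used only for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QuotedEnvParser {
    // Splits on commas that are outside double quotes; the quotes are dropped.
    static Map<String, String> parse(String envString) {
        Map<String, String> env = new LinkedHashMap<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < envString.length(); i++) {
            char c = envString.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle state, do not keep the quote
            } else if (c == ',' && !inQuotes) {
                addEntry(env, current.toString());
                current.setLength(0);          // start the next key=value pair
            } else {
                current.append(c);
            }
        }
        addEntry(env, current.toString());     // flush the last pair
        return env;
    }

    private static void addEntry(Map<String, String> env, String entry) {
        int eq = entry.indexOf('=');
        if (eq > 0) {
            env.put(entry.substring(0, eq).trim(), entry.substring(eq + 1));
        }
    }

    public static void main(String[] args) {
        Map<String, String> env = parse("YARN_CONTAINER_RUNTIME_TYPE=docker,"
            + "YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS="
            + "\"/dir1:/targetdir1,/dir2:/targetdir2\"");
        // Prints: /dir1:/targetdir1,/dir2:/targetdir2
        System.out.println(env.get("YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS"));
    }
}
```

The sketch ignores escaped quotes inside values; handling those would need an extra state and is left out to keep the example short.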







[jira] [Updated] (YARN-6434) When setting environment variables, can't use comma for a list of value in key = value pairs.

2017-04-03 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6434:
---
Attachment: YARN-6434.001.patch

> When setting environment variables, can't use comma for a list of value in 
> key = value pairs.
> -
>
> Key: YARN-6434
> URL: https://issues.apache.org/jira/browse/YARN-6434
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
> Attachments: YARN-6434.001.patch
>
>
> We can set environment variables using yarn.app.mapreduce.am.env, 
> mapreduce.map.env, mapreduce.reduce.env.
> There is no problem if we use key=value pairs like X=Y, X=$Y.
> However If we want to set key=a list of value pair(e.g. X=Y,Z), we can’t.
> This is related to YARN-4595.
> The attached patch is based on YARN-3768.
> We can set environment variables like below.
> {code}
> mapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker,YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS=\"/dir1:/targetdir1,/dir2:/targetdir2\""
> {code}






[jira] [Created] (YARN-6454) Add support for setting environment variables for docker containers

2017-04-06 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-6454:
--

 Summary: Add support for setting environment variables for docker 
containers
 Key: YARN-6454
 URL: https://issues.apache.org/jira/browse/YARN-6454
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jaeboo Jeong


Docker lets you set environment variables for your containers with the -e flag.
You can set environment variables as below.

{code}
YARN_CONTAINER_RUNTIME_DOCKER_ENVIRONMENT_VARIABLES="HADOOP_CONF_DIR=$HADOOP_CONF_DIR,HADOOP_HDFS_HOME=/opt/hadoop"
{code}

If you want to set a list of values, apply YARN-6434 first.
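As a hedged sketch of how such a value could be turned into docker run arguments, the snippet below emits one -e KEY=VALUE pair per entry. The class DockerEnvArgs is hypothetical and is not taken from the patch; only the docker -e flag itself is a real option. The literal paths in the example are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

final class DockerEnvArgs {
    // Converts "K1=V1,K2=V2" into ["-e", "K1=V1", "-e", "K2=V2"].
    static List<String> toDockerArgs(String variables) {
        List<String> args = new ArrayList<>();
        for (String pair : variables.split(",")) {
            String trimmed = pair.trim();
            // Skip empty fragments and fragments that have no key before '='.
            if (trimmed.isEmpty() || trimmed.indexOf('=') <= 0) {
                continue;
            }
            args.add("-e");
            args.add(trimmed);
        }
        return args;
    }

    public static void main(String[] args) {
        // Prints: [-e, HADOOP_CONF_DIR=/etc/hadoop/conf, -e, HADOOP_HDFS_HOME=/opt/hadoop]
        System.out.println(toDockerArgs(
            "HADOOP_CONF_DIR=/etc/hadoop/conf,HADOOP_HDFS_HOME=/opt/hadoop"));
    }
}
```

Because the sketch splits on every comma, a value that itself contains commas would be broken apart, which is why list values depend on YARN-6434.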






[jira] [Updated] (YARN-6454) Add support for setting environment variables for docker containers

2017-04-06 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6454:
---
Attachment: YARN-6454.001.patch

> Add support for setting environment variables for docker containers
> ---
>
> Key: YARN-6454
> URL: https://issues.apache.org/jira/browse/YARN-6454
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
> Attachments: YARN-6454.001.patch
>
>
> Docker allows to set environment variables for your containers with the -e 
> flag.
> You can set environment variables like below.
> {code}
> YARN_CONTAINER_RUNTIME_DOCKER_ENVIRONMENT_VARIABLES="HADOOP_CONF_DIR=$HADOOP_CONF_DIR,HADOOP_HDFS_HOME=/opt/hadoop"
> {code}
> If you want to set a list of values, apply YARN-6434 first.






[jira] [Created] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-04-19 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-6494:
--

 Summary: add mounting of HDFS Short-Circuit path for docker 
containers
 Key: YARN-6494
 URL: https://issues.apache.org/jira/browse/YARN-6494
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jaeboo Jeong


Currently, an error message about HDFS short-circuit appears when a docker 
container starts.
{code}
WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: error 
creating DomainSocket
java.net.ConnectException: connect(2) error: No such file or directory when 
trying to connect to ‘xxx’
at org.apache.hadoop.net.unix.DomainSocket.connect0(Native Method)
at org.apache.hadoop.net.unix.DomainSocket.connect(DomainSocket.java:250)
at 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.createSocket(DomainSocketFactory.java:164)
at 
org.apache.hadoop.hdfs.BlockReaderFactory.nextDomainPeer(BlockReaderFactory.java:752)
...
{code}

If dfs.client.read.shortcircuit is true and dfs.domain.socket.path is not 
empty (""), we need to mount a volume for the short-circuit path.
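The condition above can be sketched as follows. This is only an illustration under stated assumptions: a plain Map stands in for the Hadoop configuration, the class ShortCircuitMount is hypothetical, and mounting the parent directory of the socket file at the same path inside the container is my assumption about what "mount volume for short-circuit path" means in practice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ShortCircuitMount {
    // Returns a "host:container" mount spec when short-circuit reads are configured.
    static Optional<String> mountFor(Map<String, String> conf) {
        boolean enabled = Boolean.parseBoolean(
            conf.getOrDefault("dfs.client.read.shortcircuit", "false"));
        String socketPath = conf.getOrDefault("dfs.domain.socket.path", "");
        if (!enabled || socketPath.isEmpty()) {
            return Optional.empty();           // nothing to mount
        }
        int slash = socketPath.lastIndexOf('/');
        if (slash <= 0) {
            return Optional.empty();           // no usable parent directory
        }
        // Assumption: mount the socket's parent directory at the same path.
        String dir = socketPath.substring(0, slash);
        return Optional.of(dir + ":" + dir);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.client.read.shortcircuit", "true");
        conf.put("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
        // Prints: Optional[/var/lib/hadoop-hdfs:/var/lib/hadoop-hdfs]
        System.out.println(mountFor(conf));
    }
}
```

The socket path in main is only an example value; the result could then be passed to docker as a -v argument.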






[jira] [Updated] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-04-19 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6494:
---
Attachment: YARN-6494.001.patch

> add mounting of HDFS Short-Circuit path for docker containers
> -
>
> Key: YARN-6494
> URL: https://issues.apache.org/jira/browse/YARN-6494
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
> Attachments: YARN-6494.001.patch
>
>
> Currently there is a error message about HDFS short-circuit when docker 
> container start.
> {code}
> WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: error 
> creating DomainSocket
> java.net.ConnectException: connect(2) error: No such file or directory when 
> trying to connect to ‘xxx’
> at org.apache.hadoop.net.unix.DomainSocket.connect0(Native Method)
> at org.apache.hadoop.net.unix.DomainSocket.connect(DomainSocket.java:250)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.createSocket(DomainSocketFactory.java:164)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.nextDomainPeer(BlockReaderFactory.java:752)
> ...
> {code}
> if dfs.client.read.shortcircuit is true and dfs.domain.socket.path isn't 
> equal “”, we need to mount volume for short-circuit path.






[jira] [Created] (YARN-6495) check docker container's exit code when writing to cgroup task files

2017-04-19 Thread Jaeboo Jeong (JIRA)
Jaeboo Jeong created YARN-6495:
--

 Summary: check docker container's exit code when writing to cgroup 
task files
 Key: YARN-6495
 URL: https://issues.apache.org/jira/browse/YARN-6495
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jaeboo Jeong


If I execute a simple command like date in a docker container, the application 
fails to complete successfully.

For example:
{code}
$ yarn  jar 
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
 -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
$HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
 -num_containers 1 -timeout 360

…
17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
loop
17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to complete 
successfully
{code}

The error log is shown below.
{code}
...
Failed to write pid to file /cgroup_parent/cpu/hadoop-yarn/container_/tasks 
- No such process
...
{code}

When writing the pid to the cgroup tasks file, container-executor doesn't check 
the docker container's status.
If the container finishes very quickly, the pid can't be written to the cgroup 
tasks file, but that is not a real problem.
So container-executor needs to check the docker container's exit code when 
writing the pid to the cgroup tasks file.
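The intended decision can be modelled roughly as below. container-executor itself is written in C; this Java sketch only models the logic and is not the patch. The class name, the method, and the failure code value 100 are all hypothetical.

```java
final class CgroupWriteDecision {
    // Hypothetical value; the real container-executor error codes are not shown here.
    static final int HYPOTHETICAL_CGROUP_WRITE_FAILED = 100;

    // Decides which exit code to report after trying to write the pid to cgroup tasks.
    static int resolveExitCode(boolean pidWritten,
                               boolean containerExited,
                               int dockerExitCode) {
        if (pidWritten) {
            return 0;                          // write succeeded, nothing to report
        }
        if (containerExited) {
            // The process was already gone ("No such process"), so the failed
            // write is harmless; report what the container itself returned.
            return dockerExitCode;
        }
        // The container is still running, so the failed write is a real error.
        return HYPOTHETICAL_CGROUP_WRITE_FAILED;
    }

    public static void main(String[] args) {
        // A quick "date" container that already exited with 0: prints 0
        System.out.println(resolveExitCode(false, true, 0));
        // Write failed while the container is still running: prints 100
        System.out.println(resolveExitCode(false, false, 0));
    }
}
```

With this split, a short-lived command such as date would no longer be reported as a failed application just because its pid disappeared before the cgroup write.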






[jira] [Updated] (YARN-6495) check docker container's exit code when writing to cgroup task files

2017-04-19 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6495:
---
Attachment: YARN-6495.001.patch

> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
> Attachments: YARN-6495.001.patch
>
>
> If I execute simple command like date on docker container, the application 
> failed to complete successfully.
> for example, 
> {code}
> $ yarn  jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -num_containers 1 -timeout 360
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to 
> complete successfully
> {code}
> The error log is like below.
> {code}
> ...
> Failed to write pid to file 
> /cgroup_parent/cpu/hadoop-yarn/container_/tasks - No such process
> ...
> {code}
> When writing pid to cgroup tasks, container-executor doesn’t check docker 
> container’s status.
> If the container finished very quickly, we can’t write pid to cgroup tasks, 
> and it is not problem.
> So container-executor needs to check docker container’s exit code during 
> writing pid to cgroup tasks.






[jira] [Updated] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-04-21 Thread Jaeboo Jeong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaeboo Jeong updated YARN-6494:
---
Attachment: YARN-6494.002.patch

> add mounting of HDFS Short-Circuit path for docker containers
> -
>
> Key: YARN-6494
> URL: https://issues.apache.org/jira/browse/YARN-6494
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
> Attachments: YARN-6494.001.patch, YARN-6494.002.patch
>
>
> Currently there is a error message about HDFS short-circuit when docker 
> container start.
> {code}
> WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: error 
> creating DomainSocket
> java.net.ConnectException: connect(2) error: No such file or directory when 
> trying to connect to ‘xxx’
> at org.apache.hadoop.net.unix.DomainSocket.connect0(Native Method)
> at org.apache.hadoop.net.unix.DomainSocket.connect(DomainSocket.java:250)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.createSocket(DomainSocketFactory.java:164)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.nextDomainPeer(BlockReaderFactory.java:752)
> ...
> {code}
> if dfs.client.read.shortcircuit is true and dfs.domain.socket.path isn't 
> equal “”, we need to mount volume for short-circuit path.






[jira] [Commented] (YARN-6494) add mounting of HDFS Short-Circuit path for docker containers

2017-04-21 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978913#comment-15978913
 ] 

Jaeboo Jeong commented on YARN-6494:


On branch-3.0.0-alpha2, the patch compiled fine.
But on trunk, there is a compile error because of HDFS-11596.

I made a new patch.

> add mounting of HDFS Short-Circuit path for docker containers
> -
>
> Key: YARN-6494
> URL: https://issues.apache.org/jira/browse/YARN-6494
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
> Attachments: YARN-6494.001.patch, YARN-6494.002.patch
>
>
> Currently there is a error message about HDFS short-circuit when docker 
> container start.
> {code}
> WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: error 
> creating DomainSocket
> java.net.ConnectException: connect(2) error: No such file or directory when 
> trying to connect to ‘xxx’
> at org.apache.hadoop.net.unix.DomainSocket.connect0(Native Method)
> at org.apache.hadoop.net.unix.DomainSocket.connect(DomainSocket.java:250)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.createSocket(DomainSocketFactory.java:164)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.nextDomainPeer(BlockReaderFactory.java:752)
> ...
> {code}
> if dfs.client.read.shortcircuit is true and dfs.domain.socket.path isn't 
> equal “”, we need to mount volume for short-circuit path.






[jira] [Commented] (YARN-6495) check docker container's exit code when writing to cgroup task files

2017-04-25 Thread Jaeboo Jeong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983033#comment-15983033
 ] 

Jaeboo Jeong commented on YARN-6495:


I developed and tested on branch-2.7 because my cluster is running 2.7.
To use docker containers in LCE, I applied the patches for all issues under 
YARN-3611.

However, when I tested against the branch-2 
commit (3b7bb7b94b1974e74556e787e6bec7549040b3a5), the application also failed 
to complete successfully. I did nothing special to trigger the failure.


> check docker container's exit code when writing to cgroup task files
> 
>
> Key: YARN-6495
> URL: https://issues.apache.org/jira/browse/YARN-6495
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Jaeboo Jeong
>Assignee: Jaeboo Jeong
> Attachments: YARN-6495.001.patch
>
>
> If I execute simple command like date on docker container, the application 
> failed to complete successfully.
> for example, 
> {code}
> $ yarn  jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar 
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
>  -num_containers 1 -timeout 360
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to 
> complete successfully
> {code}
> The error log is like below.
> {code}
> ...
> Failed to write pid to file 
> /cgroup_parent/cpu/hadoop-yarn/container_/tasks - No such process
> ...
> {code}
> When writing pid to cgroup tasks, container-executor doesn’t check docker 
> container’s status.
> If the container finished very quickly, we can’t write pid to cgroup tasks, 
> and it is not problem.
> So container-executor needs to check docker container’s exit code during 
> writing pid to cgroup tasks.


