[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607445#comment-16607445 ]

Jonathan Hung edited comment on YARN-8200 at 9/7/18 6:06 PM:
-------------------------------------------------------------

Build https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779 timed out:
{noformat}cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt 2>&1
Elapsed:   2m 40s
cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt 2>&1
Elapsed:  15m 20s
cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt 2>&1
Elapsed:   4m 49s
cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt 2>&1
Elapsed:  79m 41s
cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt 2>&1
Elapsed:   3m 59s
cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client
/opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt 2>&1
Build timed out (after 500 minutes). Marking the build as aborted.
Build was aborted
Performing Post build task...
Match found for :. : True
Logical operation result is TRUE
Running script  : #!/bin/bash{noformat}

It appears the unit tests hang here: 
(https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt)
{noformat}[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-yarn-client ---
[INFO] Compiling 34 source files to /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/target/test-classes
[WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java:[311,6] [deprecation] MiniYARNCluster(String,int,int,int,int,boolean) in MiniYARNCluster has been deprecated
[WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[453,16] [deprecation] onIncreaseContainerResourceError(ContainerId,Throwable) in AbstractCallbackHandler has been deprecated
[WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[306,16] [deprecation] onContainerResourceIncreased(ContainerId,Resource) in AbstractCallbackHandler has been deprecated
[WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java:[205,62] [unchecked] unchecked call to handle(T) as a member of the raw type EventHandler
[INFO] 
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-yarn-client ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.yarn.client.TestYarnApiClasses
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.553 s - in org.apache.hadoop.yarn.client.TestYarnApiClasses
[INFO] Running org.apache.hadoop.yarn.client.TestRMFailover{noformat}

This looks similar to the HDFS unit tests hanging in 
HADOOP-15711/HDFS-12711, though, so I suspect the hang is not related to this 
unit test itself. Either that, or both are doing something equally bad.
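
For anyone who wants to double-check which test class never finished in a surefire log like the one above, here is a rough sketch. It assumes the log format shown (a "[INFO] Running <class>" line, and a result line ending in "- in <class>"); the function name unfinished_tests is made up for illustration and is not part of Yetus or surefire.

```python
import re

# Assumed log shapes, taken from the surefire output quoted above:
#   [INFO] Running <fully.qualified.TestClass>
#   [INFO] Tests run: ..., Time elapsed: ... s - in <fully.qualified.TestClass>
RUNNING = re.compile(r"^\[INFO\] Running ([\w.$]+)")
FINISHED = re.compile(r"-\s*in ([\w.$]+)\s*$")


def unfinished_tests(log_text):
    """Return test classes with a 'Running' line but no result line.

    Hypothetical helper: a class that was started and never reported a
    result is a candidate for the hang.
    """
    started = []
    finished = set()
    for line in log_text.splitlines():
        match = RUNNING.match(line)
        if match:
            started.append(match.group(1))
            continue
        match = FINISHED.search(line)
        if match:
            finished.add(match.group(1))
    # Preserve start order so the first hanging class is listed first.
    return [name for name in started if name not in finished]
```

Feeding it the contents of patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt should list only the classes that were started but never reported a result.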



> Backport resource types/GPU features to branch-2
> ------------------------------------------------
>
>                 Key: YARN-8200
>                 URL: https://issues.apache.org/jira/browse/YARN-8200
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>         Attachments: YARN-8200-branch-2.001.patch, 
> counter.scheduler.operation.allocate.csv.defaultResources, 
> counter.scheduler.operation.allocate.csv.gpuResources, synth_sls.json
>
>
> Currently we have a need for GPU scheduling on our YARN clusters to support 
> deep learning workloads. However, our main production clusters are running 
> older versions of branch-2 (2.7 in our case). To prevent supporting too many 
> very different hadoop versions across multiple clusters, we would like to 
> backport the resource types/resource profiles feature to branch-2, as well as 
> the GPU specific support.
>  
> We have done a trial backport of YARN-3926 and some miscellaneous patches in 
> YARN-7069 based on issues we uncovered, and the backport was fairly smooth. 
> We also did a trial backport of most of YARN-6223 (sans docker support).
>  
> Regarding the backports, perhaps we can do the development in a feature 
> branch and then merge to branch-2 when ready.


