[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607445#comment-16607445 ]
Jonathan Hung edited comment on YARN-8200 at 9/7/18 6:06 PM: ------------------------------------------------------------- Build https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779 timed out: {noformat}cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt 2>&1 Elapsed: 2m 40s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt 2>&1 Elapsed: 15m 20s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt 2>&1 Elapsed: 4m 49s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt 2>&1 Elapsed: 79m 41s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt 2>&1 Elapsed: 3m 59s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt 2>&1 Build timed out (after 500 minutes). Marking the build as aborted. Build was aborted Performing Post build task... Match found for :. : True Logical operation result is TRUE Running script : #!/bin/bash{noformat} It appears the unit tests hang here: (https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt) {noformat}[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-yarn-client --- [INFO] Compiling 34 source files to /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/target/test-classes [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java:[311,6] [deprecation] MiniYARNCluster(String,int,int,int,int,boolean) in MiniYARNCluster has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[453,16] [deprecation] onIncreaseContainerResourceError(ContainerId,Throwable) in AbstractCallbackHandler has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[306,16] [deprecation] onContainerResourceIncreased(ContainerId,Resource) in AbstractCallbackHandler has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java:[205,62] [unchecked] unchecked call to handle(T) as a member of the raw type EventHandler [INFO] [INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-yarn-client --- [INFO] [INFO] ------------------------------------------------------- [INFO] T E S T S [INFO] ------------------------------------------------------- [INFO] Running org.apache.hadoop.yarn.client.TestYarnApiClasses [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.553 s - in org.apache.hadoop.yarn.client.TestYarnApiClasses [INFO] Running org.apache.hadoop.yarn.client.TestRMFailover{noformat} Though this is similar to the HDFS unit tests hanging in HADOOP-15711/HDFS-12711, so I suspect it's not related to the unit test itself. Or they are both doing something equally bad. was (Author: jhung): Build https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779 timed out: {noformat}cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt 2>&1 Elapsed: 2m 40s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt 2>&1 Elapsed: 15m 20s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt 2>&1 Elapsed: 4m 49s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt 2>&1 Elapsed: 79m 41s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt 2>&1 Elapsed: 3m 59s cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client /opt/maven/bin/mvn --batch-mode -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-branch-2-patch-0 -Ptest-patch -Pparallel-tests -Pshelltest -Pnative -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt 2>&1 Build timed out (after 500 minutes). Marking the build as aborted. Build was aborted Performing Post build task... Match found for :. : True Logical operation result is TRUE Running script : #!/bin/bash{noformat} It appears the unit tests hang here: (https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/21779/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt) {noformat}[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-yarn-client --- [INFO] Compiling 34 source files to /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/target/test-classes [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java:[311,6] [deprecation] MiniYARNCluster(String,int,int,int,int,boolean) in MiniYARNCluster has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[453,16] [deprecation] onIncreaseContainerResourceError(ContainerId,Throwable) in AbstractCallbackHandler has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java:[306,16] [deprecation] onContainerResourceIncreased(ContainerId,Resource) in AbstractCallbackHandler has been deprecated [WARNING] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java:[205,62] [unchecked] unchecked call to handle(T) as a member of the raw type EventHandler [INFO] [INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-yarn-client --- [INFO] [INFO] ------------------------------------------------------- [INFO] T E S T S [INFO] ------------------------------------------------------- [INFO] Running org.apache.hadoop.yarn.client.TestYarnApiClasses [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.553 s - in org.apache.hadoop.yarn.client.TestYarnApiClasses [INFO] Running org.apache.hadoop.yarn.client.TestRMFailover{noformat} Though this is similar to the HDFS unit tests hanging in HADOOP-15711/HDFS-12711, so I suspect it's not related to the unit test itself. > Backport resource types/GPU features to branch-2 > ------------------------------------------------ > > Key: YARN-8200 > URL: https://issues.apache.org/jira/browse/YARN-8200 > Project: Hadoop YARN > Issue Type: Task > Reporter: Jonathan Hung > Assignee: Jonathan Hung > Priority: Major > Attachments: YARN-8200-branch-2.001.patch, > counter.scheduler.operation.allocate.csv.defaultResources, > counter.scheduler.operation.allocate.csv.gpuResources, synth_sls.json > > > Currently we have a need for GPU scheduling on our YARN clusters to support > deep learning workloads. However, our main production clusters are running > older versions of branch-2 (2.7 in our case). To prevent supporting too many > very different hadoop versions across multiple clusters, we would like to > backport the resource types/resource profiles feature to branch-2, as well as > the GPU specific support. > > We have done a trial backport of YARN-3926 and some miscellaneous patches in > YARN-7069 based on issues we uncovered, and the backport was fairly smooth. > We also did a trial backport of most of YARN-6223 (sans docker support). > > Regarding the backports, perhaps we can do the development in a feature > branch and then merge to branch-2 when ready. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org