[jira] [Commented] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
[ https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821959#comment-17821959 ] Bilwa S T commented on YARN-11654: -- cc [~slfan1989] [~steve_l] > [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails > > > Key: YARN-11654 > URL: https://issues.apache.org/jira/browse/YARN-11654 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Labels: pull-request-available > > [ERROR] TestLinuxContainerExecutorWithMocks.testStartLocalizer:310 > Expected size:<26> but was:<28> in: > <["nobody", > "test", > "0", > "application_0", > "12345", > "/bin/nmPrivateCTokensPath", > > "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir", > "src/test/resources", > > "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java", > "-classpath", > > 
"/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/test-classes:/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes:/Users/bilwa/.m2/repository/org/apache/hadoop/hadoop-common/3.5.0-SNAPSHOT/hadoop-common-3.5.0-SNAPSHOT.jar:/Users/bilwa/.m2/repository/org/apache/hadoop/thirdparty/hadoop-shaded-protobuf_3_7/1.1.1/hadoop-shaded-protobuf_3_7-1.1.1.jar:/Users/bilwa/.m2/repository/com/google/guava/guava/27.0-jre/guava-27.0-jre.jar:/Users/bilwa/.m2/repository/com/google/guava/failureaccess/1.0/failureaccess-1.0.jar:/Users/bilwa/.m2/repository/com/google/guava/listenablefuture/.0-empty-to-avoid-conflict-with-guava/listenablefuture-.0-empty-to-avoid-conflict-with-guava.jar:/Users/bilwa/.m2/repository/org/checkerframework/checker-qual/2.5.2/checker-qual-2.5.2.jar:/Users/bilwa/.m2/repository/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar:/Users/bilwa/.m2/repository/org/codehaus/mojo/animal-sniffer-annotations/1.17/animal-sniffer-annotations-1.17.jar:/Users/bilwa/.m2/repository/commons-cli/commons-cli/1.5.0/commons-cli-1.5.0.jar:/Users/bilwa/.m2/repository/org/apache/commons/commons-math3/3.6.1/commons-math3-3.6.1.jar:/Users/bilwa/.m2/repository/org/apache/httpcomponents/httpclient/4.5.13/httpclient-4.5.13.jar:/Users/bilwa/.m2/repository/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar:/Users/bilwa/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/bilwa/.m2/repository/commons-io/commons-io/2.14.0/commons-io-2.14.0.jar:/Users/bilwa/.m2/repository/commons-net/commons-net/3.9.0/commons-net-3.9.0.jar:/Users/bilwa/.m2/repository/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.jar:/Users/bilwa/.m2/repository/jakarta/activation/jakarta.activation-api/1.2.1/jakarta.activation-api-1.2.1.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-server/9.4.53.v20231
009/jetty-server-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-http/9.4.53.v20231009/jetty-http-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-io/9.4.53.v20231009/jetty-io-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-servlet/9.4.53.v20231009/jetty-servlet-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-security/9.4.53.v20231009/jetty-security-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-util-ajax/9.4.53.v20231009/jetty-util-ajax-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-webapp/9.4.53.v20231009/jetty-webapp-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/org/eclipse/jetty/jetty-xml/9.4.53.v20231009/jetty-xml-9.4.53.v20231009.jar:/Users/bilwa/.m2/repository/javax/servlet/jsp/jsp-api/2.1/jsp-api-2.1.jar:/Users/bilwa/.m2/repository/com/sun/jersey/jersey-servlet/1.19.4/jersey-servlet-1.19.4.jar:/Users/bilwa/.m2/repository/com/sun/jersey/jersey-server/1.19.4/jersey-server-1.19.4.jar:/Users/bilwa/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/bilwa/.m2/repository/commons-beanutils/commons-beanutils/1.9.4/commons-beanutils-1.9.4.jar:/Users/bilwa/.m2/repository/org/apache/commons/commons-configuration2/2.8.0/commons-configuration2-2.8.0.jar:/Users/bilwa/.m2/repository/org/apache/commons/commons-lang3/3.12.0/commons-lang3-3.12.0.jar:/Users/bilwa/.m2/repository/org/apache/commons/commons-text/1.10.0/commons-text-1.10.0.jar:/Users/bilwa/.m2/repository/org/slf4j/slf4j-log4j12/1.7.30/slf4j-log4j12-1.7.30.jar:/Users/bilwa/.m2/repository/org/apache/avro/avro/1.9.2/avro-1.9.2.jar:/Users/bilwa/.m2/repository/com/google/re2j/re2j/1.1
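The assertion above fails because the captured localizer command contains two more tokens under JDK 17 than the test expects, plausibly a JPMS flag plus its value injected into the Java launch. One hedged way a test could normalize the command line before counting tokens (the helper name and the exact flag set are assumptions, not the YARN-11654 patch):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not the actual YARN-11654 fix): drop JPMS flags such
 * as "--add-opens"/"--add-exports" pairs that JDK 17 launches tend to
 * inject, so an argument-count assertion stays stable across JDK versions.
 */
public class LocalizerArgFilter {
  public static List<String> withoutModuleFlags(List<String> cmd) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < cmd.size(); i++) {
      String a = cmd.get(i);
      if (a.equals("--add-opens") || a.equals("--add-exports") || a.equals("--add-modules")) {
        i++; // the flag's value is the next token; skip it too
      } else if (!a.startsWith("--add-opens=") && !a.startsWith("--add-exports=")
          && !a.startsWith("--add-modules=")) {
        out.add(a); // keep everything that is not a module-system flag
      }
    }
    return out;
  }
}
```

A test could run the captured command through such a filter before asserting its size, leaving the non-JVM arguments untouched.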
[jira] [Comment Edited] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
[ https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814336#comment-17814336 ] Bilwa S T edited comment on YARN-11654 at 2/5/24 12:23 PM: --- cc [~snemeth] [~ayushsaxena] [~brahmareddy] was (Author: bilwast): cc [~snemeth][~ayushsaxena][~brahmareddy] > [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails > > > Key: YARN-11654 > URL: https://issues.apache.org/jira/browse/YARN-11654 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Labels: pull-request-available > > [ERROR] TestLinuxContainerExecutorWithMocks.testStartLocalizer:310 > Expected size:<26> but was:<28> in: > <["nobody", > "test", > "0", > "application_0", > "12345", > "/bin/nmPrivateCTokensPath", > > "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir", > "src/test/resources", > > "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java", > "-classpath", > > 
[jira] [Commented] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
[ https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814336#comment-17814336 ] Bilwa S T commented on YARN-11654: -- cc [~snemeth][~ayushsaxena][~brahmareddy] > [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails > > > Key: YARN-11654 > URL: https://issues.apache.org/jira/browse/YARN-11654 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Labels: pull-request-available > > [ERROR] TestLinuxContainerExecutorWithMocks.testStartLocalizer:310 > Expected size:<26> but was:<28> in: > <["nobody", > "test", > "0", > "application_0", > "12345", > "/bin/nmPrivateCTokensPath", > > "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir", > "src/test/resources", > > "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java", > "-classpath", > > 
[jira] [Updated] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
[ https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-11654: - Description: [ERROR] TestLinuxContainerExecutorWithMocks.testStartLocalizer:310 Expected size:<26> but was:<28> in: <["nobody", "test", "0", "application_0", "12345", "/bin/nmPrivateCTokensPath", "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir", "src/test/resources", "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java", "-classpath", "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/test-classes:/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes:/Users/bilwa/.m2/repository/org/apache/hadoop/hadoop-common/3.5.0-SNAPSHOT/hadoop-common-3.5.0-SNAPSHOT.jar:/Users/bilwa/.m2/repository/org/apache/hadoop/thirdparty/hadoop-shaded-protobuf_3_7/1.1.1/hadoop-shaded-protobuf_3_7-1.1.1.jar:/Users/bilwa/.m2/repository/com/google/guava/guava/27.0-jre/guava-27.0-jre.jar:/Users/bilwa/.m2/repository/com/google/guava/failureaccess/1.0/failureaccess-1.0.jar:/Users/bilwa/.m2/repository/com/google/guava/listenablefuture/.0-empty-to-avoid-conflict-with-guava/listenablefuture-.0-empty-to-avoid-conflict-with-guava.jar:/Users/bilwa/.m2/repository/org/checkerframework/checker-qual/2.5.2/checker-qual-2.5.2.jar:/Users/bilwa/.m2/repository/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar:/Users/bilwa/.m2/repository/org/codehaus/mojo/animal-sniffer-annotations/1.17/animal-sniffer-annotations-1.17.jar:/Users/bilwa/.m2/repository/commons-cli/commons-cli/1.5.0/commons-cli-1.5.0.jar:/Users/bilwa/.m2/repository/org/apache/commons/commons-math3/3.6.1/commons-math3-3.6.1.jar:/Users/bilwa/.m2/repository/org/apache/httpcomponents/httpclient/4.5.13/httpclient-4.5.13.jar:/Users/bilwa/.m2/reposit
[jira] [Created] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
Bilwa S T created YARN-11654: Summary: [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails Key: YARN-11654 URL: https://issues.apache.org/jira/browse/YARN-11654 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.4.0 Reporter: Bilwa S T Assignee: Bilwa S T Expected size:<26> but was:<28> in: <["nobody", "test", "0", "application_0", "12345", "/bin/nmPrivateCTokensPath", "/workspace/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir", "src/test/resources", "/usr/lib/jvm/jdk-17.0.9/bin/java", "-classpath", "/workspace/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/test-classes:/workspace/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes:/home/mwapp/.m2/repository/org/apache/hadoop/hadoop-common/3.3.6-13/hadoop-common-3.3.6-13.jar:/home/mwapp/.m2/repository/org/apache/hadoop/thirdparty/hadoop-shaded-protobuf_3_7/1.1.1/hadoop-shaded-protobuf_3_7-1.1.1.jar:/home/mwapp/.m2/repository/com/google/guava/guava/32.0.1-jre/guava-32.0.1-jre.jar:/home/mwapp/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar:/home/mwapp/.m2/repository/com/google/guava/listenablefuture/.0-empty-to-avoid-conflict-with-guava/listenablefuture-.0-empty-to-avoid-conflict-with-guava.jar:/home/mwapp/.m2/repository/org/checkerframework/checker-qual/3.33.0/checker-qual-3.33.0.jar:/home/mwapp/.m2/repository/com/google/j2objc/j2objc-annotations/2.8/j2objc-annotations-2.8.jar:/home/mwapp/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/mwapp/.m2/repository/org/apache/commons/commons-math3/3.1.1/commons-math3-3.1.1.jar:/home/mwapp/.m2/repository/org/apache/httpcomponents/httpclient/4.5.13/httpclient-4.5.13.jar:/home/mwapp/.m2/repository/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar:/home/mwapp/.m2/repository/commons-io/commons-io/2.8.0/commons-io-2.8.0.jar:/home/mwapp
/.m2/repository/commons-net/commons-net/3.9.0/commons-net-3.9.0.jar:/home/mwapp/.m2/repository/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.jar:/home/mwapp/.m2/repository/jakarta/activation/jakarta.activation-api/1.2.1/jakarta.activation-api-1.2.1.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-server/9.4.53.v20231009/jetty-server-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-http/9.4.53.v20231009/jetty-http-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-io/9.4.53.v20231009/jetty-io-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-servlet/9.4.53.v20231009/jetty-servlet-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-security/9.4.53.v20231009/jetty-security-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-util-ajax/9.4.53.v20231009/jetty-util-ajax-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-webapp/9.4.53.v20231009/jetty-webapp-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/org/eclipse/jetty/jetty-xml/9.4.53.v20231009/jetty-xml-9.4.53.v20231009.jar:/home/mwapp/.m2/repository/javax/servlet/jsp/jsp-api/2.1/jsp-api-2.1.jar:/home/mwapp/.m2/repository/com/sun/jersey/jersey-servlet/1.19.4/jersey-servlet-1.19.4.jar:/home/mwapp/.m2/repository/com/sun/jersey/jersey-server/1.19.4/jersey-server-1.19.4.jar:/home/mwapp/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/home/mwapp/.m2/repository/ch/qos/reload4j/reload4j/1.2.22/reload4j-1.2.22.jar:/home/mwapp/.m2/repository/commons-beanutils/commons-beanutils/1.9.4/commons-beanutils-1.9.4.jar:/home/mwapp/.m2/repository/org/apache/commons/commons-configuration2/2.8.0/commons-configuration2-2.8.0.jar:/home/mwapp/.m2/repository/org/apache/commons/commons-lang3/3.12.0/commons-lang3-3.12.0.jar:/home/mwapp/.m2/repository/org/apache/commons/commons-text/1.10.0/commons-text-1.10.0.jar:/home/mwapp/.m2/repository/org/apache/avro/avro
/1.7.7/avro-1.7.7.jar:/home/mwapp/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/home/mwapp/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar:/home/mwapp/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar:/home/mwapp/.m2/repository/com/google/re2j/re2j/1.1/re2j-1.1.jar:/home/mwapp/.m2/repository/com/google/code/gson/gson/2.9.0/gson-2.9.0.jar:/home/mwapp/.m2/repository/org/apache/hadoop/hadoop-auth/3.3.6-13/hadoop-auth-3.3.6-13.jar:/home/mwapp/.m2/repository/com/nimbusds/nimbus-jose-jwt/9.8.1/nimbus-jose-jwt-9.8.1.jar:/home/mwapp/.m2/repository/com/github/stephenc/jcip/jcip-annotations/1.0-1/jcip-annotations-1.0-1.jar:/home/mwapp/.m2/repository/org/apache/kerby/kerb-simplekdc/1.0.1/kerb-simplekdc-1.
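The mismatch (expected 26, got 28) amounts to two extra command tokens on the JDK 17 run. A hypothetical way a test could encode a version-dependent expectation (the threshold and the +2 are assumptions about this environment, not the actual fix):

```java
/** Hypothetical sketch: version-dependent expected argument count for the test. */
public class ExpectedArgCount {
  public static int expected(int base, int javaFeature) {
    // Assumption: JDK 17+ injects one module flag plus its value (two tokens).
    return javaFeature >= 17 ? base + 2 : base;
  }
}
```

At runtime the feature version is available via `Runtime.version().feature()` on Java 10 and later.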
[jira] [Commented] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released
[ https://issues.apache.org/jira/browse/YARN-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554035#comment-17554035 ] Bilwa S T commented on YARN-11181: -- cc [~bibinchundatt] [~brahma] [~prabhujoseph]
> Applications in Pending state as AM resources are not updated when resources from other queue gets released
>
> Key: YARN-11181
> URL: https://issues.apache.org/jira/browse/YARN-11181
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bilwa S T
> Priority: Major
>
> Configure two queues, q1 and q2.
> Let's say the AM resource limit for both q1 and q2 is <5gb, 5vcores>.
> 1. Run a long-running application on q1 that occupies 70% of cluster resources.
> 2. Run a small application on q2.
> 3. Run one long-running job on q2 and a few more small jobs.
> 4. Once the small application submitted to q2 finishes, the AM resource limit decreases to <2gb, 2vcores>.
> 5. Kill the long-running application submitted to q1.
> Now the long-running job submitted to q2 will be running while all other jobs remain in the Pending state.
> This is because LeafQueue#activateApplications is called only when an AM starts running or finishes.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released
[ https://issues.apache.org/jira/browse/YARN-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-11181: - Description:
Configure two queues, q1 and q2.
Let's say the AM resource limit for both q1 and q2 is <5gb, 5vcores>.
1. Run a long-running application on q1 that occupies 70% of cluster resources.
2. Run a small application on q2.
3. Run one long-running job on q2 and a few more small jobs.
4. Once the small application submitted to q2 finishes, the AM resource limit decreases to <2gb, 2vcores>.
5. Kill the long-running application submitted to q1.
Now the long-running job submitted to q2 will be running while all other jobs remain in the Pending state.
This is because LeafQueue#activateApplications is called only when an AM starts running or finishes.
> Applications in Pending state as AM resources are not updated when resources from other queue gets released
>
> Key: YARN-11181
> URL: https://issues.apache.org/jira/browse/YARN-11181
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bilwa S T
> Priority: Major
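The scenario above reduces to this: pending applications are only re-examined inside an activation routine that runs when an AM starts or finishes, so capacity released by another queue never wakes them. A toy model of that activation loop (names and numbers are illustrative, not the real CapacityScheduler API):

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Toy model of AM-resource-limited activation; not the real LeafQueue.
 * Apps move from pending to running only inside activate(), so unless
 * something calls it when the AM limit grows, apps stay pending.
 */
public class ToyQueue {
  private final Queue<Integer> pendingAmMb = new ArrayDeque<>();
  private int usedAmMb;
  private int amLimitMb;
  private int running;

  public ToyQueue(int amLimitMb) { this.amLimitMb = amLimitMb; }

  public void submit(int amMb) { pendingAmMb.add(amMb); activate(); }

  /** In the reported scenario this only runs when an AM starts or finishes. */
  private void activate() {
    while (!pendingAmMb.isEmpty() && usedAmMb + pendingAmMb.peek() <= amLimitMb) {
      usedAmMb += pendingAmMb.poll();
      running++;
    }
  }

  /** The missing hook, per the report: re-check pending apps when the limit grows. */
  public void onAmLimitRecomputed(int newAmLimitMb) {
    amLimitMb = newAmLimitMb;
    activate();
  }

  public int running() { return running; }
}
```

With a 2048 MB limit, a 1024 MB AM activates but a second 2048 MB AM stays pending until `onAmLimitRecomputed` raises the limit, mirroring step 5 of the scenario.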
[jira] [Created] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released
Bilwa S T created YARN-11181: Summary: Applications in Pending state as AM resources are not updated when resources from other queue gets released Key: YARN-11181 URL: https://issues.apache.org/jira/browse/YARN-11181 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419162#comment-17419162 ] Bilwa S T commented on YARN-9606: - Thanks [~pbacsko]
> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Fix For: 3.4.0, 3.3.2
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, YARN-9606-branch-3.3-v2.patch, YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}
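The SSLHandshakeException above is consistent with the web-service client using JVM-default SSL settings instead of the cluster's. In Hadoop, `AuthenticatedURL` accepts a `ConnectionConfigurator`, and `SSLFactory` implements that interface; a standalone sketch of the same shape using only JDK types (the class and method names below are illustrative, not the Hadoop API):

```java
import java.net.HttpURLConnection;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLSocketFactory;

/**
 * Illustrative sketch: every HTTPS connection the client opens gets the
 * configured SSL socket factory instead of the JVM default. In the actual
 * fix this role is played by Hadoop's SSLFactory passed to AuthenticatedURL.
 */
public class SslConfigurator {
  private final SSLSocketFactory socketFactory;

  public SslConfigurator(SSLSocketFactory socketFactory) {
    this.socketFactory = socketFactory;
  }

  /** Mirrors ConnectionConfigurator#configure: adjust and return the connection. */
  public HttpURLConnection configure(HttpURLConnection conn) {
    if (conn instanceof HttpsURLConnection) {
      ((HttpsURLConnection) conn).setSSLSocketFactory(socketFactory);
    }
    return conn; // plain HTTP connections pass through untouched
  }
}
```

The key design point is that the configurator is applied to every connection the client creates, so both Kerberos-authenticated and anonymous HTTPS requests use the cluster truststore.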
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417613#comment-17417613 ] Bilwa S T commented on YARN-9606: - Hi [~pbacsko], can we merge this?
> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, YARN-9606-branch-3.3-v2.patch, YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
> at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}
[jira] [Commented] (YARN-10812) yarn service number of containers count is wrong when flexing
[ https://issues.apache.org/jira/browse/YARN-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389219#comment-17389219 ] Bilwa S T commented on YARN-10812: -- cc [~eyang] > yarn service number of containers count is wrong when flexing > - > > Key: YARN-10812 > URL: https://issues.apache.org/jira/browse/YARN-10812 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > Currently, say there are 2 containers running in a service. > The user asks for 2 more by flexing, but there are resources available for only 1 more container to run; still, the number of containers is updated to 4, which is wrong.
[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368815#comment-17368815 ] Bilwa S T commented on YARN-10824: -- Thanks [~Jim_Brennan] [~epayne] for your review comments. I have updated the patch. Please check. > Title not set for JHS and NM webpages > - > > Key: YARN-10824 > URL: https://issues.apache.org/jira/browse/YARN-10824 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Rajshree Mishra >Assignee: Bilwa S T >Priority: Major > Attachments: JHS URL.jpg, NM URL.jpg, YARN-10824.001.patch, > YARN-10824.002.patch > > > The following issue was reported by one of our internal web security check > tools: > Passing a title to the JobHistoryServer (JHS) or NodeManager (NM) pages using a > URL similar to: > [https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22] > or > [https://[hostname]:[nm_port]/node?title=12345] > causes the page title to be set to the supplied value. > [Image attached]
[jira] [Updated] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10824: - Attachment: YARN-10824.002.patch > Title not set for JHS and NM webpages
[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9606: Attachment: YARN-9606-branch-3.3-v2.patch > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364992#comment-17364992 ] Bilwa S T commented on YARN-10824: -- cc [~jbrennan] [~epayne] > Title not set for JHS and NM webpages
[jira] [Updated] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10824: - Attachment: YARN-10824.001.patch > Title not set for JHS and NM webpages
[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364990#comment-17364990 ] Bilwa S T commented on YARN-10824: -- Injection can happen here, so to avoid that we can simply set a fixed title on the JHS and NM pages. > Title not set for JHS and NM webpages
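The fix suggested in the comment above can be sketched as a tiny helper. This is a minimal illustration only; the class and method names are hypothetical stand-ins, not Hadoop's actual webapp API.

```java
// Hypothetical sketch: ignore any caller-supplied "title" query parameter
// and always render a fixed page title, so attacker-controlled values such
// as "12345'\"" never reach the generated HTML.
public class PageTitle {
  private static final String FIXED_TITLE = "NodeManager information";

  public static String resolve(String requestedTitle) {
    // Intentionally drop the request-supplied value.
    return FIXED_TITLE;
  }
}
```

Setting the title server-side rather than echoing the query parameter removes the injection vector entirely, at the cost of not supporting per-request titles.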
[jira] [Assigned] (YARN-10824) Title not set for JHS and NM webpages
[ https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10824: Assignee: Bilwa S T > Title not set for JHS and NM webpages
[jira] [Updated] (YARN-10800) Yarn service container should be removed from list when its completed/stopped
[ https://issues.apache.org/jira/browse/YARN-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10800: - Attachment: YARN-10800.001.patch > Yarn service container should be removed from list when its completed/stopped > - > > Key: YARN-10800 > URL: https://issues.apache.org/jira/browse/YARN-10800 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-10800.001.patch > > > When we query for the container list using ServiceClient.getStatus, the returned list includes even finished containers. Currently, finished containers are removed only when a flex-down is done, not when a container is shut down/completed.
[jira] [Created] (YARN-10812) yarn service number of containers count is wrong when flexing
Bilwa S T created YARN-10812: Summary: yarn service number of containers count is wrong when flexing Key: YARN-10812 URL: https://issues.apache.org/jira/browse/YARN-10812 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T Currently, say there are 2 containers running in a service. The user asks for 2 more by flexing, but there are resources available for only 1 more container to run; still, the number of containers is updated to 4, which is wrong.
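The accounting described above can be sketched as follows. This is an illustrative helper only, with hypothetical names; it is not YARN's actual service-AM API, but it shows the expected behavior: the recorded count should grow only by the number of containers the scheduler could actually satisfy.

```java
// Hypothetical sketch: clamp the flex-up increment to what resources allow.
// With 2 running, 2 requested, and capacity for only 1, the count should
// become 3, not 4.
public class FlexAccounting {
  public static int updatedCount(int running, int requestedExtra,
                                 int satisfiableExtra) {
    return running + Math.min(requestedExtra, satisfiableExtra);
  }
}
```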
[jira] [Commented] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times
[ https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355634#comment-17355634 ] Bilwa S T commented on YARN-10767: -- Hi [~dmmkr] Thanks for the patch. I have one minor comment: * RMHAUtils.findActiveRMHAId can return null if none of the RMs are active. I think we should have a null check here. [~jbrennan] can you please take a look at this issue? > Yarn Logs Command retrying on Standby RM for 30 times > - > > Key: YARN-10767 > URL: https://issues.apache.org/jira/browse/YARN-10767 > Project: Hadoop YARN > Issue Type: Bug >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Major > Attachments: YARN-10767.001.patch > > > When ResourceManager HA is enabled and the first RM is unavailable, on > executing "bin/yarn logs -applicationId -am 1", we get a > ConnectionException for connecting to the first RM; the ConnectionException > occurs 30 times before it tries to connect to the second RM. > > This can be optimized by trying to fetch the logs from the Active RM.
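The null check asked for in the review comment can be sketched like this. A `Supplier` stands in for the `RMHAUtils.findActiveRMHAId` call so the sketch is self-contained; the class and method names here are illustrative, not Hadoop's.

```java
import java.util.function.Supplier;

// Minimal sketch of the suggested guard: findActiveRMHAId can return null
// when no RM is active, so callers should fall back to a configured RM id
// rather than dereference the result directly.
public class ActiveRm {
  public static String idOrFallback(Supplier<String> findActiveRMHAId,
                                    String fallbackId) {
    String id = findActiveRMHAId.get();
    return (id != null) ? id : fallbackId;
  }
}
```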
[jira] [Created] (YARN-10800) Yarn service container should be removed from list when its completed/stopped
Bilwa S T created YARN-10800: Summary: Yarn service container should be removed from list when its completed/stopped Key: YARN-10800 URL: https://issues.apache.org/jira/browse/YARN-10800 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T When we query for the container list using ServiceClient.getStatus, the returned list includes even finished containers. Currently, finished containers are removed only when a flex-down is done, not when a container is shut down/completed.
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353772#comment-17353772 ] Bilwa S T commented on YARN-9606: - Hi [~pbacsko] I have attached a patch for branch-3.3. It was failing because the new classes had some code differences. > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9606: Attachment: YARN-9606-branch-3.3.v1.patch > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9606: Attachment: YARN-9606-branch-3.3.v1.patch.patch > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-9606: Attachment: (was: YARN-9606-branch-3.3.v1.patch.patch) > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Issue Comment Deleted] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times
[ https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10767: - Comment: was deleted (was: Hi [~dmmkr] Thanks for patch. I have one minor comment: * RMHAUtils.findActiveRMHAId can return null if none of the RM's are active. I think we should have null check here. [~jbrennan] can you please take a look at this issue?) > Yarn Logs Command retrying on Standby RM for 30 times
[jira] [Commented] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times
[ https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351660#comment-17351660 ] Bilwa S T commented on YARN-10767: -- Hi [~dmmkr] Thanks for the patch. I have one minor comment: * RMHAUtils.findActiveRMHAId can return null if none of the RMs are active. I think we should have a null check here. [~jbrennan] can you please take a look at this issue? > Yarn Logs Command retrying on Standby RM for 30 times
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347692#comment-17347692 ] Bilwa S T commented on YARN-9606: - [~pbacsko] Can you please backport this too? As it had a dependency on YARN-10120, we were not able to backport it earlier. Thank you. > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347653#comment-17347653 ] Bilwa S T commented on YARN-10725: -- There is no major change. You can keep the commit message the same as in trunk, [~pbacsko]. Thank you. > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch, > YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, > YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, > YARN-10725-branch-3.3.v5.patch, image-2021-04-05-16-48-57-034.png, > image-2021-04-05-16-50-55-238.png
[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347627#comment-17347627 ] Bilwa S T commented on YARN-10725: -- Yes [~pbacsko], but there is one whitespace issue that needs to be fixed. Shall I upload a new patch fixing it? > Backport YARN-10120 to branch-3.3
[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347338#comment-17347338 ] Bilwa S T commented on YARN-10725: -- Hi [~brahmareddy] [~pbacsko] can you please check the latest patch? I think we can ignore the checkstyle issues; I will fix the whitespace issue. > Backport YARN-10120 to branch-3.3
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10725: - Attachment: YARN-10725-branch-3.3.v5.patch > Backport YARN-10120 to branch-3.3
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346577#comment-17346577 ] Bilwa S T commented on YARN-10258: -- Thanks [~pbacsko] for committing this. Please backport it to 3.3.1 as well. > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Fix For: 3.4.0 > > Attachments: YARN-10258-001.patch, YARN-10258-002.patch, > YARN-10258-003.patch, YARN-10258-005.patch, YARN-10258-006.patch, > YARN-10258-007.patch, YARN-10258-008.patch, YARN-10258-009.patch, > YARN-10258-010.patch, YARN-10258_004.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346112#comment-17346112 ] Bilwa S T commented on YARN-10258: -- Thanks [~gb.ana...@gmail.com] for the patch. Looks good to me. [~pbacsko] can you please commit this? > Add metrics for 'ApplicationsRunning' in NodeManager
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10725: - Attachment: YARN-10725-branch-3.3.v4.patch > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch, > YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, > YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, > image-2021-04-05-16-48-57-034.png, image-2021-04-05-16-50-55-238.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10755) Multithreaded loading Apps from zk statestore
[ https://issues.apache.org/jira/browse/YARN-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10755: Assignee: Bilwa S T > Multithreaded loading Apps from zk statestore > - > > Key: YARN-10755 > URL: https://issues.apache.org/jira/browse/YARN-10755 > Project: Hadoop YARN > Issue Type: Improvement > Environment: version: hadoop-2.8.5 >Reporter: chaosju >Assignee: Bilwa S T >Priority: Major > Attachments: image-2021-04-27-12-55-18-710.png > > > In the RM, we can get the list of applications to be read from the state store and > then divide the work of reading the data associated with each app across multiple > threads. > I think this is important for large clusters. > h2. Profile > Profiled with TestZKRMStateStorePerf > Params: -appSize 2 -appattemptsize 2 -hostPort localhost:2181 > Profile result: the loadRMAppState stage costs 5s. > Profile logs: > !image-2021-04-27-12-55-18-710.png! > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
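The approach described, fanning the per-application reads out over a worker pool and joining the results, can be sketched as below. Everything here is a hypothetical stand-in (there is no `loadApp` in the real ZKRMStateStore); it only illustrates dividing the read work across threads:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelAppLoader {

    // Hypothetical stand-in for reading one application's data from the
    // state store (in the real RM this would be a ZooKeeper znode read).
    static String loadApp(String appId) {
        return "state-of-" + appId;
    }

    // Divide the list of app ids across a fixed-size pool, gather results.
    static Map<String, String> loadAll(List<String> appIds, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<String>> pending = new HashMap<>();
            for (String id : appIds) {
                pending.put(id, pool.submit(() -> loadApp(id)));
            }
            Map<String, String> loaded = new HashMap<>();
            for (Map.Entry<String, Future<String>> e : pending.entrySet()) {
                loaded.put(e.getKey(), e.getValue().get()); // propagate failures
            }
            return loaded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAll(List.of("app_1", "app_2", "app_3"), 2));
    }
}
```

The design question the jira raises is only about the fan-out step; ordering of recovered apps does not matter, so a simple pool plus join is enough.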
[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM
[ https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340181#comment-17340181 ] Bilwa S T commented on YARN-9615: - [~pbacsko] No problem. I just want this to be merged before 3.3.1 release is done. Thanks > Add dispatcher metrics to RM > > > Key: YARN-9615 > URL: https://issues.apache.org/jira/browse/YARN-9615 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Qi Zhu >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9615.001.patch, YARN-9615.002.patch, > YARN-9615.003.patch, YARN-9615.004.patch, YARN-9615.005.patch, > YARN-9615.006.patch, YARN-9615.007.patch, YARN-9615.008.patch, > YARN-9615.009.patch, YARN-9615.010.patch, YARN-9615.011.patch, > YARN-9615.011.patch, YARN-9615.poc.patch, image-2021-03-04-10-35-10-626.png, > image-2021-03-04-10-36-12-441.png, screenshot-1.png > > > It'd be good to have counts/processing times for each event type in RM async > dispatcher and scheduler async dispatcher. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
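The goal of YARN-9615, per-event-type counts and processing times in the dispatcher, boils down to keeping a counter pair per event type and updating it around each dispatch. A rough sketch, assuming enum event types (class and method names here are illustrative, not YARN's actual metrics classes):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative per-event-type dispatcher metrics: a count and a total
// processing time per event type, updated around each event dispatch.
public class EventTypeMetrics {

    static final class Stat {
        final LongAdder count = new LongAdder();
        final LongAdder totalNanos = new LongAdder();
    }

    private final Map<Enum<?>, Stat> stats = new ConcurrentHashMap<>();

    // Wrap the dispatch of one event, recording its type and elapsed time.
    public void record(Enum<?> eventType, Runnable dispatch) {
        long start = System.nanoTime();
        try {
            dispatch.run();
        } finally {
            Stat s = stats.computeIfAbsent(eventType, t -> new Stat());
            s.count.increment();
            s.totalNanos.add(System.nanoTime() - start);
        }
    }

    public long count(Enum<?> eventType) {
        Stat s = stats.get(eventType);
        return s == null ? 0 : s.count.sum();
    }

    public long totalNanos(Enum<?> eventType) {
        Stat s = stats.get(eventType);
        return s == null ? 0 : s.totalNanos.sum();
    }
}
```

LongAdder keeps the hot dispatch path cheap under contention, which matters because the async dispatcher is a single chokepoint for the whole RM.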
[jira] [Commented] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340172#comment-17340172 ] Bilwa S T commented on YARN-10642: -- Hi [~pbacsko] YARN-8995 is merged to branch-3.1 so we need to backport it to branch-3.1 as well. > Race condition: AsyncDispatcher can get stuck by the changes introduced in > YARN-8995 > > > Key: YARN-10642 > URL: https://issues.apache.org/jira/browse/YARN-10642 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.2.1 >Reporter: zhengchenyu >Assignee: zhengchenyu >Priority: Critical > Fix For: 3.4.0, 3.3.1, 3.2.3 > > Attachments: MockForDeadLoop.java, YARN-10642-branch-3.2.001.patch, > YARN-10642-branch-3.2.002.patch, YARN-10642-branch-3.3.001.patch, > YARN-10642.001.patch, YARN-10642.002.patch, YARN-10642.003.patch, > YARN-10642.004.patch, YARN-10642.005.patch, deadloop.png, debugfornode.png, > put.png, take.png > > > In our cluster, the ResourceManager got stuck twice within twenty days, and the YARN client > could not submit applications. I captured jstack output the second time and found the > reason. > Analyzing all the jstacks, I found many threads stuck because they could not acquire > LinkedBlockingQueue.putLock. (Note: for brevity, the full > analysis is omitted.) > The reason is that one thread holds the putLock the whole time: > printEventQueueDetails calls forEachRemaining, which holds both the putLock and the > takeLock, so the AsyncDispatcher gets stuck. 
> {code} > Thread 6526 (IPC Server handler 454 on default port 8030): > State: RUNNABLE > Blocked count: 29988 > Waited count: 2035029 > Stack: > > java.util.concurrent.LinkedBlockingQueue$LBQSpliterator.forEachRemaining(LinkedBlockingQueue.java:926) > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.printEventQueueDetails(AsyncDispatcher.java:270) > > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:295) > > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.handleProgress(DefaultAMSProcessor.java:408) > > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:215) > > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92) > > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:432) > > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1040) > 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:958) > java.security.AccessController.doPrivileged(Native Method) > {code} > Analyzing LinkedBlockingQueue's source code, I found that forEachRemaining in > LinkedBlockingQueue.LBQSpliterator may get stuck when forEachRemaining and take() > are called from different threads. > YARN-8995 introduced the printEventQueueDetails method, whose > "eventQueue.stream().collect" call ends up invoking forEachRemaining. > Why? "put.png" shows how put("a") works and "take.png" shows how take() works. > Note in particular that a removed Node points to itself to help GC. > The key code is in forEachRemaining: LBQSpliterator uses > forEachRemaining to visit every Node, but after reading an item value from a Node it > releases the lock, and take() may be called at that moment. > The variable 'p' in forEachRemaining may then point to a Node that points to itself, > so forEachRemaining enters a dead loop. You can see this in "deadloop.png". > A simple unit test reproduces the problem when forEachRemaining runs more slowly than > take(); the unit test is MockForDeadLoop.java. > Debugging MockForDeadLoop.java, I saw a Node pointing to itself. You
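The root cause above is that streaming the live queue drives LBQSpliterator.forEachRemaining, which can chase a self-linked node while take() runs. One safe alternative, shown here as an illustration rather than the committed YARN-10642 fix, is to summarize over a toArray() snapshot: LinkedBlockingQueue.toArray() holds both queue locks for the whole copy, so it cannot observe a half-unlinked node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueDetails {

    // Count queued events by class over a toArray() snapshot.
    // toArray() takes both putLock and takeLock while copying, so unlike
    // streaming the live queue it cannot follow a node that a concurrent
    // take() has already unlinked (and self-linked for GC).
    static Map<Class<?>, Long> countByType(BlockingQueue<?> queue) {
        Map<Class<?>, Long> counts = new HashMap<>();
        for (Object event : queue.toArray()) {
            counts.merge(event.getClass(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        BlockingQueue<Object> q = new LinkedBlockingQueue<>();
        q.add("node-update");   // stand-ins for dispatcher events
        q.add("node-update");
        q.add(42);
        System.out.println(countByType(q));
    }
}
```

The trade-off is a momentary full lock of the queue plus an O(n) copy, which is acceptable for an occasional diagnostic print but not for the dispatch hot path.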
[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM
[ https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340015#comment-17340015 ] Bilwa S T commented on YARN-9615: - [~pbacsko] can you please backport it to branch-3.3 ? > Add dispatcher metrics to RM > > > Key: YARN-9615 > URL: https://issues.apache.org/jira/browse/YARN-9615 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Qi Zhu >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9615.001.patch, YARN-9615.002.patch, > YARN-9615.003.patch, YARN-9615.004.patch, YARN-9615.005.patch, > YARN-9615.006.patch, YARN-9615.007.patch, YARN-9615.008.patch, > YARN-9615.009.patch, YARN-9615.010.patch, YARN-9615.011.patch, > YARN-9615.011.patch, YARN-9615.poc.patch, image-2021-03-04-10-35-10-626.png, > image-2021-03-04-10-36-12-441.png, screenshot-1.png > > > It'd be good to have counts/processing times for each event type in RM async > dispatcher and scheduler async dispatcher. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10745) Change Log level from info to debug for few logs and remove unnecessary debuglog checks
[ https://issues.apache.org/jira/browse/YARN-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339557#comment-17339557 ] Bilwa S T commented on YARN-10745: -- +1 (Non-binding) on YARN-10745.004.patch [~brahmareddy] [~ebadger] can you please help review and commit? > Change Log level from info to debug for few logs and remove unnecessary > debuglog checks > --- > > Key: YARN-10745 > URL: https://issues.apache.org/jira/browse/YARN-10745 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Minor > Attachments: YARN-10745.001.patch, YARN-10745.002.patch, > YARN-10745.003.patch, YARN-10745.004.patch > > > Change the info log level to debug for a few logs so that the load on the > logger decreases in large clusters and performance improves. > Remove the unnecessary isDebugEnabled() checks for printing strings without > any string concatenation -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10745) Change Log level from info to debug for few logs and remove unnecessary debuglog checks
[ https://issues.apache.org/jira/browse/YARN-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335401#comment-17335401 ] Bilwa S T commented on YARN-10745: -- Hi [~dmmkr] Thanks for the patch. I have a few minor comments: * In ProportionalCapacityPreemptionPolicy.java the LOG.isDebugEnabled() check can be removed for the log below {quote} LOG.debug("Send to scheduler: in app={} " + "#containers-to-be-preemptionCandidates={}", appAttemptId, e.getValue().size()); {quote} * Why do we need the LOG.isDebugEnabled() check in AsyncDispatcher.java? A few suggestions: * In NodesListManager.java we can print the log below only if either of the sets is non-empty {quote} LOG.info("hostsReader include:\{" +StringUtils.join(",", hostsReader.getHosts()) +"} exclude:{" + StringUtils.join(",", hostsReader.getExcludedHosts()) + "}"); {quote} > Change Log level from info to debug for few logs and remove unnecessary > debuglog checks > --- > > Key: YARN-10745 > URL: https://issues.apache.org/jira/browse/YARN-10745 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Minor > Attachments: YARN-10745.001.patch > > > Change the info log level to debug for a few logs so that the load on the > logger decreases in large clusters and performance improves. > Remove the unnecessary isDebugEnabled() checks for printing strings without > any string concatenation -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
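The review comment's point is that with SLF4J-style parameterized logging there is nothing to guard: the `{}` template is only expanded after the level check inside the logger, so an outer isDebugEnabled() test is redundant unless computing an argument is itself expensive. A toy stand-in logger (not SLF4J itself) makes the control flow visible:

```java
// Minimal stand-in for an SLF4J-style logger, showing why an
// isDebugEnabled() guard is redundant for parameterized messages:
// {} substitution only happens after the level check passes.
public class DeferredLogDemo {

    public static boolean debugEnabled = false;
    public static int formatCalls = 0;   // how many times we expanded {}

    public static void debug(String template, Object... args) {
        if (!debugEnabled) {
            return;              // cheap level check; arguments never formatted
        }
        formatCalls++;
        String msg = template;
        for (Object a : args) {
            msg = msg.replaceFirst("\\{\\}", String.valueOf(a));
        }
        System.out.println(msg);
    }

    public static void main(String[] args) {
        // No guard needed: with debug disabled the template is never expanded.
        debug("Send to scheduler: in app={} #containers={}", "appattempt_1", 5);
        // A guard only pays off when computing an argument is expensive,
        // e.g. debug("queue={}", expensiveDump()): expensiveDump() runs
        // at the call site, before debug() is even entered.
    }
}
```

This is exactly why the jira removes guards only around "printing strings without any string concatenation": string concatenation (or an expensive argument expression) is evaluated before the logger's level check can skip it.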
[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for
[ https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10670: - Description: Preconditions: # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed # Set the below parameters in RM yarn-site.xml :: yarn.resourcemanager.opportunistic-container-allocation.enabled true # Set this in NM[s]yarn-site.xml ::: yarn.nodemanager.opportunistic-containers-max-queue-length 30 Test Steps: Job Command : : yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message {noformat} Application Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for Guaranateed container. {noformat} Expected Result: Distributed Shell Yarn Job should not fail. was: Preconditions: # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed # Set the below parameters in RM yarn-site.xml :: yarn.resourcemanager.opportunistic-container-allocation.enabled true # Set this in NM[s]yarn-site.xml ::: yarn.nodemanager.opportunistic-containers-max-queue-length 30 Test Steps: Job Command : : yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message {noformat} Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits. 
[2021-02-09 22:11:48.441]Container terminated before launch. {noformat} Expected Result: Distributed Shell Yarn Job should not fail. > YARN: Opportunistic Container : : In distributed shell job if containers are > killed then application is failed. But in this case as containers are killed > to make room for guaranteed containers which is not correct to fail an > application > > > Key: YARN-10670 > URL: https://issues.apache.org/jira/browse/YARN-10670 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.1.1 >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Major > > Preconditions: > # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed > # Set the below parameters in RM yarn-site.xml :: > yarn.resourcemanager.opportunistic-container-allocation.enabled > true > > # Set this in NM[s]yarn-site.xml ::: > yarn.nodemanager.opportunistic-containers-max-queue-length > 30 > > > Test Steps: > Job Command : : yarn > org.apache.hadoop.yarn.applications.distributedshell.Client jar > HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar > -shell_command sleep -shell_args 20 -num_containers 20 -container_type > OPPORTUNISTIC -promote_opportunistic_after_start > Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics > message > {noformat} > Application Failure: desired = 20, completed = 20, allocated = 20, failed = > 1, diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for > Guaranateed container. > {noformat} > Expected Result: Distributed Shell Yarn Job should not fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
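The flattened preconditions above correspond to yarn-site.xml entries like the following (property names and values are taken from the issue description):

```xml
<!-- ResourceManager yarn-site.xml -->
<property>
  <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
  <value>true</value>
</property>

<!-- NodeManager yarn-site.xml -->
<property>
  <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
  <value>30</value>
</property>
```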
[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for
[ https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10670: - Description: Preconditions: # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed # Set the below parameters in RM yarn-site.xml :: yarn.resourcemanager.opportunistic-container-allocation.enabled true # Set this in NM[s]yarn-site.xml ::: yarn.nodemanager.opportunistic-containers-max-queue-length 30 Test Steps: Job Command : : yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message {noformat} Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits. [2021-02-09 22:11:48.441]Container terminated before launch. {noformat} Expected Result: Distributed Shell Yarn Job should not fail. 
was: Preconditions: # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed # Set the below parameters in RM yarn-site.xml :: yarn.resourcemanager.opportunistic-container-allocation.enabled true # Set this in NM[s]yarn-site.xml ::: yarn.nodemanager.opportunistic-containers-max-queue-length 30 Test Steps: Job Command : : yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message {noformat} Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits. [2021-02-09 22:11:48.441]Container terminated before launch. {noformat} Expected Result: Distributed Shell Yarn Job should not fail. > YARN: Opportunistic Container : : In distributed shell job if containers are > killed then application is failed. 
But in this case as containers are killed > to make room for guaranteed containers which is not correct to fail an > application > > > Key: YARN-10670 > URL: https://issues.apache.org/jira/browse/YARN-10670 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.1.1 >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Major > > Preconditions: > # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed > # Set the below parameters in RM yarn-site.xml :: > yarn.resourcemanager.opportunistic-container-allocation.enabled > true > > # Set this in NM[s]yarn-site.xml ::: > yarn.nodemanager.opportunistic-containers-max-queue-length > 30 > > > Test Steps: > Job Command : : yarn > org.apache.hadoop.yarn.applications.distributedshell.Client jar > HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar > -shell_command sleep -shell_args 20 -num_containers 20 -container_type > OPPORTUNISTIC -promote_opportunistic_after_start > Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics > message > {noformat} > Attempt recovered after RM restartApplication Failure: desired = 20, > completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 > 22:11:48.440]Container De-queued to meet NM queuing limits. > [2021-02-09 22:11:48.441]Container terminated before launch. > {noformat} > Expected Result: Distributed Shell Yarn Job should not fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332275#comment-17332275 ] Bilwa S T commented on YARN-10691: -- cc [~epayne] [~jbrennan] > DominantResourceCalculator isInvalidDivisor should consider only countable > resource types > - > > Key: YARN-10691 > URL: https://issues.apache.org/jira/browse/YARN-10691 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10691.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330908#comment-17330908 ] Bilwa S T edited comment on YARN-10732 at 4/23/21, 4:51 PM: [~gandras] [~pbacsko] As part of YARN-10260, transitioning from the DRAINING to the RUNNING state was added so that a user who stopped a queue by mistake could start it again. With your patch the queue cannot be transitioned back to the RUNNING state. Can you please explain your use case in detail? was (Author: bilwast): [~gandras] [~pbacsko] As part of YARN-10260, transitioning from the DRAINING to the RUNNING state was added so that a user who stopped a queue by mistake could start it again. With your patch the queue cannot be transitioned back to the RUNNING state. > Disallow restarting a queue while it is in DRAINING state on CS > reinitialization > > > Key: YARN-10732 > URL: https://issues.apache.org/jira/browse/YARN-10732 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10732.001.patch > > > CSConfigValidator#validateQueueHierarchy does not check a state where the old > queue is in DRAINING state but the new queue state is RUNNING. A user should > wait until the queue is fully stopped. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330908#comment-17330908 ] Bilwa S T commented on YARN-10732: -- [~gandras] [~pbacsko] As part of YARN-10260, transitioning from the DRAINING to the RUNNING state was added so that a user who stopped a queue by mistake could start it again. With your patch the queue cannot be transitioned back to the RUNNING state. > Disallow restarting a queue while it is in DRAINING state on CS > reinitialization > > > Key: YARN-10732 > URL: https://issues.apache.org/jira/browse/YARN-10732 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10732.001.patch > > > CSConfigValidator#validateQueueHierarchy does not check a state where the old > queue is in DRAINING state but the new queue state is RUNNING. A user should > wait until the queue is fully stopped. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10691: - Attachment: YARN-10691.001.patch > DominantResourceCalculator isInvalidDivisor should consider only countable > resource types > - > > Key: YARN-10691 > URL: https://issues.apache.org/jira/browse/YARN-10691 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10691.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10469) The accuracy of the percentage values in the same chart on the YARN 'Cluster OverView' page are inconsistent
[ https://issues.apache.org/jira/browse/YARN-10469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325828#comment-17325828 ] Bilwa S T commented on YARN-10469: -- Hi [~tangzhankun] The PR for this jira is merged. Can we resolve this? > The accuracy of the percentage values in the same chart on the YARN 'Cluster > OverView' page are inconsistent > > > Key: YARN-10469 > URL: https://issues.apache.org/jira/browse/YARN-10469 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn-ui-v2 >Affects Versions: 3.1.1 >Reporter: akiyamaneko >Priority: Minor > Fix For: 3.3.0 > > Attachments: reproduce.png > > > The accuracy of the percentage values in the same chart on the YARN 'Cluster > OverView' page are inconsistent, as shown in the screenshot in the attachment. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10725: - Attachment: YARN-10725-branch-3.3.v2.patch > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch, > YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, > image-2021-04-05-16-48-57-034.png, image-2021-04-05-16-50-55-238.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10725: - Attachment: YARN-10725-branch-3.3.patch > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch, YARN-10725-branch-3.3.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314662#comment-17314662 ] Bilwa S T commented on YARN-10725: -- Hi [~brahmareddy] As discussed, I have attached a patch to backport this to branch-3.3. Please check. Thanks > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10725: - Attachment: YARN-10120-branch-3.3.patch > Backport YARN-10120 to branch-3.3 > - > > Key: YARN-10725 > URL: https://issues.apache.org/jira/browse/YARN-10725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10120-branch-3.3.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312088#comment-17312088 ] Bilwa S T commented on YARN-10120: -- [~brahmareddy] I have raised YARN-10725 to backport to branch-3.3 > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled > -- > > Key: YARN-10120 > URL: https://issues.apache.org/jira/browse/YARN-10120 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Critical > Fix For: 3.4.0 > > Attachments: YARN-10120-YARN-7402.patch, > YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, > YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, > YARN-10120.001.patch, YARN-10120.002.patch > > > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled. > yarn.router.webapp.https.address =router ip:8091 > {noformat} > 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error > handling URI: /cluster/apps > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at > com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at > com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583) > at > 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
[jira] [Created] (YARN-10725) Backport YARN-10120 to branch-3.3
Bilwa S T created YARN-10725: Summary: Backport YARN-10120 to branch-3.3 Key: YARN-10725 URL: https://issues.apache.org/jira/browse/YARN-10725 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309378#comment-17309378 ] Bilwa S T commented on YARN-9606: - [~brahmareddy] This can be backported once YARN-10120 is merged to branch-3.3 > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient > -- > > Key: YARN-9606 > URL: https://issues.apache.org/jira/browse/YARN-9606 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9606-001.patch, YARN-9606-002.patch, > YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, > YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch > > > Yarn logs fails for running containers > > > {quote} > > > > Unable to fetch log files list > Exception in thread "main" java.io.IOException: > com.sun.jersey.api.client.ClientHandlerException: > javax.net.ssl.SSLHandshakeException: Error while authenticating with > endpoint: > [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs] > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052) > at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399) > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306824#comment-17306824 ] Bilwa S T commented on YARN-10697: -- [~Jim_Brennan] I have changed method name. Please check updated patch. Thanks > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, YARN-10697.002.patch, > YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: YARN-10697.003.patch > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, YARN-10697.002.patch, > YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305380#comment-17305380 ] Bilwa S T commented on YARN-10697: -- Hi [~Jim_Brennan] I have attached .002 patch with latest changes. Please review. Thanks > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, YARN-10697.002.patch, > image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: YARN-10697.002.patch > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, YARN-10697.002.patch, > image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: (was: YARN-10697.002.patch) > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: YARN-10697.002.patch > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, YARN-10697.002.patch, > image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304629#comment-17304629 ] Bilwa S T commented on YARN-10697: -- Thanks [~Jim_Brennan] and [~jhung] for your comments. I basically added the changes in Resource#toString so that it's easier for the user to read. I agree it's not correct to add it there, as it's called from many other places. So can we introduce a new method in Resource.java which can print it in MB|GB|TB? > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects memory in MB, whereas MetricsOverviewTable > passes resources in bytes. Also, we should display memory in GB for better > readability for the user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
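A possible shape for the helper proposed above, as a purely illustrative sketch: the class name, method name, and rounding are assumptions for this example, not the actual Hadoop Resource API.

```java
import java.util.Locale;

// Illustrative sketch only: format a memory value held in MB as a
// human-readable MB|GB|TB string. Names here are hypothetical, not
// the actual method added to Resource.java.
public class MemoryFormat {
  private static final long KILO = 1024;

  public static String toReadableMemory(long memoryMB) {
    if (memoryMB >= KILO * KILO) {
      return String.format(Locale.ROOT, "%.1f TB", memoryMB / (double) (KILO * KILO));
    } else if (memoryMB >= KILO) {
      return String.format(Locale.ROOT, "%.1f GB", memoryMB / (double) KILO);
    }
    return memoryMB + " MB";
  }

  public static void main(String[] args) {
    System.out.println(toReadableMemory(512));     // 512 MB
    System.out.println(toReadableMemory(2048));    // 2.0 GB
    System.out.println(toReadableMemory(3145728)); // 3.0 TB
  }
}
```

Locale.ROOT keeps the decimal separator stable regardless of the JVM's default locale, which matters for strings shown in a web UI.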
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303123#comment-17303123 ] Bilwa S T commented on YARN-10697: -- [~epayne] [~jbrennan] can you please take a look at this? > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: YARN-10697.001.patch > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303120#comment-17303120 ] Bilwa S T commented on YARN-10697: -- In YARN-10251, multiplying by BYTES_IN_MB was removed in the if branch, but the else branch was missed. !image-2021-03-17-11-30-57-216.png! > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects memory in MB, whereas MetricsOverviewTable > passes resources in bytes. Also, we should display memory in GB for better > readability for the user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
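The unit mismatch described above can be reduced to a one-line illustration. This is a sketch with made-up values, not the actual MetricsOverviewTable code.

```java
// Sketch of the bug: Resource.newInstance expects memory in MB, so a
// branch that forgets to divide by BYTES_IN_MB makes the UI display a
// value 1024*1024 times too large. Values below are illustrative.
public class UnitMismatch {
  static final long BYTES_IN_MB = 1024L * 1024L;

  public static void main(String[] args) {
    long availableBytes = 8L * 1024 * 1024 * 1024; // 8 GB, reported in bytes

    long wrongMB = availableBytes;                 // bytes passed through unchanged
    long rightMB = availableBytes / BYTES_IN_MB;   // what newInstance expects

    System.out.println(wrongMB); // 8589934592
    System.out.println(rightMB); // 8192
  }
}
```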
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10697: - Attachment: image-2021-03-17-11-30-57-216.png > Resources are displayed in bytes in UI for schedulers other than capacity > - > > Key: YARN-10697 > URL: https://issues.apache.org/jira/browse/YARN-10697 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: image-2021-03-17-11-30-57-216.png > > > Resources.newInstance expects MB as memory whereas in MetricsOverviewTable > passes resources in bytes . Also we should display memory in GB for better > readability for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
Bilwa S T created YARN-10697: Summary: Resources are displayed in bytes in UI for schedulers other than capacity Key: YARN-10697 URL: https://issues.apache.org/jira/browse/YARN-10697 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T Resources.newInstance expects memory in MB, whereas MetricsOverviewTable passes resources in bytes. Also, we should display memory in GB for better readability for the user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10691: - Summary: DominantResourceCalculator isInvalidDivisor should consider only countable resource types (was: DominantResourceCalculator divide and ratio methods should consider only countable resource types) > DominantResourceCalculator isInvalidDivisor should consider only countable > resource types > - > > Key: YARN-10691 > URL: https://issues.apache.org/jira/browse/YARN-10691 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301434#comment-17301434 ] Bilwa S T commented on YARN-10588: -- Thanks [~Jim_Brennan] and [~epayne] for review comments. I have raised YARN-10691 to handle above issue. I think this one can be merged. > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch, YARN-10588.004.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
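The direction taken in YARN-10691 — only countable resource types should make a divisor invalid — can be sketched as follows. This is a simplified model using a Map and Set in place of the real Resource and DominantResourceCalculator classes, so the names and shapes here are assumptions, not the actual patch.

```java
import java.util.Map;
import java.util.Set;

// Simplified sketch: a zero value invalidates the divisor only when the
// resource type is countable, so a cluster reporting 0 yarn.io/gpu no
// longer forces the queue/cluster usage percentages down to zero.
public class DivisorSketch {
  public static boolean isInvalidDivisor(Map<String, Long> cluster,
                                         Set<String> countableTypes) {
    for (Map.Entry<String, Long> e : cluster.entrySet()) {
      if (countableTypes.contains(e.getKey()) && e.getValue() == 0L) {
        return true; // dividing by a zero countable resource is invalid
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, Long> cluster =
        Map.of("memory-mb", 8192L, "vcores", 8L, "yarn.io/gpu", 0L);
    // Old behaviour: every type counts, so the 0-GPU entry poisons the check.
    System.out.println(isInvalidDivisor(cluster, cluster.keySet()));            // true
    // Sketched fix: only countable types are considered.
    System.out.println(isInvalidDivisor(cluster, Set.of("memory-mb", "vcores"))); // false
  }
}
```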
[jira] [Created] (YARN-10691) DominantResourceCalculator divide and ratio methods should consider only countable resource types
Bilwa S T created YARN-10691: Summary: DominantResourceCalculator divide and ratio methods should consider only countable resource types Key: YARN-10691 URL: https://issues.apache.org/jira/browse/YARN-10691 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298009#comment-17298009 ] Bilwa S T commented on YARN-10588: -- Hi [~Jim_Brennan] can you please take a look at this Jira when you get time? Thanks > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch, YARN-10588.004.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297905#comment-17297905 ] Bilwa S T commented on YARN-10120: -- Hi [~brahmareddy] looks like this didn't get merged to branch-3.3 . Can you please backport it ? Thanks > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled > -- > > Key: YARN-10120 > URL: https://issues.apache.org/jira/browse/YARN-10120 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Critical > Fix For: 3.3.0, 3.4.0 > > Attachments: YARN-10120-YARN-7402.patch, > YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, > YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, > YARN-10120.001.patch, YARN-10120.002.patch > > > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled. > yarn.router.webapp.https.address =router ip:8091 > {noformat} > 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error > handling URI: /cluster/apps > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at > com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at > com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583) > at > 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) > at > org.eclipse.jetty.server.handler.Con
[jira] [Assigned] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room fo
[ https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10670: Assignee: Bilwa S T > YARN: Opportunistic Container : : In distributed shell job if containers are > killed then application is failed. But in this case as containers are killed > to make room for guaranteed containers which is not correct to fail an > application > > > Key: YARN-10670 > URL: https://issues.apache.org/jira/browse/YARN-10670 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.1.1 >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Major > > Preconditions: > # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed > # Set the below parameters in RM:: > yarn.resourcemanager.opportunistic-container-allocation.enabled > true > > # Set this in NM[s]: > yarn.nodemanager.opportunistic-containers-max-queue-length > 30 > > > Test Steps: > Job Command : : yarn > org.apache.hadoop.yarn.applications.distributedshell.Client -jar > HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar > -shell_command sleep -shell_args 20 -num_containers 20 -container_type > OPPORTUNISTIC > Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics > message > {noformat} > Attempt recovered after RM restartApplication Failure: desired = 20, > completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 > 22:11:48.440]Container De-queued to meet NM queuing limits. > [2021-02-09 22:11:48.441]Container terminated before launch. > {noformat} > Expected Result: Distributed Shell Yarn Job should not fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10668) [DS] Disable distributed scheduling when client doesn't configure scheduler address as amrmproxy address
Bilwa S T created YARN-10668: Summary: [DS] Disable distributed scheduling when client doesn't configure scheduler address as amrmproxy address Key: YARN-10668 URL: https://issues.apache.org/jira/browse/YARN-10668 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T In a distributed scheduling setup, if the client submits an application with a normal client configuration (i.e. the scheduler address is not the same as the AMRMProxy address), the application fails with an Invalid AMRMToken error. So I think distributed scheduling should be disabled and the job should be executed with opportunistic containers enabled. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
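For context, the setup this issue assumes looks roughly like the following yarn-site.xml fragment; the property names come from the standard YARN opportunistic-container/distributed-scheduling configuration, and the values are illustrative:

```xml
<!-- ResourceManager: allow opportunistic container allocation -->
<property>
  <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
  <value>true</value>
</property>

<!-- NodeManager: run the AMRMProxy and enable distributed scheduling through it -->
<property>
  <name>yarn.nodemanager.amrmproxy.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.distributed-scheduling.enabled</name>
  <value>true</value>
</property>
```

With this setup, the client's scheduler address must point at the AMRMProxy; the report above describes what happens when it does not.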
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294998#comment-17294998 ] Bilwa S T commented on YARN-10588: -- [~epayne] I have updated the patch. Please take a look at it. Thanks > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch, YARN-10588.004.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10588: - Attachment: YARN-10588.004.patch > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch, YARN-10588.004.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10667) The current logic only sets the subdirectory of nm-aux-services to 700, but does not set nm-aux-services dir.
[ https://issues.apache.org/jira/browse/YARN-10667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10667: Assignee: Bilwa S T > The current logic only sets the subdirectory of nm-aux-services to 700, but > does not set nm-aux-services dir. > -- > > Key: YARN-10667 > URL: https://issues.apache.org/jira/browse/YARN-10667 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Major > Attachments: Permission 755.PNG > > > Current code logic only sets the subdirectory of nm-aux-services to 700, but > does not set nm-aux-services dir. > The permissions of some files and directories in the yarn deployment node are > 755. > !Permission 755.PNG! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9017) PlacementRule order is not maintained in CS
[ https://issues.apache.org/jira/browse/YARN-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286510#comment-17286510 ] Bilwa S T commented on YARN-9017: - [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > PlacementRule order is not maintained in CS > --- > > Key: YARN-9017 > URL: https://issues.apache.org/jira/browse/YARN-9017 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9017.001.patch, YARN-9017.002.patch, > YARN-9017.003.patch > > > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
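The ordering loss described above comes down to collection choice: if the configured rule names from yarn.scheduler.queue-placement-rules are deduplicated through an unordered set, the configured order is not preserved. A small sketch with generic Java collections (not the CapacityScheduler code; the rule names are made up) shows why an insertion-ordered set is needed:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch only: illustrates why deduplicating placement rules via an
// unordered set can scramble the configured order, while a
// LinkedHashSet keeps it.
class RuleOrderSketch {

    static List<String> dedup(List<String> configured, Set<String> target) {
        target.addAll(configured);
        // Always ensure the user-group rule is present, as the quoted
        // CapacityScheduler#updatePlacementRules code does.
        target.add("user-group");
        return List.copyOf(target);
    }

    public static void main(String[] args) {
        List<String> configured =
            Arrays.asList("app-name", "user-group", "primary-group");
        // HashSet iteration order is unspecified; configured order may be lost.
        System.out.println(dedup(configured, new HashSet<>()));
        // LinkedHashSet preserves insertion order deterministically.
        System.out.println(dedup(configured, new LinkedHashSet<>()));
        // prints [app-name, user-group, primary-group]
    }
}
```

Re-adding an element already present in a LinkedHashSet does not move it, so the "add if absent" step keeps the user's configured position for the user-group rule instead of forcing it to the end.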
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286507#comment-17286507 ] Bilwa S T commented on YARN-9606: - [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient > -- > > Key: YARN-9606 > URL: https://issues.apache.org/jira/browse/YARN-9606 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9606-001.patch, YARN-9606-002.patch, > YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, > YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch > > > Yarn logs fails for running containers > > > {quote} > > > > Unable to fetch log files list > Exception in thread "main" java.io.IOException: > com.sun.jersey.api.client.ClientHandlerException: > javax.net.ssl.SSLHandshakeException: Error while authenticating with > endpoint: > [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs] > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052) > at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399) > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9301) Too many InvalidStateTransitionException with SLS
[ https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286509#comment-17286509 ] Bilwa S T commented on YARN-9301: - [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > Too many InvalidStateTransitionException with SLS > - > > Key: YARN-9301 > URL: https://issues.apache.org/jira/browse/YARN-9301 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Major > Labels: simulator > Fix For: 3.4.0 > > Attachments: YARN-9301-001.patch, YARN-9301.002.patch > > > Too many InvalidStateTransistionExcetion > {noformat} > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event > at current state > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > LAUNCHED at RUNNING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483) > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:745) > 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED > on container container_1550059705491_0067_01_01 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value
[ https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286506#comment-17286506 ] Bilwa S T commented on YARN-8942: - [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > PriorityBasedRouterPolicy throws exception if all sub-cluster weights have > negative value > - > > Key: YARN-8942 > URL: https://issues.apache.org/jira/browse/YARN-8942 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Akshay Agarwal >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.4.0 > > Attachments: YARN-8942.001.patch, YARN-8942.002.patch > > > In *PriorityBasedRouterPolicy* if all sub-cluster weights are *set to > negative values* it is throwing exception while running a job. > Ideally it should handle the negative priority as well according to the home > sub cluster selection process of the policy. > *Exception Details:* > {code:java} > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable > to insert the ApplicationId application_1540356760422_0015 into the > FederationStateStore > at > org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56) > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > Caused by: > org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException: > Missing SubCluster Id information. Please try again by specifying Subcluster > Id information. > at > org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247) > at > org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160) > at > org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65) > at > org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159) > at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source) > at > org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402) > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413) > ... 11 more > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) ---
[jira] [Commented] (YARN-10359) Log container report only if list is not empty
[ https://issues.apache.org/jira/browse/YARN-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286504#comment-17286504 ] Bilwa S T commented on YARN-10359: -- [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > Log container report only if list is not empty > -- > > Key: YARN-10359 > URL: https://issues.apache.org/jira/browse/YARN-10359 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.4.0 > > Attachments: YARN-10359.001.patch, YARN-10359.002.patch > > > In NodeStatusUpdaterImpl print log only if containerReports list is not empty > {code:java} > if (containerReports != null) { > LOG.info("Registering with RM using containers :" + containerReports); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
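The one-line fix proposed above is to extend the null check with an emptiness check. A hedged sketch of the intended guard, using plain strings in place of the NodeManager's container status objects and a boolean return in place of the LOG.info call:

```java
import java.util.Collections;
import java.util.List;

// Sketch of the guard proposed above; not the NodeStatusUpdaterImpl code.
class ReportGuardSketch {

    static boolean shouldLogContainerReports(List<String> containerReports) {
        // Log only when there is actually something to report.
        return containerReports != null && !containerReports.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(shouldLogContainerReports(null));                    // false
        System.out.println(shouldLogContainerReports(Collections.emptyList())); // false
        System.out.println(shouldLogContainerReports(List.of("container_1")));  // true
    }
}
```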
[jira] [Commented] (YARN-10364) Absolute Resource [memory=0] is considered as Percentage config type
[ https://issues.apache.org/jira/browse/YARN-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286503#comment-17286503 ] Bilwa S T commented on YARN-10364: -- [~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks > Absolute Resource [memory=0] is considered as Percentage config type > > > Key: YARN-10364 > URL: https://issues.apache.org/jira/browse/YARN-10364 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Prabhu Joseph >Assignee: Bilwa S T >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10364.001.patch, YARN-10364.002.patch, > YARN-10364.003.patch > > > Absolute Resource [memory=0] is considered as Percentage config type. This > causes failure while converting queues from Percentage to Absolute Resources > automatically. > *Repro:* > 1. Queue A = 100% and child queues Queue A.B = 0%, A.C=100% > 2. While converting above to absolute resource automatically, capacity of > queue A = [memory=], A.B = [memory=0] > This fails with below as A is considered as Absolute Resource whereas B is > considered as Percentage config type. > {code} > 2020-07-23 09:36:40,499 WARN > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: > CapacityScheduler configuration validation failed:java.io.IOException: Failed > to re-init queues : Parent queue 'root.A' and child queue 'root.A.B' should > use either percentage based capacityconfiguration or absolute resource > together for label: > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
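The point of the issue above is that "[memory=0]" is absolute-resource *syntax* and should be classified by its shape, not by whether its values happen to be non-zero. A hypothetical syntax-based check (a sketch of the idea, not the CapacityScheduler's actual detection code) would classify it correctly:

```java
import java.util.regex.Pattern;

// Sketch only: classify a capacity string by its bracketed shape
// rather than by its numeric value.
class CapacityTypeSketch {

    private static final Pattern ABSOLUTE = Pattern.compile("^\\[[^\\]]*\\]$");

    static String configType(String capacity) {
        return ABSOLUTE.matcher(capacity.trim()).matches()
            ? "absolute" : "percentage";
    }

    public static void main(String[] args) {
        System.out.println(configType("[memory=8192,vcores=4]")); // absolute
        System.out.println(configType("[memory=0]"));             // absolute, despite the 0
        System.out.println(configType("100"));                    // percentage
    }
}
```

With value-based detection, the parent "[memory=...]" and the zero-valued child end up in different config types, which is exactly the "should use either percentage based capacity configuration or absolute resource together" failure quoted above.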
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286446#comment-17286446 ] Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM: - [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think? {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} was (Author: bilwast): [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think? {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) 
* 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286446#comment-17286446 ] Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM: - [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think? {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} was (Author: bilwast): [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would ** solve problem. What do you think?** {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, 
cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286446#comment-17286446 ] Bilwa S T commented on YARN-10588: -- [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would ** solve problem. What do you think?** {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10634) The config parameter "mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting Opportunistic containers in YARN job
[ https://issues.apache.org/jira/browse/YARN-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10634: Assignee: Bilwa S T > The config parameter "mapreduce.job.num-opportunistic-maps-percent" is > confusing when requesting Opportunistic containers in YARN job > - > > Key: YARN-10634 > URL: https://issues.apache.org/jira/browse/YARN-10634 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Minor > > Execute the below job by Passing this config > -Dmapreduce.job.num-opportunistic-maps-percent ,which actually represents the > number of containers to be launched as Opportunistic, not in % of the total > mappers requested , i think this configuration name should be modified > accordingly and also {color:#de350b}the same gets printed in AM logs{color} > Job Command: hadoop jar > HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar > pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" > 20 99 > In AM logs this message is displayed. it should be {color:#de350b}20 , not > 20% {color}? > “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the > mappers{color} will be scheduled using OPPORTUNISTIC containers | > RMContainerAllocator.java:257” > Job Command: hadoop jar > HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar > pi > {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 20 > 99 > In AM logs this message is displayed. It should be {color:#de350b}100, not > 100%{color} ? 
> 2021-02-10 20:28:16,016 | INFO | main | {color:#de350b}100% of the > mapper{color}s will be scheduled using OPPORTUNISTIC containers | > RMContainerAllocator.java:257
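The confusion reported above is between two readings of the same value. For the pi example (20 map tasks requested, value 20), the two interpretations diverge sharply; the arithmetic below is a hypothetical illustration of both readings, not the RMContainerAllocator code:

```java
// Sketch of the two readings of mapreduce.job.num-opportunistic-maps-percent
// discussed above. Hypothetical arithmetic for illustration only.
class OpportunisticSketch {

    // Reading 1: value is a percentage of the requested mappers.
    static int asPercent(int totalMaps, int value) {
        return (int) Math.round(totalMaps * (value / 100.0));
    }

    // Reading 2: value is an absolute count of mappers (capped at the total),
    // which is what the report says the parameter actually means.
    static int asCount(int totalMaps, int value) {
        return Math.min(totalMaps, value);
    }

    public static void main(String[] args) {
        int totalMaps = 20; // pi example: 20 map tasks requested
        System.out.println(asPercent(totalMaps, 20)); // 4  (percentage reading)
        System.out.println(asCount(totalMaps, 20));   // 20 (count reading)
    }
}
```

Because the AM log message uses the percentage wording while the value behaves as a count, renaming the parameter (or fixing the log) removes the ambiguity.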
[jira] [Commented] (YARN-8047) RMWebApp make external class pluggable
[ https://issues.apache.org/jira/browse/YARN-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286357#comment-17286357 ] Bilwa S T commented on YARN-8047: - Hi [~brahma] can you please cherry-pick this Jira to 3.3.1 ? Thanks > RMWebApp make external class pluggable > -- > > Key: YARN-8047 > URL: https://issues.apache.org/jira/browse/YARN-8047 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.4.0 > > Attachments: YARN-8047-001.patch, YARN-8047-002.patch, > YARN-8047-003.patch, YARN-8047.004.patch, YARN-8047.005.patch, > YARN-8047.006.patch > > > JIra should make sure we should be able to plugin webservices and web pages > of scheduler in Resourcemanager > * RMWebApp allow to bind external classes > * RMController allow to plugin scheduler classes -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285697#comment-17285697 ] Bilwa S T commented on YARN-10258: -- Thank you [~gb.ana...@gmail.com] for your contribution. Patch LGTM. There are a few checkstyle issues; please fix. Resubmitting the patch to trigger the build again. > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch, YARN-10258-002.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Attachment: YARN-10258-002.patch > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch, YARN-10258-002.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Target Version/s: (was: 3.1.3) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Fix Version/s: (was: 3.1.3) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Issue Comment Deleted] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Comment: was deleted (was: Thank you [~gb.ana...@gmail.com] for working on this. Looks there are some checkstyle issues. other than that patch LGTM) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Fix For: 3.1.3 > > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285691#comment-17285691 ] Bilwa S T commented on YARN-10258: -- Thank you [~gb.ana...@gmail.com] for working on this. Looks like there are some checkstyle issues; other than that, the patch LGTM. > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Fix For: 3.1.3 > > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers.
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282988#comment-17282988 ] Bilwa S T edited comment on YARN-10588 at 2/12/21, 8:23 AM: [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match logic of *DominantResourceCalculator#divide* is nothing but returning true only if all resource value is *0*. We already have a method called *DominantResourceCalculator#isAllInvalidDivisor* which will return true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor. Correct me if i am wrong was (Author: bilwast): [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match logic of *DominantResourceCalculator#divide* is nothing but returning true only if all resource value is *0*. We already have a method called *DominantResourceCalculator#isAllInvalidDivisor* which will return true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor. 
> Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure below property in resource-types.xml > {code:java} > > yarn.resource-types > yarn.io/gpu > {code} > Submit a job > In UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because in SchedulerApplicationAttempt has below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282988#comment-17282988 ] Bilwa S T commented on YARN-10588: -- [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of *DominantResourceCalculator#divide* amounts to returning true only if all resource values are *0*. We already have a method, *DominantResourceCalculator#isAllInvalidDivisor*, which returns true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor.
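To make the suggested change concrete, here is a minimal sketch (assumed semantics as described in this issue, not the actual Hadoop source) of the difference between the two checks, modelling a Resource as a plain long[] of resource values:

```java
// Simplified sketch of the two divisor checks discussed above, modelling a
// Resource as a long[] of values, e.g. {memory, vcores, gpu}.
public class DivisorChecks {

    // Behaves like isInvalidDivisor as described in this issue: true if ANY
    // resource value is zero. With a zero GPU count, the whole cluster is
    // treated as an invalid divisor and the WebUI percentages stay at 0.
    static boolean isInvalidDivisor(long[] r) {
        for (long v : r) {
            if (v == 0) {
                return true;
            }
        }
        return false;
    }

    // Behaves like isAllInvalidDivisor: true only if ALL resource values are
    // zero, which matches what divide() actually needs to guard against.
    static boolean isAllInvalidDivisor(long[] r) {
        for (long v : r) {
            if (v != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] cluster = {8192, 8, 0}; // memory MB, vcores, gpu -- gpu is 0
        System.out.println(isInvalidDivisor(cluster));    // true: short-circuits the % calculation
        System.out.println(isAllInvalidDivisor(cluster)); // false: calculation would proceed
    }
}
```

On a cluster whose GPU count is zero, the first check rejects the divisor while the second accepts it, which is exactly why swapping the call fixes the zero percentages.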
[jira] [Assigned] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-9927: --- Assignee: Bilwa S T
> RM multi-thread event processing mechanism
> -------------------------------------------
>
>                 Key: YARN-9927
>                 URL: https://issues.apache.org/jira/browse/YARN-9927
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 3.0.0, 2.9.2
>            Reporter: hcarrot
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: RM multi-thread event processing mechanism.pdf, YARN-9927.001.patch
>
> Recently, we have observed serious event blocking in the RM event dispatcher queue. After analyzing RM event monitoring data and RM event processing logic, we found that:
> 1) Environment: a cluster with thousands of nodes.
> 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler.
> 3) Meanwhile, RM event processing runs in single-threaded mode, which results in low headroom for the RM event scheduler and thus limits RM performance.
> So we propose an RM multi-thread event processing mechanism to improve RM performance.
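For illustration only (the attached design PDF and patch define the actual proposal): one common way to multi-thread an event dispatcher while preserving per-node event ordering is to hash each event's key, e.g. the node ID of an RMNodeStatusEvent, to a fixed single-threaded worker. A hedged sketch, where ShardedDispatcher and its methods are hypothetical names:

```java
// Illustrative sketch (hypothetical class, not the YARN-9927 patch): spread
// event handling across N single-threaded executors, sharding by a key such
// as the node ID, so events for the same node keep their order while
// different nodes are processed in parallel.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShardedDispatcher {
    private final ExecutorService[] workers;

    public ShardedDispatcher(int nThreads) {
        workers = new ExecutorService[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Same key -> same single-threaded executor, so per-key ordering holds.
    public void dispatch(String key, Runnable handler) {
        int shard = Math.floorMod(key.hashCode(), workers.length);
        workers[shard].submit(handler);
    }

    public void shutdownAndWait() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        for (ExecutorService w : workers) {
            try {
                w.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The trade-off of this sharding design is that a single hot node still serializes onto one thread, but it avoids the locking and reordering issues a shared work-stealing pool would introduce for state-machine events.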
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282328#comment-17282328 ] Bilwa S T commented on YARN-10588: -- Hi [~epayne]. I added the change in FiCaSchedulerApp.java because the same issue can occur there, i.e. the cluster and queue resource percentages will not be calculated if one of the resources is zero. I added an instanceof check because that method is applicable only to CapacityScheduler. Many test cases were failing once I removed the DominantResourceCalculator.isInvalidDivisor() check, as those test cases had configured FifoScheduler.
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282328#comment-17282328 ] Bilwa S T edited comment on YARN-10588 at 2/10/21, 9:19 AM: Thanks [~epayne] [~Jim_Brennan] for taking a look at this issue. I added the change in FiCaSchedulerApp.java because the same issue can occur there, i.e. the cluster and queue resource percentages will not be calculated if one of the resources is zero. I added an instanceof check because that method is applicable only to CapacityScheduler. Many test cases were failing once I removed the DominantResourceCalculator.isInvalidDivisor() check, as those test cases had configured FifoScheduler.