Unable to stop containers when keep-containers-across-application-attempt is enabled
Hi,

I am trying to keep the containers from previous attempts alive across application attempts, so that the processing containers stay intact when the AM restarts. I am doing this with the keep-containers-across-application-attempt flag.

For my use case, I still need to stop a processing container from the previous attempt when certain metadata changes (e.g. work assignments). When the new AM tries to stop the existing container, NMClientAsync throws the following exception:

org.apache.hadoop.yarn.exceptions.YarnException: Container container_1606797336059_0004_01_02 is neither started nor scheduled to start
    at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:45) ~[hadoop-yarn-common-2.7.1.jar:?]
    at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl.stopContainerAsync(NMClientAsyncImpl.java:235) ~[hadoop-yarn-client-2.7.1.jar:?

I am guessing the NMClient is unaware of this container because it did not start it in the first place. Fetching the container status through the NMClient succeeds and returns running. My guess is that the list of containers NMClient tracks does not include the containers that belonged to previous attempts, and hence there is no way to stop them.

Any help is appreciated.

Thanks,
Bharath
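The guess in the message above can be illustrated with a toy model. This is not Hadoop code: `ToyNodeManager` and `ToyAsyncClient` are hypothetical names, and the model only assumes, as the poster does, that the async client keeps a local record of the containers it started itself and refuses to stop anything outside that record, while status queries are answered by the node.

```python
class ToyNodeManager:
    """Stands in for the node: it knows every container that is actually running."""

    def __init__(self, running):
        self.running = set(running)

    def status(self, container_id):
        return "RUNNING" if container_id in self.running else "UNKNOWN"

    def stop(self, container_id):
        self.running.discard(container_id)


class ToyAsyncClient:
    """Hypothetical client that only tracks containers it started itself."""

    def __init__(self, node):
        self.node = node
        self.tracked = set()

    def start(self, container_id):
        self.node.running.add(container_id)
        self.tracked.add(container_id)

    def status(self, container_id):
        # Status is answered by the node, so untracked containers are visible.
        return self.node.status(container_id)

    def stop(self, container_id):
        # Stop is gated on local bookkeeping, not on the node's view.
        if container_id not in self.tracked:
            raise RuntimeError(
                f"Container {container_id} is neither started nor scheduled to start"
            )
        self.node.stop(container_id)
        self.tracked.discard(container_id)


# A container left over from the previous attempt is running on the node,
# but the new attempt's client has never heard of it.
node = ToyNodeManager(["container_from_previous_attempt"])
new_attempt_client = ToyAsyncClient(node)
print(new_attempt_client.status("container_from_previous_attempt"))
```

In this model the status call succeeds while a `stop` on the same container raises, which matches the symptoms described. Whether the real client behaves this way in 2.7.1 would need to be confirmed against the NMClientAsyncImpl source.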
Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/75/

[Nov 30, 2020 6:53:54 AM] (noreply) HDFS-15699 Remove lz4 references in vcxproj (#2498)
[Nov 30, 2020 11:06:52 PM] (noreply) HDFS-15677. TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk. (#2503)

ERROR: File 'out/email-report.txt' does not exist

To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/133/

[Nov 30, 2020 2:19:24 AM] (Akira Ajisaka) YARN-10498. Fix typo in CapacityScheduler Markdown document (#2484)
[jira] [Created] (YARN-10508) Protobuf Mesage Incompatibility Detector
junwen yang created YARN-10508:

Summary: Protobuf Mesage Incompatibility Detector
Key: YARN-10508
URL: https://issues.apache.org/jira/browse/YARN-10508
Project: Hadoop YARN
Issue Type: Bug
Reporter: junwen yang
Attachments: yarn_proto_incompatibility.txt

Regarding the issue YARN-5632, caused by the incompatibility of protobuf messages, we have created a static checker that tracks changes to the proto files and detects potential incompatibilities:
# A required field is added or deleted, which is the case reported in HBASE-25238.
# The tag number of a field has been changed, as described in HDFS-9788. Also, the [protobuf guidelines|https://developers.google.com/protocol-buffers/docs/proto] state that _each field in the message definition has a *unique number*. These numbers are used to identify your fields in the [message binary format|https://developers.google.com/protocol-buffers/docs/encoding], and should not be changed once your message type is in use_.
# A required field has been changed to optional, or an optional field has been changed to required. According to the guidelines on the [protobuf official website|https://developers.google.com/protocol-buffers/docs/proto], _*Required Is Forever* You should be very careful about marking fields as {{required}}. If at some point you wish to stop writing or sending a required field, it will be problematic to change the field to an optional field - old readers will consider messages without this field to be incomplete and may reject or drop them unintentionally. You should consider writing application-specific custom validation routines for your buffers instead._

We have applied our checker to the frequently maintained HDFS versions rel/release-2.6.4, rel/release-2.7.2, rel/release-2.8.0, rel/release-2.9.0, rel/release-2.10.0, rel/release-3.0.0, rel/release-3.1.0, rel/release-3.2.0 and rel/release-3.3.0, and found 28 potential problems, as attached.

The results reported by our checker were confirmed by HBase developers; see HBASE-25340.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
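The three rules listed in YARN-10508 can be sketched as a comparison between two versions of one message. This is only an illustration and not the reporter's checker: the function name `find_incompatibilities` and the dictionary representation of a message (field name mapped to tag number and label) are assumptions made here, standing in for whatever a real tool extracts from the .proto files.

```python
from typing import Dict, List, Tuple

# A message is modelled as {field_name: (tag_number, label)}, where label is
# "required", "optional" or "repeated". This is a hand-built stand-in for a
# parsed .proto file, not the output of any real protobuf parser.
Message = Dict[str, Tuple[int, str]]


def find_incompatibilities(old: Message, new: Message) -> List[str]:
    """Compare two versions of one message and list rule violations."""
    problems = []

    # Rule 1: a required field was deleted or added.
    for name in sorted(old.keys() - new.keys()):
        if old[name][1] == "required":
            problems.append(f"required field deleted: {name}")
    for name in sorted(new.keys() - old.keys()):
        if new[name][1] == "required":
            problems.append(f"required field added: {name}")

    for name in sorted(old.keys() & new.keys()):
        old_tag, old_label = old[name]
        new_tag, new_label = new[name]
        # Rule 2: the tag number of an existing field changed.
        if old_tag != new_tag:
            problems.append(f"tag changed: {name} {old_tag} -> {new_tag}")
        # Rule 3: the field switched between required and optional.
        if {old_label, new_label} == {"required", "optional"}:
            problems.append(f"label changed: {name} {old_label} -> {new_label}")

    return problems


old_version = {"id": (1, "required"), "name": (2, "optional"), "host": (3, "optional")}
new_version = {"id": (1, "optional"), "name": (5, "optional"), "token": (4, "required")}
for problem in find_incompatibilities(old_version, new_version):
    print(problem)
```

Fields are matched by name here, so a renamed field shows up as one deletion plus one addition. A real checker would also need to parse the .proto files and walk nested messages.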
[jira] [Created] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml
Peter Bacsko created YARN-10507:

Summary: Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml
Key: YARN-10507
URL: https://issues.apache.org/jira/browse/YARN-10507
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Peter Bacsko
Assignee: Peter Bacsko

Currently, the fs2cs tool generates a separate {{mapping-rules.json}} file when it converts the placement rules. However, we also support having the JSON inlined inside {{capacity-scheduler.xml}}.

Add a command line switch so that we can choose the desired output.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/

No changes

-1 overall

The following subsystems voted -1:
    pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML : Parsing Error(s):
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

    Failed junit tests :
        hadoop.hdfs.tools.TestDFSAdminWithHA
        hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
        hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
        hadoop.yarn.server.router.webapp.TestRouterWebServicesREST
        hadoop.yarn.applications.distributedshell.TestDistributedShell
        hadoop.tools.dynamometer.TestDynamometerInfra
        hadoop.tools.dynamometer.TestDynamometerInfra

cc:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-compile-cc-root.txt [48K]

javac:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-compile-javac-root.txt [568K]

checkstyle:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-checkstyle-root.txt [16M]

pathlen:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/pathlen.txt [12K]

pylint:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-patch-pylint.txt [60K]

shellcheck:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-patch-shellcheck.txt [20K]

shelldocs:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-patch-shelldocs.txt [44K]

whitespace:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/whitespace-eol.txt [13M]
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/whitespace-tabs.txt [2.0M]

xml:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/xml.txt [24K]

javadoc:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/diff-javadoc-javadoc-root.txt [2.0M]

unit:
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [352K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt [36K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt [16K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/patch-unit-hadoop-tools_hadoop-dynamometer_hadoop-dynamometer-infra.txt [8.0K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/341/artifact/out/patch-unit-hadoop-tools_hadoop-dynamometer.txt [24K]

Powered by Apache Yetus 0.12.0 https://yetus.apache.org
[jira] [Created] (YARN-10506) Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
Benjamin Teke created YARN-10506:

Summary: Update queue creation logic to use weight mode and allow the flexible static/dynamic creation
Key: YARN-10506
URL: https://issues.apache.org/jira/browse/YARN-10506
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Benjamin Teke

The queue creation logic should be updated to use weight mode and support the flexible creation.
[jira] [Created] (YARN-10505) Extend the maximum-capacity property to react to weight mode changes
Benjamin Teke created YARN-10505:

Summary: Extend the maximum-capacity property to react to weight mode changes
Key: YARN-10505
URL: https://issues.apache.org/jira/browse/YARN-10505
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Benjamin Teke

The property root.users.maximum-capacity could mean the following things:
* Relative Percentage: maximum capacity relative to its parent. If it's set to 50, the capacity is capped at 50% of the parent's capacity.
* Absolute Percentage: maximum capacity expressed as a percentage of the overall cluster capacity.
* Percentages of different resource types: this would refer to vCores, memory, GPU, etc. As with the single percentage value, this could mean either a percentage of the parent or a percentage of the overall cluster resource.
* Absolute limit: explicit definition of vCores and memory like vcores=20, memory-mb=16384.

Note that Fair Scheduler supports the following settings:
* Single percentage (absolute)
* Two percentages (absolute)
* Absolute resources

It is recommended that all three formats are supported for maximum-capacity after introducing weight mode. The final form of the configuration could, for example, look like this:

root.users.maximum-capacity = 100% - single percentage
root.users.maximum-capacity = (vcores=100%, memory-mb=100%) - two percentages
root.users.maximum-capacity = (vcores=10, memory-mb=1mb) - absolute
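To make the three proposed value formats in YARN-10505 concrete, here is a minimal sketch of how such a value could be told apart. It is not CapacityScheduler code: the function name `parse_max_capacity` and the returned kind labels are assumptions made for this illustration, and absolute values are kept as raw strings because the issue does not define how units such as "1mb" should be interpreted.

```python
from typing import Dict, Tuple, Union

Payload = Union[float, Dict[str, float], Dict[str, str]]


def parse_max_capacity(value: str) -> Tuple[str, Payload]:
    """Classify a maximum-capacity value as one of the three proposed formats."""
    text = value.strip()

    if text.startswith("(") and text.endswith(")"):
        # Per-resource form: comma-separated key=value pairs in parentheses.
        entries: Dict[str, str] = {}
        for part in text[1:-1].split(","):
            key, _, raw = part.partition("=")
            entries[key.strip()] = raw.strip()
        if any(k == "" or v == "" for k, v in entries.items()):
            raise ValueError(f"malformed resource list: {value!r}")

        is_percent = [v.endswith("%") for v in entries.values()]
        if all(is_percent):
            return "percentages", {k: float(v[:-1]) for k, v in entries.items()}
        if any(is_percent):
            raise ValueError(f"mixed percentage and absolute values: {value!r}")
        # Absolute resources: values are returned untouched, units included.
        return "absolute", entries

    if text.endswith("%"):
        return "single-percentage", float(text[:-1])

    raise ValueError(f"unrecognised maximum-capacity value: {value!r}")


for sample in ("100%", "(vcores=100%, memory-mb=100%)", "(vcores=10, memory-mb=1mb)"):
    print(sample, "->", parse_max_capacity(sample))
```

The sketch rejects a value that mixes percentages and absolute amounts in one list. That is a design choice of this example only; the issue does not say whether such mixing should be allowed.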
[VOTE] Release Apache Hadoop 3.2.2 - RC3
Hi folks,

The release candidate (RC3) for Hadoop-3.2.2 is available now. There are 22 commits[1] of difference between RC3 and RC2[2].

The RC3 is available at: http://people.apache.org/~hexiaoqiao/hadoop-3.2.2-RC3
The RC3 tag in github is here: https://github.com/apache/hadoop/tree/release-3.2.2-RC3
The maven artifacts are staged at: https://repository.apache.org/content/repositories/orgapachehadoop-1291

You can find my public key at: https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
or https://people.apache.org/keys/committer/hexiaoqiao.asc directly.

Please try the release and vote.

Thanks,
He Xiaoqiao

[1] https://github.com/apache/hadoop/compare/release-3.2.2-RC2...release-3.2.2-RC3
[2] https://lists.apache.org/thread.html/r606fff445847bdb85bd60c5a73b2fb1f0433ee31b18c456a2231fcec%40%3Chdfs-dev.hadoop.apache.org%3E
[3] https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335948
Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/

No changes

-1 overall

The following subsystems voted -1:
    asflicense hadolint jshint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML : Parsing Error(s):
        hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml
        hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml
        hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
        hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
        hadoop-tools/hadoop-azure/src/config/checkstyle.xml
        hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
        hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    Failed junit tests :
        hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
        hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
        hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
        hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
        hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat
        hadoop.hdfs.server.federation.resolver.order.TestLocalResolver
        hadoop.hdfs.server.federation.router.TestRouterQuota
        hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver
        hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
        hadoop.yarn.server.resourcemanager.TestClientRMService
        hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter
        hadoop.tools.TestDistCpSystem
        hadoop.resourceestimator.service.TestResourceEstimatorService
        hadoop.resourceestimator.solver.impl.TestLpSolver

jshint:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-patch-jshint.txt [208K]

cc:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-compile-cc-root.txt [4.0K]

javac:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-compile-javac-root.txt [456K]

checkstyle:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-checkstyle-root.txt [16M]

hadolint:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-patch-hadolint.txt [4.0K]

pathlen:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/pathlen.txt [12K]

pylint:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-patch-pylint.txt [60K]

shellcheck:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-patch-shellcheck.txt [56K]

shelldocs:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-patch-shelldocs.txt [8.0K]

whitespace:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/whitespace-eol.txt [12M]
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/whitespace-tabs.txt [1.3M]

xml:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/xml.txt [4.0K]

javadoc:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/diff-javadoc-javadoc-root.txt [20K]

unit:
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [272K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt [12K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt [36K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [120K]
    https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/132/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase-tests.txt [20K]