Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1619/

[Jun 24, 2024, 9:41:11 AM] (github) HADOOP-19194:Add test to find unshaded dependencies in the aws sdk (#6865)
[Jun 24, 2024, 4:34:52 PM] (github) HDFS-13603: do not propagate ExecutionException and add maxRetries limit to NameNode edek cache warmup (#6774)

-1 overall

The following subsystems voted -1:
    blanks compile golang hadolint pathlen spotbugs unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck

The following subsystems are considered long running: (runtime bigger than 1h 0m 0s)
    shadedclient unit

Specific tests:

    XML :

       Parsing Error(s):
       hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

    spotbugs :

       module:hadoop-hdfs-project/hadoop-hdfs-httpfs
       Redundant nullcheck of xAttrs, which is known to be non-null in org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) Redundant null check at HttpFSFileSystem.java:[line 1373]

    spotbugs :

       module:hadoop-yarn-project/hadoop-yarn
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    spotbugs :

       module:hadoop-hdfs-project
       Redundant nullcheck of xAttrs, which is known to be non-null in org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) Redundant null check at HttpFSFileSystem.java:[line 1373]

    spotbugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    spotbugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    spotbugs :

       module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    spotbugs :

       module:hadoop-yarn-project
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    spotbugs :

       module:root
       Redundant nullcheck of xAttrs, which is known to be non-null in org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) Redundant null check at HttpFSFileSystem.java:[line 1373]
       org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may return null, but is declared @Nonnull At ServiceScheduler.java:[line 555]

    Failed junit tests :

       hadoop.hdfs.server.blockmanagement.TestDatanodeManager
       hadoop.fs.http.client.TestHttpFSFileSystemLocalFileSystem
       hadoop.hdfs.server.federation.security.TestRouterSecurityManager

   compile:

       https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1619/artifact/out/patch-compile-root.txt [684K]

   cc:
Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/697/

[Jun 24, 2024, 9:41:11 AM] (github) HADOOP-19194:Add test to find unshaded dependencies in the aws sdk (#6865)
[Jun 24, 2024, 4:34:52 PM] (github) HDFS-13603: do not propagate ExecutionException and add maxRetries limit to NameNode edek cache warmup (#6774)
[jira] [Created] (YARN-11702) Fix Yarn over allocating containers
Syed Shameerur Rahman created YARN-11702:

Summary: Fix Yarn over allocating containers
Key: YARN-11702
URL: https://issues.apache.org/jira/browse/YARN-11702
Project: Hadoop YARN
Issue Type: Bug
Reporter: Syed Shameerur Rahman
Assignee: Syed Shameerur Rahman

*Replication Steps:*

Apache Spark 3.5.1 and Apache Hadoop 3.3.6 (Capacity Scheduler)

{code:java}
spark.executor.memory 1024M
spark.driver.memory 2048M
spark.executor.cores 1
spark.executor.instances 20
spark.dynamicAllocation.enabled false
{code}

Based on this setup, there should be 20 Spark executors, but in the ResourceManager (RM) UI I could see that 32 executors were allocated and 12 of them were released within seconds. On analyzing the Spark ApplicationMaster (AM) logs, the following lines were observed.

{code:java}
24/06/24 14:10:14 INFO YarnAllocator: Will request 20 executor container(s) for ResourceProfile Id: 0, each with 1 core(s) and 1408 MB memory. with custom resources:
24/06/24 14:10:14 INFO YarnAllocator: Received 8 containers from YARN, launching executors on 8 of them.
24/06/24 14:10:14 INFO YarnAllocator: Received 8 containers from YARN, launching executors on 8 of them.
24/06/24 14:10:14 INFO YarnAllocator: Received 12 containers from YARN, launching executors on 4 of them.
24/06/24 14:10:17 INFO YarnAllocator: Received 4 containers from YARN, launching executors on 0 of them.
{code}

It was clear from the logs that the 12 extra allocated containers were being ignored on the Spark side. In order to debug this further, additional log lines were added to the [AppSchedulingInfo|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java#L427] class, at the points where the pending container request count is incremented and decremented, to expose additional information about the request.
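As a cross-check of the Spark AM log excerpt earlier in this message, the short Python sketch below tallies the "Received N containers ... launching executors on M of them" lines. It is only an illustration and not part of the reported fix; the `tally` helper and the `AM_LOG` constant are hypothetical names introduced here, and the regular expression assumes the YarnAllocator message format exactly as quoted in the report.

```python
import re

# "Received ..." lines copied from the Spark AM log excerpt in this report.
AM_LOG = """\
24/06/24 14:10:14 INFO YarnAllocator: Received 8 containers from YARN, launching executors on 8 of them.
24/06/24 14:10:14 INFO YarnAllocator: Received 8 containers from YARN, launching executors on 8 of them.
24/06/24 14:10:14 INFO YarnAllocator: Received 12 containers from YARN, launching executors on 4 of them.
24/06/24 14:10:17 INFO YarnAllocator: Received 4 containers from YARN, launching executors on 0 of them.
"""

# Assumed message format, taken verbatim from the lines above.
PATTERN = re.compile(
    r"Received (\d+) containers from YARN, launching executors on (\d+) of them"
)


def tally(log_text):
    """Return (received, launched, surplus) summed over all matching lines."""
    received = 0
    launched = 0
    for match in PATTERN.finditer(log_text):
        received += int(match.group(1))
        launched += int(match.group(2))
    return received, launched, received - launched


if __name__ == "__main__":
    received, launched, surplus = tally(AM_LOG)
    print(f"received={received} launched={launched} surplus={surplus}")
    # prints: received=32 launched=20 surplus=12
```

The totals agree with the figures in the description: 32 containers received, 20 executors launched, and 12 surplus containers that Spark did not use.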
{code:java}
2024-06-24 14:10:14,075 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (IPC Server handler 42 on default port 8030): Updates PendingContainers: 0 Incremented by: 20 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,077 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 20 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,077 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 19 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,111 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 18 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,112 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 17 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,112 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 16 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,113 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 15 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,113 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 14 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0, containerToUpdate=null} for: appattempt_1719234929152_0004_01
2024-06-24 14:10:14,113 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo (SchedulerEventDispatcher:Event Processor): Allocate Updates PendingContainers: 13 Decremented by: 1 SchedulerRequestKey{priority=0, allocationRequestId=0,
Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/

No changes

-1 overall

The following subsystems voted -1:
    asflicense hadolint mvnsite pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck whitespace

The following subsystems are considered long running: (runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    Failed junit tests :

       hadoop.fs.TestFileUtil
       hadoop.contrib.bkjournal.TestBootstrapStandbyWithBKJM
       hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
       hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
       hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion
       hadoop.hdfs.server.datanode.TestDirectoryScanner
       hadoop.hdfs.TestFileLengthOnClusterRestart
       hadoop.hdfs.TestDFSInotifyEventInputStream
       hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
       hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
       hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
       hadoop.hdfs.server.federation.router.TestRouterQuota
       hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat
       hadoop.hdfs.server.federation.resolver.order.TestLocalResolver
       hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver
       hadoop.contrib.bkjournal.TestBootstrapStandbyWithBKJM
       hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
       hadoop.mapreduce.lib.input.TestLineRecordReader
       hadoop.mapred.TestLineRecordReader
       hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter
       hadoop.resourceestimator.service.TestResourceEstimatorService
       hadoop.resourceestimator.solver.impl.TestLpSolver
       hadoop.yarn.sls.TestSLSRunner
       hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceAllocator
       hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceHandlerImpl
       hadoop.yarn.server.resourcemanager.TestClientRMService
       hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker

   cc:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-compile-cc-root.txt [4.0K]

   javac:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-compile-javac-root.txt [488K]

   checkstyle:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-checkstyle-root.txt [14M]

   hadolint:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-patch-hadolint.txt [4.0K]

   mvnsite:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-mvnsite-root.txt [572K]

   pathlen:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/pathlen.txt [12K]

   pylint:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-patch-pylint.txt [20K]

   shellcheck:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/diff-patch-shellcheck.txt [72K]

   whitespace:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/whitespace-eol.txt [12M]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/whitespace-tabs.txt [1.3M]

   javadoc:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-javadoc-root.txt [36K]

   unit:

       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [220K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [464K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt [36K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt [24K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt [104K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt [20K]
       https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1434/artifact/out/patch-unit-hadoop-tools_ha