[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM
[ https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140841#comment-17140841 ] Jim Brennan commented on YARN-9809: --- Thanks for the patch [~ebadger]! Overall I think the design and code look good. Here are some comments: MockNM - line 196 - Isn't this loop that removes completedContainers a no-op? {noformat} ArrayList completedContainers = new ArrayList(); status.setContainersStatuses( new ArrayList(containerStats.values())); for (ContainerId cid : completedContainers) { containerStats.remove(cid); } {noformat} MockRM - This code is repeated in a lot of tests. Maybe we could add a function somewhere that does this so we can just pass getMockNodeStatus() instead? TestAbstractYarnScheduler, TestCapacityScheduler, testFairScheduler, TestFifoScheduler, TestNMExpiry, TestNMReconnect, TestResourceManager, TestRMAppLogAggregationStatus, TestRMNodeTransitions, TestRMWebServicesNodes, TestSchedulerHealth, {noformat} NodeStatus mockNodeStatus = mock(NodeStatus.class); NodeHealthStatus mockNodeHealthStatus = mock(NodeHealthStatus.class); when(mockNodeStatus.getNodeHealthStatus()).thenReturn(mockNodeHealthStatus); when(mockNodeHealthStatus.getIsNodeHealthy()).thenReturn(true); {noformat} RMAppManager - This looks like an accidental edit: {noformat} // Escape YarnServerCommonServiceProtossequences {noformat} RMNodeImpl - line 894 Don't we have to deal with the possibility that nodeStatus is null here? Seems like that is a possibilty. I think null nodeStatus should be treated as healthy. The RegisterNodeManagerRequest constructors that pass null is what made me think this is necessary? {noformat} NodeStatus nodeStatus = startEvent.getNodeStatus(); RMNodeStatusEvent rmNodeStatusEvent = new RMNodeStatusEvent(nodeId, nodeStatus); NodeHealthStatus nodeHealthStatus = updateRMNodeFromStatusEvents(rmNode, rmNodeStatusEvent); NodeState nodeState = null; if (nodeHealthStatus.getIsNodeHealthy()) { {noformat} - In the case where the node is unhealthy, can we just call reportNodeUnusable() instead of {noformat} rmNode.context.getDispatcher().getEventHandler().handle( new NodesListManagerEvent( NodesListManagerEventType.NODE_UNUSABLE, rmNode)); //Update the metrics rmNode.updateMetricsForDeactivatedNode(NodeState.RUNNING, NodeState.UNHEALTHY); {noformat} TestRMNodeTransitions - Maybe add a testAddUnhealthy here? TimedHealthReporterService - Do we need to be concerned about someone who might have their own implementation of TimedHealthReporterService? Should we maintain a constructor that takes two args and passes null for runBeforeStartup? > NMs should supply a health status when registering with RM > -- > > Key: YARN-9809 > URL: https://issues.apache.org/jira/browse/YARN-9809 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9809.001.patch, YARN-9809.002.patch, > YARN-9809.003.patch, YARN-9809.004.patch > > > Currently if the NM registers with the RM and it is unhealthy, it can be > scheduled many containers before the first heartbeat. After the first > heartbeat, the RM will mark the NM as unhealthy and kill all of the > containers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10229) [Federation] Client should be able to submit application to RM directly using normal client conf
[ https://issues.apache.org/jira/browse/YARN-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140747#comment-17140747 ] Bilwa S T commented on YARN-10229: -- Hi [~elgoiri] Tested on CentOS Linux release 7.6.1810 Below are test steps: 1. Federation Setup with two Subclusters with 3 nodes each 2. Submitted application to Router with client having below conf {code:java} yarn.resourcemanger.scheduler.address localhost:8049 yarn.resourcemanager.address {routerip}:8050 {code} 3. Submitted application to RM directly with below conf {code:java} yarn.resourcemanger.scheduler.address rmip:8030 yarn.resourcemanager.address {rmip}:8032 {code} > [Federation] Client should be able to submit application to RM directly using > normal client conf > > > Key: YARN-10229 > URL: https://issues.apache.org/jira/browse/YARN-10229 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy, federation >Affects Versions: 3.1.1 >Reporter: JohnsonGuo >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10229.001.patch > > > Scenario: When enable the yarn federation feature with multi yarn clusters, > one can submit their job to yarn-router by *modified* their client > configuration with yarn router address. > But if one still wants to submit their jobs via the original client (before > enable federation) to RM directly, it will encounter the AMRMToken exception. > That means once enable federation ,if some one want to submit job, they have > to modify the client conf. > > one possible solution for this Scenario is: > In NodeManger, when the client ApplicationMaster request comes: > * get the client job.xml from HDFS "". > * parse the "yarn.resourcemanager.scheduler.address" parameter in job.xml > * if the value of the parameter is "localhost:8049"(AMRM address),then do > the AMRMToken valid process > * if the value of the parameter is "rm:port"(rm address),then skip the > AMRMToken valid process > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140743#comment-17140743 ] Hadoop QA commented on YARN-9930: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} branch-3.3 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 16s{color} | {color:green} branch-3.3 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} branch-3.3 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} branch-3.3 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} branch-3.3 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} branch-3.3 passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 42s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} branch-3.3 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 42s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 742 unchanged - 1 fixed = 743 total (was 743) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 53s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}159m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestReservations | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26186/artifact/out/Dockerfile | | JIRA Issue | YARN-9930 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006073/YARN-9930-branch-3.3.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ff65e66c59e2 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Person
[jira] [Updated] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9930: --- Attachment: YARN-9930-branch-3.3.001.patch > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > YARN-9930-branch-3.3.001.patch, screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reopened YARN-9930: I'll backport this to branch-3.3 (I believe it should apply cleanly). > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140526#comment-17140526 ] Hudson commented on YARN-9930: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18365 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18365/]) YARN-9930. Support max running app logic for CapacityScheduler. (snemeth: rev 469841446f921f3da5bbd96cf83b3a808dde8084) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueStateManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCSMaxRunningAppsEnforcer.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerMaxParallelApps.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestReservationSystem.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueState.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140516#comment-17140516 ] Szilard Nemeth commented on YARN-9930: -- Thanks [~pbacsko] for this patch, great job. LGTM, committed to trunk. Thanks [~adam.antal] and [~bteke] for the useful reviews. Resolving jira. [~pbacsko] If you want to backport this to other branches, we can re-open this jira. > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9930: - Fix Version/s: 3.4.0 > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10319) Record Last N Scheduler Activities from ActivitiesManager
[ https://issues.apache.org/jira/browse/YARN-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140500#comment-17140500 ] Hadoop QA commented on YARN-10319: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:blue}0{color} | {color:blue} markdownlint {color} | {color:blue} 0m 0s{color} | {color:blue} markdownlint was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 24s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site no findbugs output file (findbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 47s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 45s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 21 unchanged - 0 fixed = 27 total (was 21) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 3s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 26s{color} | {color:blue} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site has no data from findbugs {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 58s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 1s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 55s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}209m 57s{color} | {color:black} {color} | \\ \\
[jira] [Commented] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140496#comment-17140496 ] Hadoop QA commented on YARN-10279: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 40s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 91m 20s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}156m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26185/artifact/out/Dockerfile | | JIRA Issue | YARN-10279 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006050/YARN-10279.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cd21f78ab88a 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 9821b94c946 | | Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/26185/testReport/ | | Max. process+thread count | 829 (vs. uli
[jira] [Commented] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140498#comment-17140498 ] Hadoop QA commented on YARN-10279: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 38s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 56s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 7s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26184/artifact/out/Dockerfile | | JIRA Issue | YARN-10279 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006050/YARN-10279.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 32920c1024bf 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 9821b94c946 | | Defau
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140489#comment-17140489 ] Adam Antal commented on YARN-9930: -- Thanks for the effort on pushing this through [~pbacsko], +1 > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10278) CapacityScheduler test framework ProportionalCapacityPreemptionPolicyMockFramework need some review
[ https://issues.apache.org/jira/browse/YARN-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140459#comment-17140459 ] Hadoop QA commented on YARN-10278: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 24m 55s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 20 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 55s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 38s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 81 new + 70 unchanged - 42 fixed = 151 total (was 112) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}184m 45s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyMockFramework | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyForReservedContainers | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption | | | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26181/artifact/out/Dockerfile | | JIRA Issue | YARN-10278 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006043/YARN-10278.001.patch | | O
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140453#comment-17140453 ] Benjamin Teke commented on YARN-9930: - [~pbacsko] the latest patch LGTM, +1 (non-binding). > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10324) Fetch data from NodeManager may case read timeout when disk is busy
Yao Guangdong created YARN-10324: Summary: Fetch data from NodeManager may case read timeout when disk is busy Key: YARN-10324 URL: https://issues.apache.org/jira/browse/YARN-10324 Project: Hadoop YARN Issue Type: Improvement Components: auxservices Affects Versions: 3.2.1, 2.7.0 Reporter: Yao Guangdong With the cluster size become more and more big.The cost time on Reduce fetch Map's result from NodeManager become more and more long.We often see the WARN logs in the reduce's logs as follow. {quote}2020-06-19 15:43:15,522 WARN [fetcher#8] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to TX-196-168-211.com:13562 with 5 map outputs java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678) at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:434) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:400) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:271) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:330) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198) {quote} We check the NodeManager server find that the disk IO util and connections became very high when the read timeout happened.We analyze that if we have 20,000 maps and 1,000 reduces which will make NodeManager generate 20 million times IO stream operate in the shuffle phase.If the reduce fetch data size is very small from map output files.Which make the disk IO util become very high in big cluster.Then read timeout happened frequently.The application finished time become longer. We find ShuffleHandler have IndexCache for cache file.out.index file.Then we want to change the small IO to big IO which can reduce the small disk IO times. So we try to cache all the small file data(file.out) in memory when the first fetch request come.Then the others fetch request only need read data from memory avoid disk IO operation.After we cache data to memory we find the read timeout disappeared. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140391#comment-17140391 ] Peter Bacsko commented on YARN-9930: [~snemeth] could you review & commit the patch? I think now it's in good shape. [~adam.antal] & [~bteke] you can also check and give non-binding votes if you have time. > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9930-001.patch, YARN-9930-002.patch, > YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-005.patch, > YARN-9930-006.patch, YARN-9930-POC01.patch, YARN-9930-POC02.patch, > YARN-9930-POC03.patch, YARN-9930-POC04.patch, YARN-9930-POC05.patch, > screenshot-1.png > > > In FairScheduler, there has limitation for max running which will let > application pending. > But in CapacityScheduler there has no feature like max running app.Only got > max app,and jobs will be rejected directly on client. > This jira i want to implement this semantic for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10279: --- Attachment: YARN-10279.003.patch > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch, YARN-10279.003.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10279: --- Attachment: (was: YARN-10279.002.patch) > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10279: --- Attachment: YARN-10279.002.patch > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch, YARN-10279.002.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10279: --- Attachment: (was: YARN-10279.002.patch) > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140368#comment-17140368 ] Hadoop QA commented on YARN-10279: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} YARN-10279 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-10279 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13006048/YARN-10279.002.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/26183/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org | This message was automatically generated. > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch, YARN-10279.002.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations
[ https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10279: --- Attachment: YARN-10279.002.patch > Avoid unnecessary QueueMappingEntity creations > -- > > Key: YARN-10279 > URL: https://issues.apache.org/jira/browse/YARN-10279 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Hudáky Márton Gyula >Priority: Minor > Attachments: YARN-10279.001.patch, YARN-10279.002.patch > > > In CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes > we create new instances of QueueMappingEntity class. In some cases we simply > copy the already received class, so we just duplicate it, which is > unnecessary since the class is immutable. > This is just a minor improvement, probably doesn't have much impact, but > still puts some unnecessary load on GC. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10319) Record Last N Scheduler Activities from ActivitiesManager
[ https://issues.apache.org/jira/browse/YARN-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-10319: - Attachment: (was: YARN-10319-002.patch) > Record Last N Scheduler Activities from ActivitiesManager > - > > Key: YARN-10319 > URL: https://issues.apache.org/jira/browse/YARN-10319 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Labels: activitiesmanager > Attachments: Screen Shot 2020-06-18 at 1.26.31 PM.png, > YARN-10319-001-WIP.patch, YARN-10319-002.patch > > > ActivitiesManager records a call flow for a given nodeId or a last call flow. > This is useful when debugging the issue live where the user queries with > right nodeId. But capturing last N scheduler activities during the issue > period can help to debug the issue offline. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10319) Record Last N Scheduler Activities from ActivitiesManager
[ https://issues.apache.org/jira/browse/YARN-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-10319: - Attachment: YARN-10319-002.patch > Record Last N Scheduler Activities from ActivitiesManager > - > > Key: YARN-10319 > URL: https://issues.apache.org/jira/browse/YARN-10319 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Labels: activitiesmanager > Attachments: Screen Shot 2020-06-18 at 1.26.31 PM.png, > YARN-10319-001-WIP.patch, YARN-10319-002.patch > > > ActivitiesManager records a call flow for a given nodeId or a last call flow. > This is useful when debugging the issue live where the user queries with > right nodeId. But capturing last N scheduler activities during the issue > period can help to debug the issue offline. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10278) CapacityScheduler test framework ProportionalCapacityPreemptionPolicyMockFramework need some review
[ https://issues.apache.org/jira/browse/YARN-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-10278: -- Attachment: YARN-10278.001.patch > CapacityScheduler test framework > ProportionalCapacityPreemptionPolicyMockFramework need some review > --- > > Key: YARN-10278 > URL: https://issues.apache.org/jira/browse/YARN-10278 > Project: Hadoop YARN > Issue Type: Task >Reporter: Gergely Pollak >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-10278.001.patch > > > This test framework class mocks a bit too heavily, and simulates CS internal > behaviour with the mock methods over a point it is reasonably maintainable, > any internal change in CS is a major headscratch. > A lot of tests depend on this class, so we should approach it carefully, but > I think it's wroth to examine this class if it can be made a bit more > resilient to changes, and easier to maintain. Or at least document it better. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10305) Lost system-credentials when restarting RM
[ https://issues.apache.org/jira/browse/YARN-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140308#comment-17140308 ] kyungwan nam commented on YARN-10305: - Hi. [~eyang] [~prabhujoseph] I have seen this problem solved with this patch. Could you take a look at this patch? Thanks > Lost system-credentials when restarting RM > -- > > Key: YARN-10305 > URL: https://issues.apache.org/jira/browse/YARN-10305 > Project: Hadoop YARN > Issue Type: Bug >Reporter: kyungwan nam >Assignee: kyungwan nam >Priority: Major > Attachments: YARN-10305.001.patch > > > System-credentials introduced in YARN-2704, it makes it to keep the > long-running apps. > I’ve met a situation where system-credentials lost when restarting RM. > Since then, if an app’s AM is stopped, restarting AM will be failed because > NMs do not have HDFS delegation token which is needed for resource > localization. > The app has a couple of delegation token including timeline-server token and > HDFS delegation token. > When restarting RM, RM will request a new HDFS delegation token for an app > that was submitted long ago. (It's fixed by YARN-5098) > But, If an app has a couple of delegation token and an exception occur in the > token processed first, the next tokens are not processed. > I think that’s why lost system-credentials. > Here are RM’s logs at the time of restarting RM. > {code} > 2020-05-19 14:25:05,712 WARN security.DelegationTokenRenewer > (DelegationTokenRenewer.java:handleDTRenewerAppRecoverEvent(955)) - Unable to > add the application to the delegation token renewer on recovery. > java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, > Service: 10.1.1.1:8190, Ident: (TIMELINE_DELEGATION_TOKEN owner=test-admin, > renewer=yarn, realUser=yarn, issueDate=1586136363258, maxDate=1587000363258, > sequenceNumber=2193, masterKeyId=340) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:503) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleDTRenewerAppRecoverEvent(DelegationTokenRenewer.java:953) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:79) > at > org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:912) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: HTTP status [403], message > [org.apache.hadoop.security.token.SecretManager$InvalidToken: yarn tried to > renew an expired token (TIMELINE_DELEGATION_TOKEN owner=test-admin, > renewer=yarn, realUser=yarn, issueDate=1586136363258, maxDate=1587000363258, > sequenceNumber=2193, masterKeyId=340) max expiration date: 2020-04-16 > 10:26:03,258+0900 currentTime: 2020-05-19 14:25:05,700+0900] > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:166) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:319) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:235) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:437) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:247) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:227) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientRetryOpForOperateDelegationToken.run(TimelineConnector.java:431) > at > org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:334) > at > org.apache.hadoop.yarn.client.api.impl.TimelineConnector.operateDelegationToken(TimelineConnector.java:218) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.renewDelegationToken(TimelineClientImpl.java:250) > at > org.apache.hadoop.yarn.security.client.Tim