[jira] [Created] (YARN-10955) Add health check mechanism to improve troubleshooting skills for RM
Tao Yang created YARN-10955: --- Summary: Add health check mechanism to improve troubleshooting skills for RM Key: YARN-10955 URL: https://issues.apache.org/jira/browse/YARN-10955 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Tao Yang Assignee: Tao Yang RM is the most complex component in YARN, with many basic or core services including RPC servers, event dispatchers, the HTTP server, the core scheduler, state managers etc., and some of them depend on other basic components like ZooKeeper and HDFS. Currently, when encountering an unclear issue, we may have to dig for suspicious traces in many related metrics and tremendous logs, hoping to locate the root cause of the problem. For example, some applications keep staying in NEW_SAVING state, which can be caused by loss of ZooKeeper connections or a jam in the event dispatcher, and the useful traces are buried in many metrics and logs. It's not easy to figure out what happened, even for experts, let alone common users. So I propose to add a common health check mechanism to improve troubleshooting for RM. My general thought is that we can:
* add a HealthReporter interface as follows:
{code:java}
public interface HealthReporter {
  HealthReport getHealthReport();
}
{code}
HealthReport can have some generic fields like isHealthy(boolean), updateTime(long), diagnostics(string) and keyMetrics(Map).
* make some key services implement the HealthReporter interface and generate health reports by evaluating their internal state.
* add a HealthCheckerService which can manage and monitor all reportable services, support checking and fetching health reports periodically and manually (triggered by REST API), and publish metrics and logs as well.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
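A minimal sketch of what the HealthReport described in YARN-10955 above could look like, assuming only the fields listed there (isHealthy, updateTime, diagnostics, keyMetrics); this is illustrative, not a committed API:
{code:java}
// Illustrative sketch only: a simple immutable report as described in YARN-10955.
public final class HealthReport {
  private final boolean healthy;                         // overall health of the reporting service
  private final long updateTime;                         // timestamp of the last evaluation
  private final String diagnostics;                      // human-readable explanation when unhealthy
  private final java.util.Map<String, Long> keyMetrics;  // key metrics backing the evaluation

  public HealthReport(boolean healthy, long updateTime, String diagnostics,
      java.util.Map<String, Long> keyMetrics) {
    this.healthy = healthy;
    this.updateTime = updateTime;
    this.diagnostics = diagnostics;
    this.keyMetrics = keyMetrics;
  }

  public boolean isHealthy() { return healthy; }
  public long getUpdateTime() { return updateTime; }
  public String getDiagnostics() { return diagnostics; }
  public java.util.Map<String, Long> getKeyMetrics() { return keyMetrics; }
}
{code}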
[jira] [Resolved] (YARN-10928) Support default queue properties of capacity scheduler to simplify configuration management
[ https://issues.apache.org/jira/browse/YARN-10928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang resolved YARN-10928. - Fix Version/s: 3.4.0 Resolution: Fixed Committed to trunk. Thanks [~Weihao Zheng] for the contribution! > Support default queue properties of capacity scheduler to simplify > configuration management > --- > > Key: YARN-10928 > URL: https://issues.apache.org/jira/browse/YARN-10928 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Weihao Zheng >Assignee: Weihao Zheng >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > There are many user cases where one user owns many queues in his > organization's cluster for different business usages in practice. These > queues often share the same properties, such as minimum-user-limit-percent > and user-limit-factor. Users have to write one property for every queue they > use if they want to customize these shared properties. Adding default > queue properties for these cases will simplify capacity scheduler's > configuration file and make it easy to adjust queues' common properties. > > CHANGES: > Add two properties as queues' default values in capacity scheduler's > configuration: > * {{yarn.scheduler.capacity.minimum-user-limit-percent}} > * {{yarn.scheduler.capacity.user-limit-factor}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
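As an illustration of the change described in YARN-10928 above, the two properties can be set once as cluster-wide defaults in capacity-scheduler.xml instead of being repeated under every queue path; the values below are examples only, not recommendations:
{code:xml}
<!-- Example only: defaults applied to queues that do not override these properties. -->
<property>
  <name>yarn.scheduler.capacity.minimum-user-limit-percent</name>
  <value>25</value>
</property>
<property>
  <name>yarn.scheduler.capacity.user-limit-factor</name>
  <value>2</value>
</property>
{code}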
[jira] [Resolved] (YARN-10903) Too many "Failed to accept allocation proposal" because of wrong Headroom check for DRF
[ https://issues.apache.org/jira/browse/YARN-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang resolved YARN-10903. - Fix Version/s: 3.4.0 Resolution: Fixed Committed to trunk already. Thanks [~jackwangcs] for the contribution and [~epayne] for the review. > Too many "Failed to accept allocation proposal" because of wrong Headroom > check for DRF > --- > > Key: YARN-10903 > URL: https://issues.apache.org/jira/browse/YARN-10903 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: jackwangcs >Assignee: jackwangcs >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > The headroom check in `ParentQueue.canAssign` and > `RegularContainerAllocator#checkHeadroom` does not consider the DRF cases. > This will cause a lot of "Failed to accept allocation proposal" when a queue > is near-fully used. > In the log: > Headroom: memory:256, vCores:729 > Request: memory:56320, vCores:5 > clusterResource: memory:673966080, vCores:110494 > If use the DRF, then > {code:java} > Resources.greaterThanOrEqual(rc, clusterResource, Resources.add( > currentResourceLimits.getHeadroom(), resourceCouldBeUnReserved), > required); {code} > will be true but in fact we can not allocate resources to the request due to > the max limit(no enough memory). > {code:java} > 2021-07-21 23:49:39,012 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: > showRequests: application=application_1626747977559_95859 > headRoom= currentConsumption=0 > 2021-07-21 23:49:39,012 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator: > Request={AllocationRequestId: -1, Priority: 1, Capability: vCores:5>, # Containers: 19, Location: *, Relax Locality: true, Execution > Type Request: null, Node Label Expression: prod-best-effort-node} > . > 2021-07-21 23:49:39,013 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Try to commit allocation proposal=New > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.ResourceCommitRequest: > ALLOCATED=[(Application=appattempt_1626747977559_95859_01; > Node=:8041; Resource=)] > 2021-07-21 23:49:39,013 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.UsersManager: > userLimit is fetched. userLimit=, > userSpecificUserLimit=, > schedulingMode=RESPECT_PARTITION_EXCLUSIVITY, partition=prod-best-effort-node > 2021-07-21 23:49:39,013 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > Headroom calculation for user x: userLimit= > queueMaxAvailRes= consumed= > partition=prod-best-effort-node > 2021-07-21 23:49:39,013 DEBUG > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue: > Used resource= exceeded maxResourceLimit of the > queue = > 2021-07-21 23:49:39,013 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Failed to accept allocation proposal > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
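To make the YARN-10903 log above concrete, here is the dominant-resource arithmetic behind the wrong headroom check, using the numbers quoted in the description (and assuming no unreserved resource is added to the headroom):
{noformat}
headroom = <memory:256,       vCores:729>
required = <memory:56320,     vCores:5>
cluster  = <memory:673966080, vCores:110494>

DRF compares dominant shares relative to the cluster:
  dominant share of headroom = max(256/673966080, 729/110494)  ~= 0.0066    (vCores dominate)
  dominant share of required = max(56320/673966080, 5/110494)  ~= 0.000084  (memory dominates)

0.0066 >= 0.000084, so Resources.greaterThanOrEqual(...) returns true,
although the request needs 56320 MB of memory while the headroom only has 256 MB.
{noformat}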
[jira] [Created] (YARN-10854) Support marking inactive node as untracked without configured include path
Tao Yang created YARN-10854: --- Summary: Support marking inactive node as untracked without configured include path Key: YARN-10854 URL: https://issues.apache.org/jira/browse/YARN-10854 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Tao Yang Assignee: Tao Yang Currently, inactive nodes which have been decommissioned/shutdown/lost for a while (the expiration time is defined via {{yarn.resourcemanager.node-removal-untracked.timeout-ms}}, 60 seconds by default) and which exist in neither the include nor the exclude file can be marked as untracked nodes and removed from RM state. This is very useful when auto-scaling is enabled in an elastic cloud environment, since it avoids an unlimited increase of inactive nodes (mostly decommissioned nodes). But it only works when the include path is configured, which doesn't match most of our cloud environments, where no node whitelist is configured because the auto-scaling of nodes is easier to manage without further security requirements. So I propose to support marking inactive nodes as untracked without a configured include path; to stay compatible with former versions, we can add a switch config for this. Any thoughts/suggestions/feedback are welcome! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
Tao Yang created YARN-10059: --- Summary: Final states of failed-to-localize containers are not recorded in NM state store Key: YARN-10059 URL: https://issues.apache.org/jira/browse/YARN-10059 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Tao Yang Assignee: Tao Yang Recently we found an issue where many localizers for completed containers were launched after an NM restart and exhausted the memory/CPU of that machine. These containers had all failed and completed while localizing on a non-existent local directory (which was caused by another problem), but their final states weren't recorded in the NM state store. The process flow of a container that fails to localize is as follows:
{noformat}
ResourceLocalizationService$LocalizerRunner#run
-> ContainerImpl$ResourceFailedTransition#transition
   handle LOCALIZING -> LOCALIZATION_FAILED upon RESOURCE_FAILED
   dispatch LocalizationEventType.CLEANUP_CONTAINER_RESOURCES
-> ResourceLocalizationService#handleCleanupContainerResources
   handle CLEANUP_CONTAINER_RESOURCES
   dispatch ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP
-> ContainerImpl$LocalizationFailedToDoneTransition#transition
   handle LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP
{noformat}
There is currently no state store update in this flow, which is required to avoid unnecessary localizations after NM restarts.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-9803) NPE while accessing Scheduler UI
[ https://issues.apache.org/jira/browse/YARN-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang resolved YARN-9803. Resolution: Duplicate Hi, [~yifan.stan]. This is a duplicate of YARN-9685, closing it as duplicate. > NPE while accessing Scheduler UI > > > Key: YARN-9803 > URL: https://issues.apache.org/jira/browse/YARN-9803 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Xie YiFan >Assignee: Xie YiFan >Priority: Major > Attachments: YARN-9803-branch-3.1.1.001.patch > > > The same with what described in YARN-4624 > Scenario: > === > if not configure all queue's capacity to nodelabel even the value is 0, start > cluster and access capacityscheduler page. > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderQueueCapacityInfo(CapacitySchedulerPage.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderLeafQueueInfoWithPartition(CapacitySchedulerPage.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.render(CapacitySchedulerPage.java:97) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:243) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at > org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117) > at > org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$LI.__(Hamlet.java:7709) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$QueueBlock.render(CapacitySchedulerPage.java:342) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:243) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at > org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117) > at > org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$LI.__(Hamlet.java:7709) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$QueuesBlock.render(CapacitySchedulerPage.java:513) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:243) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at > org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:216) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.scheduler(RmController.java:86) > -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9799) TestRMAppTransitions#testAppFinishedFinished fails intermittently
Tao Yang created YARN-9799: -- Summary: TestRMAppTransitions#testAppFinishedFinished fails intermittently Key: YARN-9799 URL: https://issues.apache.org/jira/browse/YARN-9799 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Tao Yang Assignee: Tao Yang Found an intermittent failure of TestRMAppTransitions#testAppFinishedFinished in a YARN-9664 jenkins report. The cause is that the assertion which makes sure the dispatcher has handled the APP_COMPLETED event does not wait for the event to be processed; we need to add {{rmDispatcher.await()}} before that assertion, like the others in this class, to fix this issue. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
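A minimal sketch of the fix proposed in YARN-9799 above; the assertion shown is a placeholder, not the exact test code:
{code:java}
// Wait until the DrainDispatcher has delivered all pending events,
// including APP_COMPLETED, before asserting on the application's state.
rmDispatcher.await();
// the existing assertion on the handled APP_COMPLETED event follows here
{code}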
[jira] [Created] (YARN-9798) ApplicationMasterServiceTestBase#testRepeatedFinishApplicationMaster fails intermittently
Tao Yang created YARN-9798: -- Summary: ApplicationMasterServiceTestBase#testRepeatedFinishApplicationMaster fails intermittently Key: YARN-9798 URL: https://issues.apache.org/jira/browse/YARN-9798 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Tao Yang Assignee: Tao Yang Found intermittent failure of ApplicationMasterServiceTestBase#testRepeatedFinishApplicationMaster in YARN-9714 jenkins report, the cause is that the assertion which will make sure dispatcher has handled UNREGISTERED event but not wait until all events in dispatcher are handled, we need to add {{rm.drainEvents()}} before that assertion to fix this issue. Failure info: {noformat} [ERROR] testRepeatedFinishApplicationMaster(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterServiceCapacity) Time elapsed: 0.559 s <<< FAILURE! java.lang.AssertionError: Expecting only one event expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterServiceTestBase.testRepeatedFinishApplicationMaster(ApplicationMasterServiceTestBase.java:385) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) {noformat} Standard output: {noformat} 2019-08-29 06:59:54,458 ERROR [AsyncDispatcher event handler] resourcemanager.ResourceManager (ResourceManager.java:handle(1088)) - Error in handling event type REGISTERED for applicationAttempt appattempt_1567061994047_0001_01 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:276) at org.apache.hadoop.yarn.event.DrainDispatcher$2.handle(DrainDispatcher.java:91) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1679) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1658) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:914) at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:121) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1086) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1067) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:200) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterServiceTestBase$CountingDispatcher.dispatch(ApplicationMasterServiceTestBase.java:401) at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:76) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) at
[jira] [Created] (YARN-9716) AM container might leak
Tao Yang created YARN-9716: -- Summary: AM container might leak Key: YARN-9716 URL: https://issues.apache.org/jira/browse/YARN-9716 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.3.0 Reporter: Tao Yang Assignee: Tao Yang There is a risk that the AM container might leak when the NM exits unexpectedly while the AM container is localizing, if the AM expiry interval (conf-key: yarn.am.liveness-monitor.expiry-interval-ms) is less than the NM expiry interval (conf-key: yarn.nm.liveness-monitor.expiry-interval-ms). The RMAppAttempt state changes as follows:
{noformat}
LAUNCHED/RUNNING – event:EXPIRED(FinalSavingTransition) --> FINAL_SAVING – event:ATTEMPT_UPDATE_SAVED(FinalStateSavedTransition / ExpiredTransition: send AMLauncherEventType.CLEANUP ) --> FAILED
{noformat}
AMLauncherEventType.CLEANUP will be handled by AMLauncher#cleanup, which internally calls ContainerManagementProtocol#stopContainers to stop the AM container by communicating with the NM; if the NM can't be connected, it just skips it without any logs. I think in this case we can complete the AM container in the scheduler when we fail to stop it, so that it will have a chance to be stopped when the NM reconnects with the RM. Hope to hear your thoughts. Thank you!
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9714) Memory leaks after RM transitioned to standby
Tao Yang created YARN-9714: -- Summary: Memory leaks after RM transitioned to standby Key: YARN-9714 URL: https://issues.apache.org/jira/browse/YARN-9714 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tao Yang Assignee: Tao Yang Recently an RM full GC happened in one of our clusters. After investigating the memory dump and jstack, I found two places in RM that may cause memory leaks after RM transitioned to standby:
# The release cache cleanup timer in AbstractYarnScheduler is never canceled.
# The ZooKeeper connection in ZKRMStateStore is never closed.
To fix these leaks, we should cancel the timer and close the connection when the services are stopping.
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
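An illustrative sketch of the kind of cleanup proposed in YARN-9714 above; the field name is an assumption, not the actual patch:
{code:java}
// Cancel the release-cache cleanup timer when the scheduler service stops, so
// the timer thread does not outlive a transition to standby. ZKRMStateStore
// would analogously close its ZooKeeper connection in its own serviceStop().
@Override
protected void serviceStop() throws Exception {
  if (releaseCacheCleanupTimer != null) {
    releaseCacheCleanupTimer.cancel();  // java.util.Timer#cancel stops the scheduled task
  }
  super.serviceStop();
}
{code}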
[jira] [Created] (YARN-9687) Queue headroom check may let unacceptable allocation off when using DominantResourceCalculator
Tao Yang created YARN-9687: -- Summary: Queue headroom check may let unacceptable allocation off when using DominantResourceCalculator Key: YARN-9687 URL: https://issues.apache.org/jira/browse/YARN-9687 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang Currently the queue headroom check in {{RegularContainerAllocator#checkHeadroom}} uses {{Resources#greaterThanOrEqual}}, which internally compares resources by ratio; when using DominantResourceCalculator, it may let unacceptable allocations through in some scenarios. For example:
cluster-resource=<10GB, 10 vcores>
queue-headroom=<2GB, 4 vcores>
required-resource=<3GB, 1 vcore>
In this case the headroom ratio (0.4) is greater than the required ratio (0.3), so the allocation is let through in the scheduling process but will always be rejected when committing the proposal.
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
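A minimal illustration of a per-resource-type check that would reject the YARN-9687 example above, simplified to the two built-in resource types (not the committed patch):
{code:java}
// headroom = <2GB, 4 vcores>, required = <3GB, 1 vcore>
boolean fits = required.getMemorySize() <= headroom.getMemorySize()
    && required.getVirtualCores() <= headroom.getVirtualCores();
// fits == false here (3GB > 2GB), while the ratio-based DRF check (0.4 >= 0.3) wrongly passes
{code}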
[jira] [Created] (YARN-9686) Reduce visibility of blacklisted nodes information (only for current app attempt) to avoid the abuse of memory
Tao Yang created YARN-9686: -- Summary: Reduce visibility of blacklisted nodes information (only for current app attempt) to avoid the abuse of memory Key: YARN-9686 URL: https://issues.apache.org/jira/browse/YARN-9686 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tao Yang Assignee: Tao Yang Recently we found an issue where RM did a long GC, and we saw many WARN logs (Ignoring Blacklists, blacklist size 1775 is more than failure threshold ratio 0.2000298023224 out of total usable nodes 1778) in the RM log at a super high frequency of about 30,000+ per second. The direct cause is that a few apps with many attempts and many blacklisted nodes were queried frequently via the REST API or web UI. For every single request, RM has to allocate new memory for the blacklisted nodes many times (N * NUM_ATTEMPTS). Currently both AM (system) blacklisted nodes and app blacklisted nodes are carried over among app attempts and there is only one instance of each, so it's redundant and costly to traverse all blacklisted nodes for every app attempt. I propose to get and show blacklisted nodes only for the current app attempt to enhance performance and avoid the abuse of memory in similar scenarios. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9685) NPE when rendering the info table of leaf queue in non-accessible partitions
Tao Yang created YARN-9685: -- Summary: NPE when rendering the info table of leaf queue in non-accessible partitions Key: YARN-9685 URL: https://issues.apache.org/jira/browse/YARN-9685 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang I found incomplete queue info shown on the scheduler page and an NPE in the RM log when rendering the info table of a leaf queue in non-accessible partitions.
{noformat}
Caused by: java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderQueueCapacityInfo(CapacitySchedulerPage.java:163) at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.renderLeafQueueInfoWithPartition(CapacitySchedulerPage.java:108) at org.apache.hadoop.yarn.server.resourcemanager.webapp.CapacitySchedulerPage$LeafQueueInfoBlock.render(CapacitySchedulerPage.java:97) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) at org.apache.hadoop.yarn.webapp.View.render(View.java:243)
{noformat}
The direct cause is that the PartitionQueueCapacitiesInfo of leaf queues in non-accessible partitions is incomplete (some fields such as configuredMinResource/configuredMaxResource/effectiveMinResource/effectiveMaxResource are null), but some places in CapacitySchedulerPage don't consider that.
-- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9664) Improve response of scheduler/app activities for better understanding
Tao Yang created YARN-9664: -- Summary: Improve response of scheduler/app activities for better understanding Key: YARN-9664 URL: https://issues.apache.org/jira/browse/YARN-9664 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Yang Assignee: Tao Yang Currently some diagnostics are not easy for common users to understand, and I found some places that still need improvement, such as missing partition information and missing necessary activities in some places. This issue is to improve these. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9658) UT failures in TestLeafQueue
Tao Yang created YARN-9658: -- Summary: UT failures in TestLeafQueue Key: YARN-9658 URL: https://issues.apache.org/jira/browse/YARN-9658 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang Assignee: Tao Yang In ActivitiesManager, if there is no YARN configuration in the mocked RMContext, the cleanup interval can't be initialized to its default of 5 seconds, so the cleanup thread keeps running repeatedly without any interval, which may cause problems for the Mockito framework; in this case it caused an OOM because many throwable objects were generated internally by the incomplete mock. Add a default value for ActivitiesManager#activitiesCleanupIntervalMs to avoid the cleanup thread running repeatedly without an interval. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9623) Auto adjust queue length of app activities to make sure activities on all nodes can be covered
Tao Yang created YARN-9623: -- Summary: Auto adjust queue length of app activities to make sure activities on all nodes can be covered Key: YARN-9623 URL: https://issues.apache.org/jira/browse/YARN-9623 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Yang Assignee: Tao Yang Currently we can use the configuration entry "yarn.resourcemanager.activities-manager.app-activities.max-queue-length" to control the max queue length of app activities, but in some scenarios, for example a growing cluster, this configuration may need to be updated. Moreover, it's better for users to be able to ignore that conf, therefore it should be auto-adjusted internally. There are some differences among the scheduling modes:
* multi-node placement disabled
** Heartbeat-driven scheduling: the max queue length of app activities should not be less than the number of nodes; considering nodes cannot always be in order, we should make some room for misordering, for example, we can guarantee that the max queue length is not less than 1.2 * numNodes.
** Async scheduling: every async scheduling thread goes through all nodes in order, so in this mode we should guarantee that the max queue length is numThreads * numNodes.
* multi-node placement enabled: activities on all nodes can be involved in a single app allocation, therefore there's no need to adjust for this mode.
To sum up, we can adjust the max queue length of app activities like this:
{code}
int configuredMaxQueueLength;
int maxQueueLength;

serviceInit() {
  ...
  configuredMaxQueueLength = ...;            // read configured max queue length
  maxQueueLength = configuredMaxQueueLength; // take configured value as default
}

CleanupThread#run() {
  ...
  if (multiNodeDisabled) {
    if (asyncSchedulingEnabled) {
      maxQueueLength = max(configuredMaxQueueLength, numSchedulingThreads * numNodes);
    } else {
      maxQueueLength = max(configuredMaxQueueLength, 1.2 * numNodes);
    }
  }
}
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9600) Support self-adaption width for columns of containers table on app attempt page
Tao Yang created YARN-9600: -- Summary: Support self-adaption width for columns of containers table on app attempt page Key: YARN-9600 URL: https://issues.apache.org/jira/browse/YARN-9600 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Tao Yang Assignee: Tao Yang Attachments: image-2019-06-04-16-45-49-359.png, image-2019-06-04-16-55-18-899.png When there are outstanding requests showing on the app attempt page, the page will be automatically stretched horizontally; after that, the columns of the containers table can't fill the table and leave two blank spaces at the leftmost and rightmost of the table, as the following picture shows: !image-2019-06-04-16-45-49-359.png|width=647,height=231! We can add a relative width style (width:100%) to the containers table to make its columns self-adapting. After doing that, the containers table shows as follows: !image-2019-06-04-16-55-18-899.png|width=645,height=229! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9598) Make reservation work well when multi-node enabled
Tao Yang created YARN-9598: -- Summary: Make reservation work well when multi-node enabled Key: YARN-9598 URL: https://issues.apache.org/jira/browse/YARN-9598 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang This issue is to solve problems with reservations when multi-node lookup is enabled:
# As discussed in YARN-9576, a re-reservation proposal may always be generated on the same node and break the scheduling for this app and later apps. I think re-reservation is unnecessary and we can replace it with LOCALITY_SKIPPED to let the scheduler have a chance to look up the following candidates for this app when multi-node lookup is enabled.
# The scheduler iterates all nodes and tries to allocate for the reserved container in LeafQueue#allocateFromReservedContainer. Here there are two problems:
** The node of the reserved container should be taken as the candidate instead of all nodes when calling FiCaSchedulerApp#assignContainers, otherwise the scheduler may later generate a reservation-fulfilled proposal on another node, which will always be rejected in FiCaScheduler#commonCheckContainerAllocation.
** The assignment returned by FiCaSchedulerApp#assignContainers can never be null even if the allocation is just skipped; this breaks the normal scheduling process for this leaf queue because of the if clause in LeafQueue#assignContainers: "if (null != assignment) \{ return assignment;}"
# Nodes which have been reserved should be skipped when iterating candidates in RegularContainerAllocator#allocate, otherwise the scheduler may generate allocation or reservation proposals on these nodes which will always be rejected in FiCaScheduler#commonCheckContainerAllocation.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9590) Improve incomplete and duplicate activities
Tao Yang created YARN-9590: -- Summary: Improve incomplete and duplicate activities Key: YARN-9590 URL: https://issues.apache.org/jira/browse/YARN-9590 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Yang Assignee: Tao Yang Currently some branches in the scheduling process may generate incomplete or duplicate activities; we should fix them to keep activities clean. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9580) Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
Tao Yang created YARN-9580: -- Summary: Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers Key: YARN-9580 URL: https://issues.apache.org/jira/browse/YARN-9580 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang When transferring an assignment from a child queue to its parent queue, the fulfilled reservation information, including fulfilledReservation and fulfilledReservedContainer, is lost from the assignment. When multi-node lookup is enabled, this loss can cause a problem: an allocation proposal is generated but can't be accepted because of a check on the fulfilled reservation information in FiCaSchedulerApp#commonCheckContainerAllocation, so this endless loop will always be there and the resources of the node can't be used anymore. In heartbeat-driven scheduling mode, a fulfilled reservation can be allocated via another calling stack: CapacityScheduler#allocateContainersToNode --> CapacityScheduler#allocateContainerOnSingleNode --> CapacityScheduler#allocateFromReservedContainer; in this way the assignment is generated by the leaf queue and directly submitted, which I think is why we hardly ever saw this problem before. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9578) Add limit option to control number of results for app activities REST API
Tao Yang created YARN-9578: -- Summary: Add limit option to control number of results for app activities REST API Key: YARN-9578 URL: https://issues.apache.org/jira/browse/YARN-9578 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang Currently all completed activities of the specified application in the cache are returned by the application activities REST API. Most results may be redundant in scenarios which only need the few latest results; for example, perhaps only one result needs to be shown on the UI for debugging. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9567) Add diagnostics for outstanding resource requests on app attempts page
Tao Yang created YARN-9567: -- Summary: Add diagnostics for outstanding resource requests on app attempts page Key: YARN-9567 URL: https://issues.apache.org/jira/browse/YARN-9567 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang Attachments: no_diagnostic_at_first.png, show_diagnostics_after_requesting_app_activities_REST_API.png Currently we can see outstanding resource requests on the app attempt page; it would be helpful for users to know why they are outstanding if we can join the diagnostics of this app with them. As discussed with [~cheersyang], we can passively load diagnostics from the cache of completed app activities instead of actively triggering activity recording, which may bring uncontrollable risk. For example: (1) At first we see no diagnostics below the outstanding requests if app activities have not been triggered. !no_diagnostic_at_first.png! (2) After requesting the application activities REST API, we can see diagnostics now. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9539) Improve cleanup process of app activities and make some conditions configurable
Tao Yang created YARN-9539: -- Summary: Improve cleanup process of app activities and make some conditions configurable Key: YARN-9539 URL: https://issues.apache.org/jira/browse/YARN-9539 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Environment: [YARN-9050 Design doc #4.4|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.crdawajmm3a4] Reporter: Tao Yang Assignee: Tao Yang -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9538) Document scheduler/app activities and REST APIs
Tao Yang created YARN-9538: -- Summary: Document scheduler/app activities and REST APIs Key: YARN-9538 URL: https://issues.apache.org/jira/browse/YARN-9538 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Tao Yang Assignee: Tao Yang Add documentation for scheduler/app activities in CapacityScheduler.md and ResourceManagerRest.md. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities
Tao Yang created YARN-9497: -- Summary: Support grouping by diagnostics for query results of scheduler and app activities Key: YARN-9497 URL: https://issues.apache.org/jira/browse/YARN-9497 Project: Hadoop YARN Issue Type: Sub-task Environment: [Design Doc #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr] Reporter: Tao Yang Assignee: Tao Yang -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9495) Fix findbugs warnings in hadoop-yarn-server-resourcemanager module
Tao Yang created YARN-9495: -- Summary: Fix findbugs warnings in hadoop-yarn-server-resourcemanager module Key: YARN-9495 URL: https://issues.apache.org/jira/browse/YARN-9495 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang Assignee: Tao Yang Attachments: image-2019-04-17-18-22-40-624.png Recently there are 2 extant Findbugs warnings in the hadoop-yarn-server-resourcemanager module according to the jenkins reports of YARN-9439/YARN-9440, as follows: !image-2019-04-17-18-22-40-624.png! Findbugs seems not to expect null input for SettableFuture#set even though it is declared as Nullable in com.google.common.util.concurrent.SettableFuture#set, perhaps because findbugs supports javax.annotation.Nullable but not org.checkerframework.checker.nullness.qual.Nullable. I think we should exclude these two warnings by adding them to hadoop-yarn/dev-support/findbugs-exclude.xml. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities
Tao Yang created YARN-9489: -- Summary: Support filtering by request-priorities and allocation-request-ids for query results of app activities Key: YARN-9489 URL: https://issues.apache.org/jira/browse/YARN-9489 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Yang Assignee: Tao Yang [Design Doc #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9440) Improve diagnostics for scheduler and app activities
Tao Yang created YARN-9440: -- Summary: Improve diagnostics for scheduler and app activities Key: YARN-9440 URL: https://issues.apache.org/jira/browse/YARN-9440 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang [Design doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9439) Support asynchronized scheduling mode and multi-node lookup mechanism for app activities
Tao Yang created YARN-9439: -- Summary: Support asynchronized scheduling mode and multi-node lookup mechanism for app activities Key: YARN-9439 URL: https://issues.apache.org/jira/browse/YARN-9439 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang [Design doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9432) Excess reserved containers may exist for a long time after its request has been cancelled or satisfied when multi-nodes enabled
Tao Yang created YARN-9432: -- Summary: Excess reserved containers may exist for a long time after its request has been cancelled or satisfied when multi-nodes enabled Key: YARN-9432 URL: https://issues.apache.org/jira/browse/YARN-9432 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang Reserved containers may become excess after their request has been cancelled or satisfied; excess reserved containers need to be unreserved quickly to release resources for others. In the multi-node-disabled scenario, excess reserved containers can be released quickly in the next node heartbeat; the calling stack is CapacityScheduler#nodeUpdate --> CapacityScheduler#allocateContainersToNode --> CapacityScheduler#allocateContainerOnSingleNode. But in the multi-node-enabled scenario, excess reserved containers only have a chance to be released in the allocation process; the key phase of the calling stack is LeafQueue#assignContainers --> LeafQueue#allocateFromReservedContainer. As a result, excess reserved containers may not be released until their queue has a pending request and gets a chance to allocate, and at worst, excess reserved containers will never be released and will keep holding resources if there is no additional pending request for the queue. To solve this problem, my proposal is to directly kill excess reserved containers when the request is satisfied (in FiCaSchedulerApp#apply) or when the allocation number of resource-requests/scheduling-requests is updated to 0 (in SchedulerApplicationAttempt#updateResourceRequests / SchedulerApplicationAttempt#updateSchedulingRequests). Please feel free to give your suggestions. Thanks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9423) Optimize AM launcher to avoid bottleneck when a large number of AM failover happen at the same time
Tao Yang created YARN-9423: -- Summary: Optimize AM launcher to avoid bottleneck when a large number of AM failover happen at the same time Key: YARN-9423 URL: https://issues.apache.org/jira/browse/YARN-9423 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang We have seen slow recovery of applications when many NMs are lost at the same time:
# many NMs shut down abnormally at the same time.
# the NMs expired, then a large number of AMs started failover.
# AM containers were allocated but not launched for about half an hour.
During this slow recovery, all ApplicationMasterLauncher threads were calling cleanup for containers on these lost nodes and kept retrying to communicate with the NMs for 3 minutes (the retry policy is configured in NMProxy#createNMProxy), even though RM already knew these NMs were lost and probably couldn't be connected for a long time. Meanwhile many AM cleanup and launch operations were still waiting in the queue (ApplicationMasterLauncher#masterEvents). Obviously AM launch operations were blocked by cleanup operations, each of which was wasting 3 minutes. As a result, AM failover can be a very slow journey. I think we can optimize the AM launcher in two ways:
# Change the type of ApplicationMasterLauncher#masterEvents from LinkedBlockingQueue to PriorityBlockingQueue, so that launch operations are executed before cleanup operations.
# Check the node state first and skip cleaning up AM containers on non-existent or unusable NMs (because these NMs probably can't be reached for a long time) before communicating with the NM in the cleanup process (AMLauncher#cleanup).
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
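A rough sketch of idea #1 from YARN-9423 above; the getEventType() accessor and the initial capacity are assumptions for illustration, not the actual patch:
{code:java}
// Order the launcher queue so LAUNCH events are always taken before CLEANUP
// events; slow cleanups of lost nodes then cannot block AM launches.
BlockingQueue<AMLauncher> masterEvents = new PriorityBlockingQueue<>(64,
    Comparator.comparingInt(
        launcher -> launcher.getEventType() == AMLauncherEventType.LAUNCH ? 0 : 1));
{code}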
[jira] [Created] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
Tao Yang created YARN-9313: -- Summary: Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities Key: YARN-9313 URL: https://issues.apache.org/jira/browse/YARN-9313 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tao Yang Assignee: Tao Yang [Design doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9300) Lazy preemption should trigger an update on queue preemption metrics for CapacityScheduler
Tao Yang created YARN-9300: -- Summary: Lazy preemption should trigger an update on queue preemption metrics for CapacityScheduler Key: YARN-9300 URL: https://issues.apache.org/jira/browse/YARN-9300 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.2 Reporter: Tao Yang Assignee: Tao Yang Currently lazy preemption can't trigger an update of the queue preemption metrics, since the update is only called in CapacityScheduler#completedContainerInternal, which is not the path taken by all container completions. This issue plans to move the update to LeafQueue#completedContainer so that the queue preemption metrics are updated for all container completions caused by preemption. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9050) Usability improvements for scheduler activities
Tao Yang created YARN-9050: -- Summary: Usability improvements for scheduler activities Key: YARN-9050 URL: https://issues.apache.org/jira/browse/YARN-9050 Project: Hadoop YARN Issue Type: Improvement Reporter: Tao Yang Assignee: Tao Yang Attachments: image-2018-11-23-16-46-38-138.png We have made some usability improvements for scheduler activities, based on YARN 3.1, in our cluster, as follows:
1. Not usable with multi-threaded asynchronous scheduling. App and node activities may be confused when multiple scheduling threads record activities of different allocation processes in the same variables, like appsAllocation and recordingNodesAllocation in ActivitiesManager. I think these variables should be thread-local to keep activities separate between multiple threads.
2. Incomplete activities for the multi-node lookup mechanism, since ActivitiesLogger skips recording through {{if (node == null || activitiesManager == null) return; }} when node is null, which indicates the allocation is for multi-nodes. We need to support recording activities for the multi-node lookup mechanism.
3. Current app activities cannot meet the requirements of diagnostics; for example, we can know that a node doesn't match the request but it's hard to know why, especially when using placement constraints, and it's difficult to make a detailed diagnosis manually. So I propose to improve the diagnoses of activities: add a diagnosis for the placement constraints check, update the insufficient-resource diagnosis with detailed info (like 'insufficient resource names:[memory-mb]') and so on.
4. Add more useful fields for app activities. In some scenarios we need to distinguish different requests but can't locate requests based on the app activities info; there are other fields that can help to filter what we want, such as allocation tags. We have added containerPriority, allocationRequestId and allocationTags fields to AppAllocation.
5. Filter app activities by key fields. Sometimes the results of app activities are massive and it's hard to find what we want. We have added support for filtering by allocation-tags to meet requirements from some apps; moreover, we can take container-priority and allocation-request-id as candidates if necessary.
6. Aggregate app activities by diagnoses. For a single allocation process, activities can still be massive in a large cluster; we frequently want to know why a request can't be allocated in the cluster, and it's hard to check every node manually in a large cluster, so aggregating app activities by diagnoses is necessary to solve this. We have added a groupingType parameter to the app-activities REST API for this, which supports grouping by diagnostics, for example: !image-2018-11-23-16-46-38-138.png!
I think we can have a discussion about these points; useful improvements which are accepted will be added into the patch. Thanks.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-9043) Inter-queue preemption sometimes starves an underserved queue when using DominantResourceCalculator
Tao Yang created YARN-9043: -- Summary: Inter-queue preemption sometimes starves an underserved queue when using DominantResourceCalculator Key: YARN-9043 URL: https://issues.apache.org/jira/browse/YARN-9043 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.3.0 Reporter: Tao Yang Assignee: Tao Yang To reproduce this problem in a UT, we can set up a cluster with resource <40,18> and create 3 queues and apps:
* queue a: guaranteed=<10,10>, used=<6,10> by app1
* queue b: guaranteed=<20,6>, used=<20,8> by app2
* queue c: guaranteed=<10,2>, used=<0,0>, pending=<1,1>
Queue c is an underserved queue and queue b overuses 2 cpu, so we expect app2 in queue b to be preempted, but nothing happens. This problem is related to Resources#greaterThan/lessThan: the comparison between two resources is based on the resource/cluster-resource ratio inside DominantResourceCalculator#compare, so the low-weight resource may be ignored. For the scenario in the UT, take the comparison between the ideal assigned resource and the used resource:
* cluster resource is <40,18>
* ideal assigned resource of queue b is <20,6>, ideal-assigned-resource / cluster-resource = <20, 6> / <40, 18> = max(20/40, 6/18) = 0.5
* used resource of queue b is <20, 8>, used-resource / cluster-resource = <20, 8> / <40, 18> = max(20/40, 8/18) = 0.5
The result of {{Resources.greaterThan(rc, clusterResource, used, idealAssigned)}} will be false instead of true, and there are some other similar places with the same problem, so preemption can't happen with the current logic. To solve this problem, I propose to add a ResourceCalculator#isAnyMajorResourceGreaterThan method; inside the DominantResourceCalculator implementation it will compare every resource type between two resources and return true if any major resource type of the left resource is greater than that of the right resource, then replace Resources#greaterThan with it in the places of inter-queue preemption that have this problem.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
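A rough sketch of the ResourceCalculator#isAnyMajorResourceGreaterThan method proposed in YARN-9043 above, simplified to the two built-in resource types (illustrative only, not the committed implementation):
{code:java}
// Returns true if any resource type of "left" exceeds the corresponding type
// of "right", regardless of dominant-share ratios.
public boolean isAnyMajorResourceGreaterThan(Resource left, Resource right) {
  // e.g. used = <20, 8> vs idealAssigned = <20, 6>: vcores 8 > 6, so preemption applies
  return left.getMemorySize() > right.getMemorySize()
      || left.getVirtualCores() > right.getVirtualCores();
}
{code}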
[jira] [Created] (YARN-9007) CS preemption monitor should only select GUARANTEED containers as candidates
Tao Yang created YARN-9007: -- Summary: CS preemption monitor should only select GUARANTEED containers as candidates Key: YARN-9007 URL: https://issues.apache.org/jira/browse/YARN-9007 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.1 Reporter: Tao Yang Assignee: Tao Yang Currently the CS preemption monitor doesn't consider the execution type of containers, so OPPORTUNISTIC containers may be selected and killed without any effect. In scenarios with OPPORTUNISTIC containers, not only may preemption fail to balance resources properly, but some apps with OPPORTUNISTIC containers may also be affected and unable to work. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
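A hedged sketch of the YARN-9007 idea above (not the committed patch): when the preemption monitor collects candidates, containers that are not GUARANTEED could simply be skipped:
{code:java}
// Killing an OPPORTUNISTIC container frees no guaranteed resource,
// so it should never be chosen as a preemption candidate.
if (rmContainer.getExecutionType() != ExecutionType.GUARANTEED) {
  continue;
}
{code}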
[jira] [Created] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
Tao Yang created YARN-8958: -- Summary: Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app Key: YARN-8958 URL: https://issues.apache.org/jira/browse/YARN-8958 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.1 Reporter: Tao Yang Assignee: Tao Yang We found an NPE in ClientRMService#getApplications when querying apps with a specified queue. The cause is that there is one app which can't be found by calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but can still be queried from the fair ordering policy. To reproduce the schedulable entities leak in the fair ordering policy:
(1) create app1 and launch container1 on node1
(2) restart RM
(3) remove the app1 attempt; app1 is removed from the schedulable entities.
(4) recover container1; the state of container1 changes to COMPLETED, app1 is brought back into entitiesToReorder after the container is released, and then app1 will be added back into the schedulable entities when the scheduler calls FairOrderingPolicy#getAssignmentIterator.
(5) remove app1
To solve this problem, we should make sure schedulableEntities can only be affected by adding or removing an app attempt; no new entity should be added into schedulableEntities by the reordering process.
{code:java}
protected void reorderSchedulableEntity(S schedulableEntity) {
  //remove, update comparable data, and reinsert to update position in order
  schedulableEntities.remove(schedulableEntity);
  updateSchedulingResourceUsage(
      schedulableEntity.getSchedulingResourceUsage());
  schedulableEntities.add(schedulableEntity);
}
{code}
The code above can be improved as follows to make sure only an existing entity can be re-added into schedulableEntities.
{code:java}
protected void reorderSchedulableEntity(S schedulableEntity) {
  //remove, update comparable data, and reinsert to update position in order
  boolean exists = schedulableEntities.remove(schedulableEntity);
  updateSchedulingResourceUsage(
      schedulableEntity.getSchedulingResourceUsage());
  if (exists) {
    schedulableEntities.add(schedulableEntity);
  } else {
    LOG.info("Skip reordering non-existent schedulable entity: "
        + schedulableEntity.getId());
  }
}
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8945) Calculation of Maximum applications should respect specified and global maximum applications for absolute resource
Tao Yang created YARN-8945: -- Summary: Calculation of Maximum applications should respect specified and global maximum applications for absolute resource Key: YARN-8945 URL: https://issues.apache.org/jira/browse/YARN-8945 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Currently, maximum applications is expected to be calculated as follows, in priority order, when using percentage-based capacity:
(1) equals the specified maximum applications for the queue
(2) equals the global maximum applications
(3) calculated as queue-capacity * maximum-system-applications
But for absolute resource configuration, maximum applications is always calculated as (3) in ParentQueue#deriveCapacityFromAbsoluteConfigurations. This is a strict limit for queues with high max-capacity but low capacity, which have few guaranteed resources yet want to use lots of shared resources. So I propose to share the maximum-applications calculation of percentage-based capacity; the absolute resource path can call the same calculation if necessary.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
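A minimal sketch of the derivation priority described above; all names and parameters are hypothetical, not the actual CapacitySchedulerConfiguration API, and the numbers in main() are made-up values just to show why falling straight to (3) hurts low-capacity queues.
{code:java}
/** Illustrative sketch of the maximum-applications derivation priority. */
public class MaxApplicationsSketch {

  static final int UNDEFINED = -1;

  static int deriveMaxApplications(int queueSpecifiedMaxApps,
                                   int globalMaxApps,
                                   float queueAbsoluteCapacity,
                                   int maxSystemApps) {
    if (queueSpecifiedMaxApps != UNDEFINED) {
      return queueSpecifiedMaxApps;              // (1) per-queue setting wins
    }
    if (globalMaxApps != UNDEFINED) {
      return globalMaxApps;                      // (2) then the global default
    }
    return (int) (queueAbsoluteCapacity * maxSystemApps); // (3) capacity-derived
  }

  public static void main(String[] args) {
    // A queue with tiny guaranteed capacity (1%) but a global default of 5000:
    System.out.println(deriveMaxApplications(UNDEFINED, 5000, 0.01f, 10000)); // 5000
    // Deriving only from capacity, as absolute-resource queues do today, gives 100:
    System.out.println(deriveMaxApplications(UNDEFINED, UNDEFINED, 0.01f, 10000)); // 100
  }
}
{code}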
[jira] [Created] (YARN-8925) Updating distributed node attributes only when necessary
Tao Yang created YARN-8925: -- Summary: Updating distributed node attributes only when necessary Key: YARN-8925 URL: https://issues.apache.org/jira/browse/YARN-8925 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 3.2.1 Reporter: Tao Yang Assignee: Tao Yang
Currently, if distributed node attributes exist, an update of distributed node attributes happens in every heartbeat between NM and RM even when there is no change. The update process holds NodeAttributesManagerImpl#writeLock and may have some impact in a large cluster. We have found that the nodes UI of a large cluster opened slowly, and most of the time was spent waiting for the lock in NodeAttributesManagerImpl. I think this update should only be performed when necessary, to improve the performance of the related processes.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
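A minimal sketch of the "skip if unchanged" idea, assuming simplified stand-in fields and method names rather than the actual NodeAttributesManagerImpl code: a cheap comparison against the previously reported attributes avoids taking the write lock on every heartbeat.
{code:java}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch of skipping the update when reported attributes are unchanged. */
public class NodeAttributesUpdateSketch {

  private final Map<String, Set<String>> attributesPerNode = new ConcurrentHashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void updateDistributedNodeAttributes(String nodeId, Set<String> reported) {
    // Cheap read-only comparison first: most heartbeats report the same attributes.
    if (reported.equals(attributesPerNode.get(nodeId))) {
      return; // nothing changed, avoid taking the write lock on every heartbeat
    }
    lock.writeLock().lock();
    try {
      attributesPerNode.put(nodeId, reported);
      // ... propagate the change to the scheduler / store as the real manager does ...
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}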
[jira] [Created] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource
Tao Yang created YARN-8917: -- Summary: Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource Key: YARN-8917 URL: https://issues.apache.org/jira/browse/YARN-8917 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.1 Reporter: Tao Yang Assignee: Tao Yang
Absolute capacity should be calculated by multiplying the queue's capacity by the parent queue's absolute capacity, but currently it is calculated by dividing the capacity by the parent queue's absolute capacity. The calculation of absolute-maximum-capacity has the same problem. For example:
root.a capacity=0.4 maximum-capacity=0.8
root.a.a1 capacity=0.5 maximum-capacity=0.6
The absolute capacity of root.a.a1 should be 0.2 but is wrongly calculated as 1.25.
The absolute maximum capacity of root.a.a1 should be 0.48 but is wrongly calculated as 0.75.
Moreover, {{childQueue.getQueueCapacities().getCapacity()}} should be changed to {{childQueue.getQueueCapacities().getCapacity(label)}} to avoid getting the wrong capacity from the default partition when calculating for a non-default partition.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
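A tiny worked example for the root.a.a1 values quoted above, showing how the multiply and divide derivations differ (the variable names are illustrative only):
{code:java}
/** Worked example for the root.a.a1 values quoted above. */
public class AbsoluteCapacitySketch {
  public static void main(String[] args) {
    double parentAbsCapacity = 0.4, parentAbsMaxCapacity = 0.8;
    double childCapacity = 0.5, childMaxCapacity = 0.6;

    // Correct derivation: child ratio scaled by the parent's absolute value.
    System.out.println(childCapacity * parentAbsCapacity);       // ~0.2
    System.out.println(childMaxCapacity * parentAbsMaxCapacity); // ~0.48

    // Buggy derivation (divide instead of multiply) reported in this issue.
    System.out.println(childCapacity / parentAbsCapacity);       // ~1.25
    System.out.println(childMaxCapacity / parentAbsMaxCapacity); // ~0.75
  }
}
{code}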
[jira] [Created] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
Tao Yang created YARN-8804: -- Summary: resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues Key: YARN-8804 URL: https://issues.apache.org/jira/browse/YARN-8804 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
This problem is due to YARN-4280: a parent queue deducts a child queue's headroom when the child queue has reached its resource limit and the skipped type is QUEUE_LIMIT. The resource limits of the deepest parent queue are calculated correctly, but for a non-deepest parent queue its headroom may be much more than the sum of the reached-limit child queues' headroom, so the resource limit of the non-deepest parent may be much less than its true value and block allocation for later queues.
To reproduce this problem with a UT:
(1) The cluster has two nodes whose node resources are both <10GB, 10core> and 3-level queues as below; the max-capacity of "c1" is 10 and all others are 100, so the max-capacity of queue "c1" is <2GB, 2core>
{noformat}
         Root
       /  |   \
      a   b    c
     10  20   70
               |  \
              c1   c2
      10(max=10)   90
{noformat}
(2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1
(3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1
(4) app1 and app2 both ask for one <2GB, 1core> container
(5) nm1 does 1 heartbeat
Now queue "c" has a lower capacity percentage than queue "b", so the allocation sequence will be "a" -> "c" -> "b". Queue "c1" has reached its queue limit, so the requests of app1 should stay pending; the headroom of queue "c1" is <1GB, 1core> (= max-capacity - used) and the headroom of queue "c" is <18GB, 18core> (= max-capacity - used). After the allocation round for queue "c", the resource limit of queue "b" will be wrongly calculated as <2GB, 2core> and the headroom of queue "b" will be <1GB, 1core> (= resource-limit - used), so the scheduler won't allocate one container for app2 on nm1.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8774) Memory leak when CapacityScheduler allocates from reserved container with non-default label
Tao Yang created YARN-8774: -- Summary: Memory leak when CapacityScheduler allocates from reserved container with non-default label Key: YARN-8774 URL: https://issues.apache.org/jira/browse/YARN-8774 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang
To reproduce the memory leak:
(1) Create a reserved container. RegularContainerAllocator#doAllocation creates RMContainerImpl instanceA (nodeLabelExpression=""); LeafQueue#allocateResource puts RMContainerImpl instanceA into LeafQueue#ignorePartitionExclusivityRMContainers.
(2) Allocate from the reserved container. RegularContainerAllocator#doAllocation creates RMContainerImpl instanceB (nodeLabelExpression="test-label").
(3) From now on, RMContainerImpl instanceA will be left in memory (kept in LeafQueue#ignorePartitionExclusivityRMContainers) forever until RM is restarted.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type
Tao Yang created YARN-8771: -- Summary: CapacityScheduler fails to unreserve when cluster resource contains empty resource type Key: YARN-8771 URL: https://issues.apache.org/jira/browse/YARN-8771 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
We found this problem when the cluster was almost but not completely exhausted (93% used): the scheduler kept allocating for an app but always failed to commit. This can block requests from other apps and leave part of the cluster resource unusable.
To reproduce this problem:
(1) use DominantResourceCalculator
(2) the cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit)
Reference code in RegularContainerAllocator#assignContainer:
{code:java}
    boolean needToUnreserve =
        Resources.greaterThan(rc, clusterResource,
            resourceNeedToUnReserve, Resources.none());
{code}
The value of resourceNeedToUnReserve can be <8GB, -6 cores, 0 gpu>; the result of {{Resources#greaterThan}} will be false when using DominantResourceCalculator.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8737) Race condition in ParentQueue when reinitializing and sorting child queues in the meanwhile
Tao Yang created YARN-8737: -- Summary: Race condition in ParentQueue when reinitializing and sorting child queues in the meanwhile Key: YARN-8737 URL: https://issues.apache.org/jira/browse/YARN-8737 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
An administrator issued an update for queues through the REST API; in RM the parent queue was refreshing its child queues by calling ParentQueue#reinitialize while async-scheduling threads were sorting the child queues in ParentQueue#sortAndGetChildrenAllocationIterator. A race condition may happen and throw the exception below, because TimSort does not handle concurrent modification of the objects it is sorting:
{noformat}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
  at java.util.TimSort.mergeHi(TimSort.java:899)
  at java.util.TimSort.mergeAt(TimSort.java:516)
  at java.util.TimSort.mergeCollapse(TimSort.java:441)
  at java.util.TimSort.sort(TimSort.java:245)
  at java.util.Arrays.sort(Arrays.java:1512)
  at java.util.ArrayList.sort(ArrayList.java:1454)
  at java.util.Collections.sort(Collections.java:175)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:291)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:804)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:817)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:636)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:2494)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:2431)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersOnMultiNodes(CapacityScheduler.java:2588)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:2676)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.scheduleBasedOnNodeLabels(CapacityScheduler.java:927)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:962)
{noformat}
I think we can add a read lock in ParentQueue#sortAndGetChildrenAllocationIterator to solve this problem; the write lock is already held when updating child queues in ParentQueue#reinitialize.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
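A minimal sketch of the proposed locking, with simplified stand-in class and field names rather than the actual ParentQueue code: sorting takes the read lock so it cannot overlap with reinitialize(), which takes the write lock.
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch of read/write locking between sorting and reinitializing. */
public class QueueSortLockSketch<Q> {

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<Q> childQueues = new ArrayList<>();

  Iterator<Q> sortAndGetChildrenAllocationIterator(Comparator<Q> comparator) {
    lock.readLock().lock();
    try {
      // Sort a copy while holding the read lock so reinitialize() cannot mutate
      // the child queues mid-sort and trip TimSort's contract check.
      List<Q> sorted = new ArrayList<>(childQueues);
      sorted.sort(comparator);
      return sorted.iterator();
    } finally {
      lock.readLock().unlock();
    }
  }

  void reinitialize(List<Q> newChildQueues) {
    lock.writeLock().lock();
    try {
      childQueues.clear();
      childQueues.addAll(newChildQueues);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}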
[jira] [Created] (YARN-8729) Node status updater thread could be lost after it restarted
Tao Yang created YARN-8729: -- Summary: Node status updater thread could be lost after it restarted Key: YARN-8729 URL: https://issues.apache.org/jira/browse/YARN-8729 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Today I found a lost NM whose node status updater thread no longer existed after the thread was restarted. In {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}, the isStopped flag is not set back to false before executing {{statusUpdater.start()}}, so if the new thread starts immediately and sees isStopped==true, it will exit without any log.
Key code in {{NodeStatusUpdaterImpl#rebootNodeStatusUpdaterAndRegisterWithRM}}:
{code:java}
      statusUpdater.join();
      registerWithRM();
      statusUpdater = new Thread(statusUpdaterRunnable, "Node Status Updater");
      statusUpdater.start();
      this.isStopped = false; //this line should be moved before statusUpdater.start();
      LOG.info("NodeStatusUpdater thread is reRegistered and restarted");
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8728) Wrong available resource in AllocateResponse in queue with multiple partitions
Tao Yang created YARN-8728: -- Summary: Wrong available resource in AllocateResponse in queue with multiple partitions Key: YARN-8728 URL: https://issues.apache.org/jira/browse/YARN-8728 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang
Recently I found that some apps' available resource in AllocateResponse kept changing between two different values. After checking the code, I think {{LeafQueue#queueResourceLimitsInfo}} is wrongly updated in {{LeafQueue#computeUserLimitAndSetHeadroom}} for all partitions, while this data should be updated only for the default partition.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8709) intra-queue preemption checker always fail since one under-served queue was deleted
Tao Yang created YARN-8709: -- Summary: intra-queue preemption checker always fail since one under-served queue was deleted Key: YARN-8709 URL: https://issues.apache.org/jira/browse/YARN-8709 Project: Hadoop YARN Issue Type: Bug Components: scheduler preemption Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
After some queues were deleted, the preemption checker in SchedulingMonitor was always skipped because a YarnRuntimeException was thrown on every run. Error logs:
{noformat}
ERROR [SchedulingMonitor (ProportionalCapacityPreemptionPolicy)] org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor: Exception raised while executing preemption checker, skip this run..., exception=
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: This shouldn't happen, cannot find TempQueuePerPartition for queueName=1535075839208
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.getQueueByPartition(ProportionalCapacityPreemptionPolicy.java:701)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.IntraQueueCandidatesSelector.computeIntraQueuePreemptionDemand(IntraQueueCandidatesSelector.java:302)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.IntraQueueCandidatesSelector.selectCandidates(IntraQueueCandidatesSelector.java:128)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:514)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:348)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:99)
  at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PolicyInvoker.run(SchedulingMonitor.java:111)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:186)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:300)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
  at java.lang.Thread.run(Thread.java:834)
{noformat}
I think there is something wrong with the partitionToUnderServedQueues field in ProportionalCapacityPreemptionPolicy: items can be added to partitionToUnderServedQueues but are never removed unless the policy is rebuilt. For example, once under-served queue "a" is added into this structure, it stays there even after queue "a" is deleted from the queue structure. The intra-queue preemption checker then tries to look up all queues in partitionToUnderServedQueues in IntraQueueCandidatesSelector#selectCandidates, throws a YarnRuntimeException when queue "a" cannot be found, and the whole preemption check is skipped.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8693) Add signalToContainer REST API for RMWebServices
Tao Yang created YARN-8693: -- Summary: Add signalToContainer REST API for RMWebServices Key: YARN-8693 URL: https://issues.apache.org/jira/browse/YARN-8693 Project: Hadoop YARN Issue Type: Improvement Components: restapi Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Currently YARN has an RPC command, "yarn container -signal ", to send OUTPUT_THREAD_DUMP/GRACEFUL_SHUTDOWN/FORCEFUL_SHUTDOWN commands to a container. That is not enough; we need to add a signalToContainer REST API for better management by cluster administrators or management systems.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8692) Support node utilization metrics for SLS
Tao Yang created YARN-8692: -- Summary: Support node utilization metrics for SLS Key: YARN-8692 URL: https://issues.apache.org/jira/browse/YARN-8692 Project: Hadoop YARN Issue Type: Improvement Components: scheduler-load-simulator Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang Attachments: image-2018-08-21-17-50-04-011.png
The distribution of node utilization is an important health factor for a YARN cluster; related metrics in SLS can be used to evaluate scheduling effects and optimize related configurations. To implement this improvement, we need to do the following:
(1) Add input configurations (containing avg and stddev for the cpu/memory utilization ratio) and generate utilization samples for tasks, not including the AM container because I think it's negligible.
(2) Simulate container and node utilization within the node status.
(3) Calculate and generate the distribution metrics and use the standard deviation metric (stddev for short) to evaluate the effects (smaller is better).
(4) Show these metrics on the SLS simulator page like this: !image-2018-08-21-17-50-04-011.png!
For the node memory/CPU utilization distribution graphs, the Y-axis is the number of nodes, and P0 represents the 0%~9% utilization ratio (containers-utilization / node-total-resource), P1 represents 10%~19%, P2 represents 20%~29%, ..., and finally P9 represents 90%~100%.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
Tao Yang created YARN-8685: -- Summary: Add containers query support for nodes/node REST API in RMWebServices Key: YARN-8685 URL: https://issues.apache.org/jira/browse/YARN-8685 Project: Hadoop YARN Issue Type: Bug Components: restapi Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Currently we can only query running containers from the NM containers REST API, but we can't get the valid containers that are in ALLOCATED/ACQUIRED state. We need to get all containers allocated on specified nodes for debugging or management. I think we can add an "includeContainers" query param (default false) for the nodes/node REST API in RMWebServices, so that we can get the valid containers on nodes when "includeContainers=true" is specified.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
Tao Yang created YARN-8683: -- Summary: Support scheduling request for outstanding requests info in RMAppAttemptBlock Key: YARN-8683 URL: https://issues.apache.org/jira/browse/YARN-8683 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Currently the outstanding requests info on the app attempt page only shows pending resource requests; pending scheduling requests should be shown there too.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8575) CapacityScheduler should check node state before committing reserve/allocate proposals
Tao Yang created YARN-8575: -- Summary: CapacityScheduler should check node state before committing reserve/allocate proposals Key: YARN-8575 URL: https://issues.apache.org/jira/browse/YARN-8575 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0, 3.1.2 Reporter: Tao Yang Assignee: Tao Yang
Recently we found a new error as follows:
{noformat}
ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: node to unreserve doesn't exist, nodeid: host1:45454
{noformat}
To reproduce this problem:
(1) Create a reserve proposal for app1 on node1
(2) node1 is successfully decommissioned and removed from the node tracker
(3) Try to commit this outdated reserve proposal; it will be accepted and applied
This error may occur after decommissioning some NMs. The application that prints the error log will always have a reserved container on the non-existent (decommissioned) NM, and its pending request will never be satisfied. To solve this problem, the scheduler should check the node state in FiCaSchedulerApp#accept to avoid committing outdated proposals on unusable nodes.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
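A minimal sketch of the proposed guard, using hypothetical stand-in types and method names rather than the actual FiCaSchedulerApp#accept code: before accepting a reserve/allocate proposal, confirm the target node is still tracked and usable.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch: reject proposals that reference removed or unusable nodes. */
public class NodeStateCheckSketch {

  enum NodeState { RUNNING, DECOMMISSIONED, LOST }

  private final Map<String, NodeState> nodeTracker = new ConcurrentHashMap<>();

  /** Returns false for proposals whose target node is gone or not RUNNING. */
  boolean acceptProposal(String nodeId) {
    NodeState state = nodeTracker.get(nodeId);
    if (state == null || state != NodeState.RUNNING) {
      // Reject outdated proposals instead of leaving a reservation on a dead node.
      return false;
    }
    return true; // ... continue with the normal resource/limit checks ...
  }

  public static void main(String[] args) {
    NodeStateCheckSketch sketch = new NodeStateCheckSketch();
    sketch.nodeTracker.put("host1:45454", NodeState.DECOMMISSIONED);
    System.out.println(sketch.acceptProposal("host1:45454")); // false
  }
}
{code}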
[jira] [Created] (YARN-8505) AMLimit and userAMLimit check should be skipped for unmanaged AM
Tao Yang created YARN-8505: -- Summary: AMLimit and userAMLimit check should be skipped for unmanaged AM Key: YARN-8505 URL: https://issues.apache.org/jira/browse/YARN-8505 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0, 2.9.2 Reporter: Tao Yang Assignee: Tao Yang
The AMLimit and userAMLimit checks in LeafQueue#activateApplications should be skipped for unmanaged AMs, whose resources are not taken from the YARN cluster.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8443) Cluster metrics have wrong Total VCores when there is reserved container for CapacityScheduler
Tao Yang created YARN-8443: -- Summary: Cluster metrics have wrong Total VCores when there is reserved container for CapacityScheduler Key: YARN-8443 URL: https://issues.apache.org/jira/browse/YARN-8443 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.1.0, 2.9.0, 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Cluster metrics on the web UI show a wrong Total VCores value when there is a reserved container for CapacityScheduler. Reference code:
{code:java|title=ClusterMetricsInfo.java}
    if (rs instanceof CapacityScheduler) {
      CapacityScheduler cs = (CapacityScheduler) rs;
      this.totalMB = availableMB + allocatedMB + reservedMB;
      this.totalVirtualCores = availableVirtualCores + allocatedVirtualCores
          + containersReserved;
      ...
    }
{code}
The key to this problem is the calculation of totalVirtualCores: {{containersReserved}} is the number of reserved containers, not the number of reserved VCores. The correct calculation should be {{this.totalVirtualCores = availableVirtualCores + allocatedVirtualCores + reservedVirtualCores;}}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8245) Validate the existence of parent queue for add-queue operation in scheduler-conf REST API
Tao Yang created YARN-8245: -- Summary: Validate the existence of parent queue for add-queue operation in scheduler-conf REST API Key: YARN-8245 URL: https://issues.apache.org/jira/browse/YARN-8245 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang Assignee: Tao Yang
Currently there is no validation of the parent queue's existence for the add-queue operation in the scheduler-conf REST API, so lots of invalid queues may be created successfully without any actual validation, which can cause problems later.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal but its allocatedOrReservedContainer is null
Tao Yang created YARN-8233: -- Summary: NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal but its allocatedOrReservedContainer is null Key: YARN-8233 URL: https://issues.apache.org/jira/browse/YARN-8233 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from an allocate/reserve proposal: allocatedOrReservedContainer was null and an NPE was thrown. Reference code:
{code:java}
    // find the application to accept and apply the ResourceCommitRequest
    if (request.anythingAllocatedOrReserved()) {
      ContainerAllocationProposal c =
          request.getFirstAllocatedOrReservedContainer();
      attemptId =
          c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
              .getApplicationAttemptId();   //NPE happens here
    } else {
      ...
{code}
The proposal was constructed in {{CapacityScheduler#createResourceCommitRequest}}, and allocatedOrReservedContainer can be null in the async-scheduling process when the node was lost or the application was finished (details in {{CapacityScheduler#getSchedulerContainer}}). Reference code:
{code:java}
    // Allocated something
    List allocations =
        csAssignment.getAssignmentInformation().getAllocationDetails();
    if (!allocations.isEmpty()) {
      RMContainer rmContainer = allocations.get(0).rmContainer;
      allocated = new ContainerAllocationProposal<>(
          getSchedulerContainer(rmContainer, true),   //possibly null
          getSchedulerContainersToRelease(csAssignment),
          getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
              false),
          csAssignment.getType(), csAssignment.getRequestLocalityType(),
          csAssignment.getSchedulingMode() != null
              ? csAssignment.getSchedulingMode()
              : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
          csAssignment.getResource());
    }
{code}
I think we should add a null check for allocatedOrReservedContainer before creating allocate/reserve proposals. Besides, the allocation process has already increased the unconfirmed resource of the app when creating an allocate assignment, so if this check fails we should decrease the unconfirmed resource of the live app.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8222) NPE in RMContainerImpl$FinishedTransition#updateAttemptMetrics
Tao Yang created YARN-8222: -- Summary: NPE in RMContainerImpl$FinishedTransition#updateAttemptMetrics Key: YARN-8222 URL: https://issues.apache.org/jira/browse/YARN-8222 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang Assignee: Tao Yang
This NPE may happen when a node heartbeat is delayed and the RM tries to update attempt metrics for a non-existent application. Reference code of RMContainerImpl$FinishedTransition#updateAttemptMetrics:
{code:java}
    private static void updateAttemptMetrics(RMContainerImpl container) {
      Resource resource = container.getContainer().getResource();
      RMAppAttempt rmAttempt = container.rmContext.getRMApps()
          .get(container.getApplicationAttemptId().getApplicationId())
          .getCurrentAppAttempt();
      if (rmAttempt != null) {
        // ...
      }
    }
{code}
We can add a null check for the application before getting the attempt from it. Error log:
{noformat}
java.lang.NullPointerException
  at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.updateAttemptMetrics(RMContainerImpl.java:742)
  at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.transition(RMContainerImpl.java:715)
  at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl$FinishedTransition.transition(RMContainerImpl.java:699)
  at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
  at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
  at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
  at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
  at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:482)
  at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:64)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.containerCompleted(FiCaSchedulerApp.java:195)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1793)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:2624)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:663)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:1514)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:2396)
  at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
  at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
  at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
  at java.lang.Thread.run(Thread.java:834)
{noformat}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
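A minimal sketch of the proposed null check, using simplified stand-in types for RMApp/RMAppAttempt rather than the actual YARN classes: look up the application first and only then ask it for the current attempt.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch of null-checking the application before reading its attempt. */
public class AttemptMetricsNullCheckSketch {

  /** Simplified stand-in for RMApp. */
  static class App {
    String currentAttemptId() { return "appattempt_1_0001_000001"; }
  }

  private final Map<String, App> rmApps = new ConcurrentHashMap<>();

  void updateAttemptMetrics(String applicationId) {
    App app = rmApps.get(applicationId);
    if (app == null) {
      // Delayed heartbeat for an already-removed application: skip the update
      // instead of dereferencing null and throwing an NPE.
      return;
    }
    String attemptId = app.currentAttemptId();
    if (attemptId != null) {
      // ... update the attempt metrics here ...
    }
  }
}
{code}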
[jira] [Created] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active
Tao Yang created YARN-8085: -- Summary: RMContext#resourceProfilesManager is lost after RM went standby then back to active Key: YARN-8085 URL: https://issues.apache.org/jira/browse/YARN-8085 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang
We submitted a distributed shell application after the RM failed over and came back to active, then got an NPE in the RM log:
{noformat}
java.lang.NullPointerException
  at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
  at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
  at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{noformat}
The cause is that resourceProfilesManager is currently not transferred to the new RMContext instance in RMContext#resetRMContext. We should add this transfer to fix the error:
{code:java}
@@ -1488,6 +1488,10 @@ private void resetRMContext() {
     // transfer service context to new RM service Context
     rmContextImpl.setServiceContext(rmContext.getServiceContext());
 
+    // transfer resource profiles manager
+    rmContextImpl
+        .setResourceProfilesManager(rmContext.getResourceProfilesManager());
+
     // reset dispatcher
     Dispatcher dispatcher = setupDispatcher();
     ((Service) dispatcher).init(this.conf);
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-8011) TestOpportunisticContainerAllocatorAMService#testContainerPromoteAndDemoteBeforeContainerStart fails sometimes in trunk
Tao Yang created YARN-8011: -- Summary: TestOpportunisticContainerAllocatorAMService#testContainerPromoteAndDemoteBeforeContainerStart fails sometimes in trunk Key: YARN-8011 URL: https://issues.apache.org/jira/browse/YARN-8011 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang Assignee: Tao Yang
TestOpportunisticContainerAllocatorAMService#testContainerPromoteAndDemoteBeforeContainerStart usually passes, but the following error sometimes occurs:
{noformat}
java.lang.AssertionError:
Expected :15360
Actual :14336
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.junit.Assert.assertEquals(Assert.java:542)
  at org.apache.hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService.verifyMetrics(TestOpportunisticContainerAllocatorAMService.java:732)
  at org.apache.hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService.testContainerPromoteAndDemoteBeforeContainerStart(TestOpportunisticContainerAllocatorAMService.java:330)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
  at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}
This problem is caused by the resource deduction happening slightly after the assertion. To solve it, the test can wait for a short while before this assertion.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7751) Decommissioned NM leaves orphaned containers
Tao Yang created YARN-7751: -- Summary: Decommissioned NM leaves orphaned containers Key: YARN-7751 URL: https://issues.apache.org/jira/browse/YARN-7751 Project: Hadoop YARN Issue Type: Bug Reporter: Tao Yang
Recently we found some orphaned containers running on a decommissioned NM in our production cluster. This problem started with a PCIe error on the node: one of the local directories was not writable, so containers whose pid files were located on it couldn't be cleaned up successfully; after a while the NM changed to DECOMMISSIONED state and exited.
Corresponding logs in NM:
{noformat}
2018-01-12 21:31:38,495 WARN [DiskHealthMonitor-Timer] org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /dump/2/nm-logs error, Directory is not writable: /dump/2/nm-logs, removing from list of valid directories
2018-01-12 21:41:23,352 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e37_1508697357114_216838_01_001812
2018-01-12 21:41:25,601 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Could not get pid for container_e37_1508697357114_216838_01_001812. Waited for 2000 ms.
{noformat}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time
Tao Yang created YARN-7636: -- Summary: Re-reservation count may overflow when cluster resource exhausted for a long time Key: YARN-7636 URL: https://issues.apache.org/jira/browse/YARN-7636 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha4, 2.9.1 Reporter: Tao Yang Assignee: Tao Yang
Exception stack:
{noformat}
java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count of 2147483647
  at com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
  at com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
  at com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
We can add the check condition {{getReReservations(schedulerKey) < Integer.MAX_VALUE}} before addReReservation to avoid this problem.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
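A minimal sketch of the proposed guard: stop incrementing the re-reservation counter once it reaches Integer.MAX_VALUE instead of letting the underlying multiset throw on overflow. The counter here is a plain map, not the actual SchedulerApplicationAttempt field backed by Guava's ConcurrentHashMultiset.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch of a saturating re-reservation counter. */
public class ReReservationGuardSketch<K> {

  private final Map<K, Integer> reReservations = new ConcurrentHashMap<>();

  int getReReservations(K schedulerKey) {
    return reReservations.getOrDefault(schedulerKey, 0);
  }

  void addReReservation(K schedulerKey) {
    // Guard proposed in this issue: skip the increment at the saturation point.
    if (getReReservations(schedulerKey) < Integer.MAX_VALUE) {
      reReservations.merge(schedulerKey, 1, Integer::sum);
    }
  }
}
{code}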
[jira] [Created] (YARN-7621) Support submitting apps with queue path for CapacityScheduler
Tao Yang created YARN-7621: -- Summary: Support submitting apps with queue path for CapacityScheduler Key: YARN-7621 URL: https://issues.apache.org/jira/browse/YARN-7621 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Reporter: Tao Yang Priority: Minor
Currently there is a difference in the queue definition in ApplicationSubmissionContext between CapacityScheduler and FairScheduler: FairScheduler needs a queue path while CapacityScheduler needs a queue name. The queue definition for CapacityScheduler is certainly correct, since it does not allow duplicate leaf queue names, but it makes switching between FairScheduler and CapacityScheduler hard. I propose to support submitting apps with a queue path for CapacityScheduler, to make the interface clearer and the scheduler switch smoother.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7593) SessionMovedException in ZKRMStateStore and can't auto recover after ZK connection timeout
Tao Yang created YARN-7593: -- Summary: SessionMovedException in ZKRMStateStore and can't auto recover after ZK connection timeout Key: YARN-7593 URL: https://issues.apache.org/jira/browse/YARN-7593 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.9.0 Reporter: Tao Yang
RM may throw SessionMovedException and fail to recover ZKRMStateStore after a ZK connection timeout. In our case, after the connection with zk-server-5 timed out, the zk client in ZKRMStateStore reconnected to zk-server-1 and timed out again, then reconnected to zk-server-4. After the zk cluster came back to normal, the zk client in ZKRMStateStore still couldn't recover and kept throwing SessionMovedException at a fixed interval (about half an hour). The logs of the zk servers show that it still tried to connect to zk-server-5 (the outdated connection) rather than zk-server-4 (the latest connection). Exception stack:
{noformat}
ERROR [AsyncDispatcher event handler] org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error storing app: application_1498833634675_173952
org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:131)
  at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
  at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
  at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
  at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
  at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
  at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
  at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$SafeTransaction.commit(ZKRMStateStore.java:943)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.safeCreate(ZKRMStateStore.java:903)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:563)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:213)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:195)
  at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
  at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
  at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
  at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:1033)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:1114)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:1109)
  at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
  at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
  at java.lang.Thread.run(Thread.java:834)
{noformat}
RM logs:
{noformat}
2017-11-25 15:26:27,680 INFO [main-SendThread(zk-server-5:2181)] org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 8004ms for sessionid 0x55cf8f81ebd7f1a, closing socket connection and attempting reconnect 2017-11-25 15:26:27,781 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager: State change: SUSPENDED 2017-11-25 15:26:27,968 INFO [main-SendThread(zk-server-1:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server zk-server-1:2181. Will not attempt to authenticate using SASL (unknown error) 2017-11-25 15:26:27,968 INFO [main-SendThread(zk-server-1:2181)] org.apache.zookeeper.ClientCnxn: Socket connection established to zk-server-1:2181, initiating session 2017-11-25 15:26:28,683 INFO [Socket Reader #1 for port 8030] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1498833634675_173646_01 (auth:SIMPLE) 2017-11-25 15:26:29,060 INFO [Socket Reader #1 for port 8030] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
[jira] [Created] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
Tao Yang created YARN-7591: -- Summary: NPE in async-scheduling mode of CapacityScheduler Key: YARN-7591 URL: https://issues.apache.org/jira/browse/YARN-7591 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha4, 2.9.1 Reporter: Tao Yang Assignee: Tao Yang
Currently in the async-scheduling mode of CapacityScheduler, an NPE may be raised in the special scenarios below.
(1) A user is removed after its last application finishes; an NPE may be raised when async-scheduling threads get something from the user object without a null check.
(2) An NPE may be raised when trying to fulfill a reservation for a finished application in {{CapacityScheduler#allocateContainerOnSingleNode}}:
{code}
    RMContainer reservedContainer = node.getReservedContainer();
    if (reservedContainer != null) {
      FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer(
          reservedContainer.getContainerId());
      // NPE here: reservedApplication could be null after this application finished
      // Try to fulfill the reservation
      LOG.info(
          "Trying to fulfill reservation for application " + reservedApplication
              .getApplicationId() + " on node: " + node.getNodeID());
{code}
(3) If proposal1 (allocate containerX on node1) and proposal2 (reserve containerY on node1) were generated by different async-scheduling threads around the same time and proposal2 was submitted before proposal1, an NPE is raised when trying to submit proposal2 in {{FiCaSchedulerApp#commonCheckContainerAllocation}}:
{code}
    if (reservedContainerOnNode != null) {
      // NPE here: allocation.getAllocateFromReservedContainer() should be null for proposal2 in this case
      RMContainer fromReservedContainer =
          allocation.getAllocateFromReservedContainer().getRmContainer();
      if (fromReservedContainer != reservedContainerOnNode) {
        if (LOG.isDebugEnabled()) {
          LOG.debug(
              "Try to allocate from a non-existed reserved container");
        }
        return false;
      }
    }
{code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7527) Over-allocate node resource in async-scheduling mode of CapacityScheduler
Tao Yang created YARN-7527: -- Summary: Over-allocate node resource in async-scheduling mode of CapacityScheduler Key: YARN-7527 URL: https://issues.apache.org/jira/browse/YARN-7527 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha4, 2.9.1 Reporter: Tao Yang Assignee: Tao Yang
Currently in the async-scheduling mode of CapacityScheduler, node resources may be over-allocated since the node resource check is ignored. {{FiCaSchedulerApp#commonCheckContainerAllocation}} checks whether the node has enough available resources for a proposal and returns the check result (true/false), but this result is ignored in {{CapacityScheduler#accept}} as below:
{noformat}
commonCheckContainerAllocation(allocation, schedulerContainer);
{noformat}
If {{FiCaSchedulerApp#commonCheckContainerAllocation}} returns false, {{CapacityScheduler#accept}} should also return false:
{noformat}
if (!commonCheckContainerAllocation(allocation, schedulerContainer)) {
  return false;
}
{noformat}
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7525) Incorrect query parameters in cluster nodes REST API document
Tao Yang created YARN-7525: -- Summary: Incorrect query parameters in cluster nodes REST API document Key: YARN-7525 URL: https://issues.apache.org/jira/browse/YARN-7525 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 3.0.0-alpha4, 2.9.1 Reporter: Tao Yang Assignee: Tao Yang Priority: Minor
Recently we used the cluster nodes REST API and found that the query parameters (state and healthy) described in the document do not exist. The query parameters currently documented are:
{noformat}
* state - the state of the node
* healthy - true or false
{noformat}
The correct query parameter should be:
{noformat}
* states - the states of the node, specified as a comma-separated list.
{noformat}
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7508) NPE in FiCaSchedulerApp when debug log enabled and try to commit outdated reserved proposal in async-scheduling mode
Tao Yang created YARN-7508: -- Summary: NPE in FiCaSchedulerApp when debug log enabled and try to commit outdated reserved proposal in async-scheduling mode Key: YARN-7508 URL: https://issues.apache.org/jira/browse/YARN-7508 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha4, 2.9.0 Reporter: Tao Yang Assignee: Tao Yang
YARN-6678 fixed the IllegalStateException problem, but the debug log it added may cause an NPE when trying to print the containerId of a non-existent reserved container on the node. Replacing {{schedulerContainer.getSchedulerNode().getReservedContainer().getContainerId()}} with {{schedulerContainer.getSchedulerNode().getReservedContainer()}} can fix this problem.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7489) ConcurrentModificationException in RMAppImpl#getRMAppMetrics
Tao Yang created YARN-7489: -- Summary: ConcurrentModificationException in RMAppImpl#getRMAppMetrics Key: YARN-7489 URL: https://issues.apache.org/jira/browse/YARN-7489 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang
The REST clients have sometimes failed to query applications through the apps REST API in RMWebServices. It happened while iterating attempts (RMWebServices#getApps --> AppInfo# --> RMAppImpl#getRMAppMetrics) when these attempts were changed concurrently (AttemptFailedTransition#transition --> RMAppImpl#createAndStartNewAttempt --> RMAppImpl#createNewAttempt). Application state changes happen while holding the writeLock in RMAppImpl, so we can take the readLock before iterating attempts to fix this problem. Error logs:
{noformat}
java.util.ConcurrentModificationException
  at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
  at java.util.LinkedHashMap$LinkedValueIterator.next(LinkedHashMap.java:747)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.getRMAppMetrics(RMAppImpl.java:1487)
  at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo.(AppInfo.java:199)
  at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:597)
  at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
  at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
  at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
  at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
  at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
  at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
  at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
  at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
  at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
  at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
  at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
  at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
  at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
  at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
  at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
  at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
  at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:178)
  at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
  at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
  at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
  at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
{noformat}
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
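A minimal sketch of the proposed fix, with simplified stand-in class and field names rather than the actual RMAppImpl code: iterate the attempts map under the read lock while attempt creation happens under the write lock, so the REST thread can no longer observe a concurrent modification.
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch of read-locked iteration over the attempts map. */
public class AppMetricsReadLockSketch {

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, Long> attempts = new LinkedHashMap<>(); // attemptId -> memorySeconds

  long getAggregateMemorySeconds() {
    lock.readLock().lock();
    try {
      // Safe iteration: createAndStartNewAttempt() cannot modify the map here.
      long sum = 0;
      for (long memorySeconds : attempts.values()) {
        sum += memorySeconds;
      }
      return sum;
    } finally {
      lock.readLock().unlock();
    }
  }

  void createAndStartNewAttempt(String attemptId) {
    lock.writeLock().lock();
    try {
      attempts.put(attemptId, 0L);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}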
[jira] [Created] (YARN-7471) queueUsagePercentage is wrongly calculated for applications in zero-capacity queues
Tao Yang created YARN-7471: -- Summary: queueUsagePercentage is wrongly calculated for applications in zero-capacity queues Key: YARN-7471 URL: https://issues.apache.org/jira/browse/YARN-7471 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha4 Reporter: Tao Yang Assignee: Tao Yang
For applications in zero-capacity queues, queueUsagePercentage is wrongly calculated as INFINITY by the expression (queueUsagePercentage = usedResource / (totalPartitionRes * queueAbsMaxCapPerPartition)) when queueAbsMaxCapPerPartition is 0. We can add a precondition (queueAbsMaxCapPerPartition != 0) before this calculation to fix this problem.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-7461) DominantResourceCalculator#ratio calculation problem when right resource contains zero value
Tao Yang created YARN-7461: -- Summary: DominantResourceCalculator#ratio calculation problem when right resource contains zero value Key: YARN-7461 URL: https://issues.apache.org/jira/browse/YARN-7461 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0-alpha4 Reporter: Tao Yang Priority: Minor
Currently DominantResourceCalculator#ratio may return a wrong result when the right resource contains a zero value. For example, with three resource types, leftResource=<5, 5, 0> and rightResource=<10, 10, 0>, we expect the result of DominantResourceCalculator#ratio(leftResource, rightResource) to be 0.5, but currently it is NaN. There should be a check before the division to ensure that the divisor is not zero.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
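A minimal sketch of a dominant-share ratio that skips zero-valued components of the right resource; resource vectors are plain arrays here, not the actual Hadoop Resource type, and this is only an illustration of the guard, not the real DominantResourceCalculator code.
{code:java}
/** Illustrative sketch of a zero-divisor guard in the ratio calculation. */
public class DominantRatioSketch {

  static float ratio(long[] left, long[] right) {
    float max = 0.0f;
    for (int i = 0; i < left.length; i++) {
      if (right[i] == 0) {
        continue; // skip the division instead of producing NaN/Infinity
      }
      max = Math.max(max, (float) left[i] / right[i]);
    }
    return max;
  }

  public static void main(String[] args) {
    long[] left = {5, 5, 0};
    long[] right = {10, 10, 0};
    System.out.println(ratio(left, right)); // 0.5 instead of NaN
  }
}
{code}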
[jira] [Created] (YARN-7037) Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
Tao Yang created YARN-7037: -- Summary: Optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices Key: YARN-7037 URL: https://issues.apache.org/jira/browse/YARN-7037 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.8.3 Reporter: Tao Yang Assignee: Tao Yang Split this improvement from YARN-6259. It's useful for reading container logs more efficiently. With a zero-copy approach, the data transfer pipeline (disk --> read buffer --> NM buffer --> socket buffer) can be optimized to (disk --> read buffer --> socket buffer). In my local test, the time cost of copying a 256MB file with zero-copy was reduced from 12 seconds to 2.5 seconds. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
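A minimal sketch of the zero-copy idea (a simplified, assumed setup; the real change would live in NMWebServices): FileChannel#transferTo lets the OS move bytes from the log file toward the destination without staging them in an application-level buffer, and it falls back to an internal copy when the target channel cannot use the sendfile path.
{code:java}
import java.io.FileInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Illustrative helper, not the actual NMWebServices change.
public final class ZeroCopyLogSender {
  private ZeroCopyLogSender() {
  }

  public static void send(String logPath, OutputStream out) throws IOException {
    try (FileInputStream in = new FileInputStream(logPath)) {
      FileChannel source = in.getChannel();
      WritableByteChannel target = Channels.newChannel(out);
      long position = 0;
      long remaining = source.size();
      // transferTo may move fewer bytes than requested, so loop until done.
      while (remaining > 0) {
        long transferred = source.transferTo(position, remaining, target);
        position += transferred;
        remaining -= transferred;
      }
    }
  }
}
{code}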
[jira] [Created] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
Tao Yang created YARN-7005: -- Summary: Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance Key: YARN-7005 URL: https://issues.apache.org/jira/browse/YARN-7005 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 3.0.0-alpha4, 2.9.0 Reporter: Tao Yang Nowadays, even if there is only one pending app in a queue, the scheduling process goes through all queues anyway and spends most of its time sorting and iterating child queues in ParentQueue#assignContainersToChildQueues. IIUC, queues that have no pending resource can be skipped in the sorting and iterating process to reduce the time cost, especially for a cluster with many queues. Please feel free to correct me if I have missed something. Thanks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
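The proposed skip can be sketched as a simple pre-filter (the queue type and field names are invented for this example, not the real ParentQueue code): only children with pending resource are sorted and iterated.
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified illustration of skipping child queues with no pending resource.
public final class ChildQueueSkipSketch {

  static final class ChildQueue {
    final String name;
    final long pendingResource; // simplified to a single number
    final double usedCapacity;  // used only for ordering in this sketch

    ChildQueue(String name, long pendingResource, double usedCapacity) {
      this.name = name;
      this.pendingResource = pendingResource;
      this.usedCapacity = usedCapacity;
    }
  }

  // Only queues that still have pending resource are sorted and iterated.
  static List<ChildQueue> sortedCandidates(List<ChildQueue> children) {
    List<ChildQueue> candidates = new ArrayList<>();
    for (ChildQueue queue : children) {
      if (queue.pendingResource > 0) {
        candidates.add(queue);
      }
    }
    candidates.sort(Comparator.comparingDouble((ChildQueue q) -> q.usedCapacity));
    return candidates;
  }
}
{code}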
[jira] [Created] (YARN-7004) Add configs cache to optimize refreshQueues performance for large scale queues
Tao Yang created YARN-7004: -- Summary: Add configs cache to optimize refreshQueues performance for large scale queues Key: YARN-7004 URL: https://issues.apache.org/jira/browse/YARN-7004 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 3.0.0-alpha4, 2.9.0 Reporter: Tao Yang Assignee: Tao Yang We have requirements for large scale queues in our production environment to serve many projects. So we did some tests with more than 5000 queues and found that the refreshQueues process took more than 1 minute. The refreshQueues process spends most of its time iterating over all configurations to get the accessible-node-labels and ordering-policy configs for every queue. Loading queue configs from a cache should be beneficial to reduce time costs (optimized from about 1 minute to 3 seconds for 5000 queues in our test) when initializing/reinitializing queues. So I propose to load queue configs into a cache in CapacityScheduler#initializeQueues and CapacityScheduler#reinitializeQueues. If the cache has not been loaded in other scenarios, such as in test cases, queue configs can still be obtained by iterating over all configurations. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
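The caching idea can be sketched as follows (class and method names are illustrative, not the actual CapacityScheduler change); it assumes the property source can be iterated as key/value entries, which matches how a Hadoop Configuration is typically traversed, and it falls back to a default when a key is absent.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a one-pass queue config cache.
public final class QueueConfigCacheSketch {
  private static final String PREFIX = "yarn.scheduler.capacity.";

  private final Map<String, String> cache = new HashMap<>();

  // One pass over all properties; only capacity-scheduler keys are retained.
  public void load(Iterable<Map.Entry<String, String>> allProps) {
    cache.clear();
    for (Map.Entry<String, String> entry : allProps) {
      if (entry.getKey().startsWith(PREFIX)) {
        cache.put(entry.getKey(), entry.getValue());
      }
    }
  }

  // Per-queue lookups become O(1) map gets instead of full iterations.
  public String get(String queuePath, String property, String defaultValue) {
    String value = cache.get(PREFIX + queuePath + "." + property);
    return value != null ? value : defaultValue;
  }
}
{code}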
[jira] [Created] (YARN-7003) DRAINING state of queues can't be recovered after RM restart
Tao Yang created YARN-7003: -- Summary: DRAINING state of queues can't be recovered after RM restart Key: YARN-7003 URL: https://issues.apache.org/jira/browse/YARN-7003 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha3 Reporter: Tao Yang DRAINING is a temporary state that exists only in RM memory: when a queue's state is set to STOPPED but there are still some pending or active apps in it, the queue state will be changed to DRAINING instead of STOPPED after refreshing queues. We've encountered the problem that the state of such a queue is always STOPPED after the RM restarts, so the queue can be removed at any time, leaving some apps in a non-existent queue. To fix this problem, we could recover the DRAINING state in the recovery process of pending/active apps. I will upload a patch with a test case later for review. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6044) Resource bar of Capacity Scheduler UI does not show correctly
[ https://issues.apache.org/jira/browse/YARN-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang resolved YARN-6044. Resolution: Duplicate > Resource bar of Capacity Scheduler UI does not show correctly > - > > Key: YARN-6044 > URL: https://issues.apache.org/jira/browse/YARN-6044 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.8.0 >Reporter: Tao Yang >Priority: Minor > > Test Environment: > 1. NodeLable > yarn rmadmin -addToClusterNodeLabels "label1(exclusive=false)" > 2. capacity-scheduler.xml > yarn.scheduler.capacity.root.queues=a,b > yarn.scheduler.capacity.root.a.capacity=60 > yarn.scheduler.capacity.root.b.capacity=40 > yarn.scheduler.capacity.root.a.accessible-node-labels=label1 > yarn.scheduler.capacity.root.accessible-node-labels.label1.capacity=100 > yarn.scheduler.capacity.root.a.accessible-node-labels.label1.capacity=100 > In this test case, for queue(root.b) in partition(label1), the resource > bar(represents absolute-max-capacity) should be 100%(default). The scheduler > UI shows correctly after RM started, but when I started an app in > queue(root.b) and partition(label1) , the resource bar of this queue is > changed from 100% to 0%. > For corrent queue(root.a), the queueCapacities of partition(label1) was > inited in ParentQueue/LeafQueue constructor and > max-capacity/absolute-max-capacity were setted with correct value, due to > yarn.scheduler.capacity.root.a.accessible-node-labels is defined in > capacity-scheduler.xml > For incorrent queue(root.b), the queueCapacities of partition(label1) didn't > exist at first, the max-capacity and absolute-max-capacity were setted with > default value(100%) in PartitionQueueCapacitiesInfo so that Scheduler UI > could show correctly. When this queue was allocating resource for > partition(label1), the queueCapacities of partition(label1) was created and > only used-capacity and absolute-used-capacity were setted in > AbstractCSQueue#allocateResource. max-capacity and absolute-max-capacity have > to use float default value 0 which are defined in QueueCapacities$Capacities. > Whether max-capacity and absolute-max-capacity should have default > value(100%) in Capacities constructor to avoid losing default value if > somewhere called not given? > Please feel free to give your suggestions. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6737) Rename getApplicationAttempt to getCurrentAttempt in AbstractYarnScheduler/CapacityScheduler
Tao Yang created YARN-6737: -- Summary: Rename getApplicationAttempt to getCurrentAttempt in AbstractYarnScheduler/CapacityScheduler Key: YARN-6737 URL: https://issues.apache.org/jira/browse/YARN-6737 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 3.0.0-alpha3, 2.9.0 Reporter: Tao Yang Priority: Minor As discussed in YARN-6714 (https://issues.apache.org/jira/browse/YARN-6714?focusedCommentId=16052158=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16052158), AbstractYarnScheduler#getApplicationAttempt is inconsistent with its name: it discards the application_attempt_id and always returns the latest attempt. We should: 1) Rename it to getCurrentAttempt. 2) Change the parameter from attemptId to applicationId. 3) Take a scan of all usages to see if any similar issue could happen. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6714) RM crashed with IllegalStateException while handling APP_ATTEMPT_REMOVED event when async-scheduling enabled in CapacityScheduler
Tao Yang created YARN-6714: -- Summary: RM crashed with IllegalStateException while handling APP_ATTEMPT_REMOVED event when async-scheduling enabled in CapacityScheduler Key: YARN-6714 URL: https://issues.apache.org/jira/browse/YARN-6714 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0-alpha3, 2.9.0 Reporter: Tao Yang Assignee: Tao Yang Currently in the async-scheduling mode of CapacityScheduler, after an AM failover unreserves all reserved containers, the scheduler still has a chance to get and commit an outdated reserve proposal of the failed app attempt. This problem happened to an app in our cluster: when this app stopped, it unreserved all reserved containers and compared their appAttemptId with the current appAttemptId; since they didn't match, it threw IllegalStateException and made the RM crash. Error log:
{noformat}
2017-06-08 11:02:24,339 FATAL [ResourceManager Event Processor] org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
java.lang.IllegalStateException: Trying to unreserve for application appattempt_1495188831758_0121_02 when currently reserved for application application_1495188831758_0121 on node host: node1:45454 #containers=2 available=... used=...
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode.unreserveResource(FiCaSchedulerNode.java:123)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.unreserve(FiCaSchedulerApp.java:845)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1787)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1957)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:586)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:966)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1740)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:152)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:822)
at java.lang.Thread.run(Thread.java:834)
{noformat}
When async-scheduling is enabled, CapacityScheduler#doneApplicationAttempt and CapacityScheduler#tryCommit both need to get the write lock before executing, so we can check the app attempt state in the commit process to avoid committing outdated proposals. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
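A hedged sketch of the commit-side guard described above (names and signature are illustrative, not the committed patch): since doneApplicationAttempt and tryCommit both run under the scheduler write lock, the commit path can simply drop proposals whose app attempt is no longer current instead of letting them reach unreserveResource and throw IllegalStateException.
{code:java}
// Illustrative guard, not the actual CapacityScheduler#tryCommit code.
public final class CommitGuardSketch {
  private CommitGuardSketch() {
  }

  // Returns true only when the proposal still belongs to the live attempt.
  public static boolean shouldCommit(String proposalAttemptId,
      String currentAttemptId, boolean attemptStopped) {
    return !attemptStopped && proposalAttemptId.equals(currentAttemptId);
  }
}
{code}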
[jira] [Created] (YARN-6678) Committer thread crashes with IllegalStateException in async-scheduling mode of CapacityScheduler
Tao Yang created YARN-6678: -- Summary: Committer thread crashes with IllegalStateException in async-scheduling mode of CapacityScheduler Key: YARN-6678 URL: https://issues.apache.org/jira/browse/YARN-6678 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0-alpha3, 2.9.0 Reporter: Tao Yang Error log:
{noformat}
java.lang.IllegalStateException: Trying to reserve container container_e10_1495599791406_7129_01_001453 for application appattempt_1495599791406_7129_01 when currently reserved container container_e10_1495599791406_7123_01_001513 on node host: node0123:45454 #containers=40 available=... used=...
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode.reserveResource(FiCaSchedulerNode.java:81)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1079)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
Reproduce this problem:
1. nm1 re-reserved app-1/container-X1 and generated reserved proposal-1
2. nm2 had enough resource for app-1, un-reserved app-1/container-X1 and allocated app-1/container-X2
3. nm1 reserved app-2/container-Y
4. proposal-1 was accepted but threw IllegalStateException when being applied
Currently the check code for a reserve proposal in FiCaSchedulerApp#accept is as follows:
{code}
// Container reserved first time will be NEW, after the container
// accepted & confirmed, it will become RESERVED state
if (schedulerContainer.getRmContainer().getState()
    == RMContainerState.RESERVED) {
  // Set reReservation == true
  reReservation = true;
} else {
  // When reserve a resource (state == NEW is for new container,
  // state == RUNNING is for increase container).
  // Just check if the node is not already reserved by someone
  if (schedulerContainer.getSchedulerNode().getReservedContainer() != null) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Try to reserve a container, but the node is "
          + "already reserved by another container="
          + schedulerContainer.getSchedulerNode()
              .getReservedContainer().getContainerId());
    }
    return false;
  }
}
{code}
The reserved container on the node is checked only for first-reserve proposals, not for re-reserve proposals. I think FiCaSchedulerApp#accept should do this check for all reserve proposals, no matter whether the container is being re-reserved or not. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before
Tao Yang created YARN-6629: -- Summary: NPE occurred when container allocation proposal is applied but its resource requests are removed before Key: YARN-6629 URL: https://issues.apache.org/jira/browse/YARN-6629 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0-alpha2, 2.9.0 Reporter: Tao Yang Assignee: Tao Yang Error log:
{code}
FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in handling event type NODE_UPDATE to the Event Dispatcher
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
at org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
at org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
at java.lang.Thread.run(Thread.java:745)
{code}
Reproduce this error in chronological order:
1. AM started and requested 1 container with schedulerRequestKey#1: ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
2. Scheduler allocated 1 container for this request and accepted the proposal
3. AM removed this request: ApplicationMasterService#allocate --> CapacityScheduler#allocate --> SchedulerApplicationAttempt#updateResourceRequests --> AppSchedulingInfo#updateResourceRequests --> AppSchedulingInfo#addToPlacementSets --> AppSchedulingInfo#updatePendingResources Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets
4. Scheduler applied this proposal and wanted to deduct the pending resource: CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> AppSchedulingInfo#allocate The NPE was thrown when calling schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, type, node); -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
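An illustrative guard for the race described above (hypothetical names, not the actual AppSchedulingInfo code): if the AM has already removed the resource request, the placement-set lookup returns null and the allocation bookkeeping is skipped instead of dereferencing the null entry.
{code:java}
import java.util.Map;

// Hypothetical sketch; Runnable stands in for the real placement-set type.
public final class AllocateGuardSketch {
  private AllocateGuardSketch() {
  }

  public static boolean allocateIfStillRequested(
      Map<String, Runnable> schedulerKeyToPlacementSets, String schedulerKey) {
    Runnable placementSet = schedulerKeyToPlacementSets.get(schedulerKey);
    if (placementSet == null) {
      // The request was removed between accepting and applying the proposal;
      // skip the pending-resource deduction instead of throwing an NPE.
      return false;
    }
    placementSet.run(); // stands in for placementSet.allocate(...)
    return true;
  }
}
{code}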
[jira] [Created] (YARN-6403) Invalid local resource request can raise NPE and make NM exit
Tao Yang created YARN-6403: -- Summary: Invalid local resource request can raise NPE and make NM exit Key: YARN-6403 URL: https://issues.apache.org/jira/browse/YARN-6403 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.8.0 Reporter: Tao Yang Recently we found this problem in our testing environment. The app that caused this problem added an invalid local resource request (with no location) into the ContainerLaunchContext like this:
{code}
localResources.put("test", LocalResource.newInstance(location,
    LocalResourceType.FILE, LocalResourceVisibility.PRIVATE, 100,
    System.currentTimeMillis()));
ContainerLaunchContext amContainer = ContainerLaunchContext.newInstance(
    localResources, environment, vargsFinal, null, securityTokens, acls);
{code}
The actual value of location was null, although the app didn't expect that. This mistake caused several NMs to exit with the NPE below, and they couldn't restart until the NM recovery dirs were deleted.
{code}
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:711)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:660)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1320)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:88)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1293)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1286)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
at java.lang.Thread.run(Thread.java:745)
{code}
The NPE occurred when creating the LocalResourceRequest instance for the invalid resource request:
{code}
public LocalResourceRequest(LocalResource resource)
    throws URISyntaxException {
  this(resource.getResource().toPath(), // NPE occurred here
      resource.getTimestamp(),
      resource.getType(),
      resource.getVisibility(),
      resource.getPattern());
}
{code}
We can't guarantee the validity of local resource requests now, but we could avoid damaging the cluster. Perhaps we can verify the resource both in ContainerLaunchContext and LocalResourceRequest? Please feel free to give your suggestions. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
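One possible shape of the verification suggested above (a hypothetical helper, not an existing YARN API): reject a LocalResource whose location URL is missing before it reaches the NM container state machine, so a bad ContainerLaunchContext fails fast instead of raising an NPE inside the NodeManager.
{code:java}
import org.apache.hadoop.yarn.api.records.LocalResource;

// Hypothetical validation helper; the real fix could live in
// ContainerLaunchContext handling and/or LocalResourceRequest.
public final class LocalResourceValidator {
  private LocalResourceValidator() {
  }

  public static void validate(String name, LocalResource resource) {
    if (resource == null || resource.getResource() == null) {
      throw new IllegalArgumentException("Local resource '" + name
          + "' has no location (resource URL is null)");
    }
  }
}
{code}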
[jira] [Created] (YARN-6259) Support pagination and optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices
Tao Yang created YARN-6259: -- Summary: Support pagination and optimize data transfer with zero-copy approach for containerlogs REST API in NMWebServices Key: YARN-6259 URL: https://issues.apache.org/jira/browse/YARN-6259 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.8.1 Reporter: Tao Yang Assignee: Tao Yang Currently the containerlogs REST API in NMWebServices reads and sends the entire content of container logs. Most container logs are large and it's useful to support pagination. * Add pagesize and pageindex parameters for the containerlogs REST API
{code}
URL: http:///ws/v1/node/containerlogs//
QueryParams:
pagesize - max bytes of one page, default 1MB
pageindex - index of required page, default 0, can be negative (setting -1 will get the last page content)
{code}
* Add a containerlogs-info REST API since sometimes we need to know the totalSize/pageSize/pageCount info of the log
{code}
URL: http:///ws/v1/node/containerlogs-info//
QueryParams:
pagesize - max bytes of one page, default 1MB
Response example:
{"logInfo":{"totalSize":2497280,"pageSize":1048576,"pageCount":3}}
{code}
Moreover, the data transfer pipeline (disk --> read buffer --> NM buffer --> socket buffer) can be optimized to (disk --> read buffer --> socket buffer) with a zero-copy approach. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
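The pagination arithmetic implied by the proposed parameters can be sketched in a small self-contained helper (illustrative names; it simply maps a pageindex/pagesize pair to a byte range, with a negative index counting from the end):
{code:java}
// Illustrative pagination helper matching the proposed pagesize/pageindex
// semantics; totalSize 2497280 with pageSize 1048576 yields pageCount 3,
// as in the example response above.
public final class LogPageRange {
  public final long offset;
  public final long length;

  private LogPageRange(long offset, long length) {
    this.offset = offset;
    this.length = length;
  }

  public static LogPageRange of(long totalSize, long pageSize, long pageIndex) {
    long pageCount = (totalSize + pageSize - 1) / pageSize;
    if (pageIndex < 0) {
      pageIndex = pageCount + pageIndex; // e.g. -1 selects the last page
    }
    if (pageIndex < 0 || pageIndex >= pageCount) {
      throw new IllegalArgumentException("page index out of range");
    }
    long offset = pageIndex * pageSize;
    long length = Math.min(pageSize, totalSize - offset);
    return new LogPageRange(offset, length);
  }
}
{code}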
[jira] [Created] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key
Tao Yang created YARN-6257: -- Summary: CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key Key: YARN-6257 URL: https://issues.apache.org/jira/browse/YARN-6257 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.8.1 Reporter: Tao Yang Assignee: Tao Yang Priority: Minor In the response string of the CapacityScheduler REST API, scheduler/schedulerInfo/health/operationsInfo has the duplicate key 'entry' as a JSON object:
{code:json}
"operationsInfo":{
"entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
"entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
"entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
"entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}
}
{code}
To solve this problem, I suppose the type of the operationsInfo field in the CapacitySchedulerHealthInfo class should be converted from Map to List. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6044) Resource bar of Capacity Scheduler UI does not show correctly
Tao Yang created YARN-6044: -- Summary: Resource bar of Capacity Scheduler UI does not show correctly Key: YARN-6044 URL: https://issues.apache.org/jira/browse/YARN-6044 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.8.0 Reporter: Tao Yang Priority: Minor Test Environment:
1. NodeLabel
yarn rmadmin -addToClusterNodeLabels "label1(exclusive=false)"
2. capacity-scheduler.xml
yarn.scheduler.capacity.root.queues=a,b
yarn.scheduler.capacity.root.a.capacity=60
yarn.scheduler.capacity.root.b.capacity=40
yarn.scheduler.capacity.root.a.accessible-node-labels=label1
yarn.scheduler.capacity.root.accessible-node-labels.label1.capacity=100
yarn.scheduler.capacity.root.a.accessible-node-labels.label1.capacity=100
In this test case, for queue(root.b) in partition(label1), the resource bar (representing absolute-max-capacity) should be 100% (the default). The scheduler UI shows correctly after the RM started, but when I started an app in queue(root.b) and partition(label1), the resource bar of this queue changed from 100% to 0%. For the correct queue (root.a), the queueCapacities of partition(label1) was initialized in the ParentQueue/LeafQueue constructor and max-capacity/absolute-max-capacity were set with correct values, because yarn.scheduler.capacity.root.a.accessible-node-labels is defined in capacity-scheduler.xml. For the incorrect queue (root.b), the queueCapacities of partition(label1) didn't exist at first; max-capacity and absolute-max-capacity were set with the default value (100%) in PartitionQueueCapacitiesInfo so that the Scheduler UI could show correctly. When this queue was allocating resource for partition(label1), the queueCapacities of partition(label1) was created and only used-capacity and absolute-used-capacity were set in AbstractCSQueue#allocateResource. max-capacity and absolute-max-capacity have to use the float default value 0 defined in QueueCapacities$Capacities. Should max-capacity and absolute-max-capacity have a default value (100%) in the Capacities constructor to avoid losing the default when callers do not set them? Please feel free to give your suggestions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-6029) CapacityScheduler deadlock when ParentQueue#getQueueUserAclInfo is called by Thread_A at the moment that Thread_B calls LeafQueue#assignContainers to release a reserved container
Tao Yang created YARN-6029: -- Summary: CapacityScheduler deadlock when ParentQueue#getQueueUserAclInfo is called by Thread_A at the moment that Thread_B calls LeafQueue#assignContainers to release a reserved container Key: YARN-6029 URL: https://issues.apache.org/jira/browse/YARN-6029 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.8.0 Reporter: Tao Yang Assignee: Tao Yang When ParentQueue#getQueueUserAclInfo is called (e.g. a client calls YarnClient#getQueueAclsInfo) just at the moment that LeafQueue#assignContainers is called, and before the parent queue is notified to release resource (it should release a reserved container), the ResourceManager can deadlock. I found this problem in our testing environment for Hadoop 2.8. Reproduce the deadlock in chronological order:
* 1. Thread A (ResourceManager Event Processor) calls synchronized LeafQueue#assignContainers (got the LeafQueue instance lock of queue root.a)
* 2. Thread B (IPC Server handler) calls synchronized ParentQueue#getQueueUserAclInfo (got the ParentQueue instance lock of queue root), iterates over children queue acls and is blocked when calling synchronized LeafQueue#getQueueUserAclInfo (the LeafQueue instance lock of queue root.a is held by Thread A)
* 3. Thread A wants to inform the parent queue that a container is being completed and is blocked when invoking the synchronized ParentQueue#internalReleaseResource method (the ParentQueue instance lock of queue root is held by Thread B)
I think the synchronized modifier of LeafQueue#getQueueUserAclInfo can be removed to solve this problem, since this method does not appear to affect fields of the LeafQueue instance. A patch with a UT is attached for review. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5749) Fail to localize resources after health status for local dirs changed
[ https://issues.apache.org/jira/browse/YARN-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang resolved YARN-5749. Resolution: Duplicate > Fail to localize resources after health status for local dirs changed > - > > Key: YARN-5749 > URL: https://issues.apache.org/jira/browse/YARN-5749 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Tao Yang > > HADOOP-13440 updated FileContext#setUMask method to change umask from local > variable to global variable through updating conf value of > "fs.permissions.umask-mode". > This method might be called to update value for global umask by LogWriter and > ResourceLocalizationService. > After an application finished, LogWriter will update the umask value to be > "137" while uploading logs for containers. Then the global umask value is > updated right now and will affect other services. In my case , After one of > local directories is marked as bad (because the disk used space is above the > threshold defined by > "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage"), > ResourceLocalizationService will reinitailize the left local directories and > change the permission from "drwxr-xr-x" to "drw-r-"(umask value changed > from "022" to "137"). From now on, The NM will always fail to localize > resources as the local directories is not executable. > Detail logs are as follows: > {code} > 2016-10-19 15:36:32,650 WARN > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: Disk Error > Exception: > org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not > executable: /home/yangtao.yt/hadoop-data/nm-local-dir-2/nmPrivate > at > org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:215) > at > org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:190) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:124) > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:350) > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:412) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:116) > at > org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:563) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1162) > 2016-10-19 15:36:32,650 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Localizer failed > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any > valid local directory for > nmPrivate/container_e26_1476858409240_0004_01_05.tokens > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:441) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:116) > at > 
org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:563) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1162) > 2016-10-19 15:36:32,652 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: > Container container_e26_1476858409240_0004_01_05 transitioned from > LOCALIZING to LOCALIZATION_FAILED > {code} > To solve this problem, in my opinion, it's better if FileContext can be > compatible with past usage. > Please feel free to give your suggestions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5749) Fail to localize resources after health status for local dirs changed occurred by the change of FileContext#setUMask
Tao Yang created YARN-5749: -- Summary: Fail to localize resources after health status for local dirs changed occurred by the change of FileContext#setUMask Key: YARN-5749 URL: https://issues.apache.org/jira/browse/YARN-5749 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0-alpha2 Reporter: Tao Yang HADOOP-13440 updated the FileContext#setUMask method to change the umask from a local variable to a global variable by updating the conf value of "fs.permissions.umask-mode". This method might be called to update the global umask by LogWriter and ResourceLocalizationService. After an application finished, LogWriter updated the umask value to "137" while uploading logs for containers. The global umask value was then updated immediately and affected other services. In my case, after one of the local directories was marked as bad (because the disk used space was above the threshold defined by "yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage"), ResourceLocalizationService reinitialized the remaining local directories and changed their permission from "drwxr-xr-x" to "drw-r-----" (umask value changed from "022" to "137"). From then on, the NM always failed to localize resources because the local directories are not executable. Detailed logs are as follows:
{code}
2016-10-19 15:36:32,650 WARN org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: Disk Error Exception:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not executable: /home/yangtao.yt/hadoop-data/nm-local-dir-2/nmPrivate
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:215)
at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:190)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:124)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:350)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:412)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:116)
at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:563)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1162)
2016-10-19 15:36:32,650 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for nmPrivate/container_e26_1476858409240_0004_01_05.tokens
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:441)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:116)
at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getLocalPathForWrite(LocalDirsHandlerService.java:563)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1162)
2016-10-19 15:36:32,652 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e26_1476858409240_0004_01_05 transitioned from LOCALIZING to LOCALIZATION_FAILED
{code}
In my opinion, it would be better if FileContext could stay compatible with its past usage. Please feel free to give your suggestions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Created] (YARN-5683) Support specifying storage type for per-application local dirs
Tao Yang created YARN-5683: -- Summary: Support specifying storage type for per-application local dirs Key: YARN-5683 URL: https://issues.apache.org/jira/browse/YARN-5683 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 3.0.0-alpha2 Reporter: Tao Yang Fix For: 3.0.0-alpha2
# Introduction
* Some applications of various frameworks (Flink, Spark, MapReduce etc.) that use local storage (checkpoint, shuffle etc.) might require high IO performance. It's useful to allocate local directories to high performance storage media for these applications on heterogeneous clusters.
* YARN does not distinguish different storage types and hence applications cannot selectively use storage media with different performance characteristics. Adding awareness of storage media can allow YARN to make better decisions about the placement of local/log directories with input from applications. An application can choose the desired storage media by configuration based on its performance requirements.
# Approach
* NodeManager will distinguish storage types for local directories.
** The yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs configurations should allow the cluster administrator to optionally specify the storage type for each local directory. Example: [SSD]/disk1/nm-local-dir,/disk2/nm-local-dir,/disk3/nm-local-dir (equal to [SSD]/disk1/nm-local-dir,[DISK]/disk2/nm-local-dir,[DISK]/disk3/nm-local-dir)
** StorageType defines the DISK/SSD storage types and takes DISK as the default storage type.
** StorageLocation separates the storage type and the directory path, used by LocalDirAllocator to be aware of the types of local dirs; the default storage type is DISK.
** The getLocalPathForWrite method of LocalDirAllocator will prefer to choose a local directory of the specified storage type, and will fall back to ignoring the storage type if the requirement cannot be satisfied.
** Support for container related local/log directories by ContainerLaunch. All application frameworks can set the environment variables (LOCAL_STORAGE_TYPE and LOG_STORAGE_TYPE) to specify the desired storage type of local/log directories.
* Allow specifying the storage type for various frameworks (take MapReduce as an example)
** Add new configurations that allow the application administrator to optionally specify the storage type of local/log directories. (MapReduce adds the configurations mapreduce.job.local-storage-type and mapreduce.job.log-storage-type.)
** Support for container work directories. Set the environment variables LOCAL_STORAGE_TYPE and LOG_STORAGE_TYPE according to the configurations above for ContainerLaunchContext and ApplicationSubmissionContext. (MapReduce should update YARNRunner and TaskAttemptImpl.)
** Add a storage type prefix to request paths to support other local directories of frameworks (such as shuffle directories for MapReduce). (MapReduce should update YarnOutputFiles, MROutputFiles and YarnChild to support output/work directories.)
# Further Discussion
* The requirement of storage type for local/log directories may not be satisfied on heterogeneous clusters. To achieve a global optimum, the scheduler should be aware of and manage disk resources too. [YARN-2139|https://issues.apache.org/jira/browse/YARN-2139] is close to that but seems not to support multiple storage types; maybe we should do even more to make the scheduler aware of the storage type of disk resources?
* Node labels or node constraints can also give a higher chance of satisfying the requirement of a specified storage type.
* Fallback strategy still needs to be considered. Certain applications might not work well when the storage type requirement is not satisfied. When no disk of the desired storage type is available, should the container launch fail, or should the AM handle it? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
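The directory syntax proposed above can be parsed with a small sketch (the enum and class names are illustrative, inferred from the [SSD]/[DISK] prefix example in the description): an optional bracketed storage type is split off each configured dir, defaulting to DISK.
{code:java}
// Illustrative parser for the proposed "[SSD]/path" local-dir syntax.
public final class StorageLocationSketch {
  enum StorageType { DISK, SSD }

  final StorageType type;
  final String path;

  private StorageLocationSketch(StorageType type, String path) {
    this.type = type;
    this.path = path;
  }

  static StorageLocationSketch parse(String dir) {
    if (dir.startsWith("[")) {
      int end = dir.indexOf(']');
      if (end > 0) {
        StorageType type = StorageType.valueOf(dir.substring(1, end).toUpperCase());
        return new StorageLocationSketch(type, dir.substring(end + 1));
      }
    }
    // No prefix: fall back to the default storage type.
    return new StorageLocationSketch(StorageType.DISK, dir);
  }

  public static void main(String[] args) {
    for (String dir : "[SSD]/disk1/nm-local-dir,/disk2/nm-local-dir".split(",")) {
      StorageLocationSketch location = parse(dir);
      System.out.println(location.type + " -> " + location.path);
    }
  }
}
{code}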