[ https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wangda Tan updated YARN-8193: ----------------------------- Fix Version/s: 3.2.0 > YARN RM hangs abruptly (stops allocating resources) when running successive > applications. > ----------------------------------------------------------------------------------------- > > Key: YARN-8193 > URL: https://issues.apache.org/jira/browse/YARN-8193 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Reporter: Zian Chen > Assignee: Zian Chen > Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8193-branch-2.9.0-001.patch, YARN-8193.001.patch, > YARN-8193.002.patch > > > When running massive queries successively, at some point RM just hangs and > stops allocating resources. At the point RM get hangs, YARN throw > NullPointerException at RegularContainerAllocator.getLocalityWaitFactor. > There's sufficient space given to yarn.nodemanager.local-dirs (not a node > health issue, RM didn't report any node being unhealthy). There is no fixed > trigger for this (query or operation). > This problem goes away on restarting ResourceManager. No NM restart is > required. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org