Hi,

My YARN ResourceManager is consuming 100% CPU while I run an application that runs for about 10 hours and requests as many as 27000 containers. CPU consumption was very low at the start of the application and gradually climbed to over 100%. Is this a known issue, or are we doing something wrong?

Every dump of the Event Processor thread shows it running LeafQueue::assignContainers(), specifically the for loop below from LeafQueue.java, and it seems to be looping through some priority list.

    // Try to assign containers to applications in order
    for (FiCaSchedulerApp application : activeApplications) {
      ...
      // Schedule in priority order
      for (Priority priority : application.getPriorities()) {

3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME *CPU usage total: 42334.614623696 secs*
3XMHEAPALLOC Heap bytes allocated since last GC cycle=20456 (0x4FE8)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)

3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME CPU usage total: 42379.604203548 secs
3XMHEAPALLOC Heap bytes allocated since last GC cycle=57280 (0xDFC0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)

3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME CPU usage total: 42996.394528764 secs
3XMHEAPALLOC Heap bytes allocated since last GC cycle=475576 (0x741B8)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
4XESTACKTRACE at java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled Code))
4XESTACKTRACE at java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)

Thanks,
Kishore
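
P.S. To make the concern concrete, here is a minimal sketch of the shape of that nested loop. It is not Hadoop code: SchedulingLoopSketch and countInnerIterations are hypothetical names, and the priorities are plain integers in a TreeSet (iterating a TreeSet goes through TreeMap's key iterator, which would be consistent with the TreeMap$KeyIterator frames in the third dump). It rests on an unconfirmed assumption: that the application's priority set keeps growing over the run and that no priority can be satisfied, so the inner loop never exits early. Under that assumption, the work per scheduling pass grows linearly with the number of priorities.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for the scheduler loop; none of these names are Hadoop classes.
class SchedulingLoopSketch {

    // Counts how many (application, priority) pairs a single scheduling pass visits
    // when nothing can be assigned, so the inner loop walks the whole priority set.
    static long countInnerIterations(int activeApps, int prioritiesPerApp) {
        List<TreeSet<Integer>> apps = new ArrayList<>();
        for (int a = 0; a < activeApps; a++) {
            TreeSet<Integer> priorities = new TreeSet<>();
            for (int p = 0; p < prioritiesPerApp; p++) {
                priorities.add(p);
            }
            apps.add(priorities);
        }

        long visited = 0;
        // Outer loop: applications in order. Inner loop: priorities in sorted order.
        for (TreeSet<Integer> priorities : apps) {
            for (Integer priority : priorities) {
                visited++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Assumed sizes for illustration only; 27000 is the container count from this post.
        for (int n : new int[] {10, 1000, 27000}) {
            System.out.println(n + " priorities: "
                    + countInnerIterations(1, n) + " inner iterations per pass");
        }
    }
}
```

If something like this is what is happening, the cost would be paid again on every scheduling pass, which might explain why CPU use started low and rose as the run went on. I have not verified that the priority set actually grows this way.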