[ https://issues.apache.org/jira/browse/STORM-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved STORM-2853.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.0.6
                   1.1.2
                   1.2.0
                   2.0.0

Merged the patch into master, 1.x, 1.1.x, 1.0.x branches.

[~vorin], I merged the patch into branches to make sure they're included to current ongoing RCs. Please reopen the issue if the patch doesn't resolve your issue. Thanks in advance!

> Deactivated topologies cause high cpu utilization
> -------------------------------------------------
>
>                 Key: STORM-2853
>                 URL: https://issues.apache.org/jira/browse/STORM-2853
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.1.0
>            Reporter: Stuart
>            Assignee: Jungtaek Lim
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0, 1.2.0, 1.1.2, 1.0.6
>
>         Attachments: exclamation.zip
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The issue is there is high cpu usage for deactivated apache storm topologies.
> I can reliably re-create the issue using the steps below but I haven't
> identified the exact cause or a solution yet.
> The environment is a storm cluster on which 1 topology is running (The
> topology is extremely simple, I used the exclamation example). It is
> INACTIVE. Initially there is normal CPU usage. However, when I kill all
> topology JVM processes on all supervisors and let Storm restart them again, I
> find that some time later (~9 hours) the CPU usage per JVM process rises to
> nearly 100%. I have tested an ACTIVE topology and this does not happen with
> it. I have also tested more than one topology and observe the same results
> when they're in the INACTIVE state.
>
> ***Steps to re-create:***
> 1. Run 1 topology on an Apache Storm cluster
> 2. Deactivate it
> 3. Kill **all** topology JVM processes on all supervisors (Storm will
>    restart them)
> 4. Observe the CPU usage on Supervisors rise to nearly 100% for all
>    **INACTIVE** topology JVM processes.
>
> ***Environment***
> Apache Storm 1.1.0 running on 3 VMs (1 nimbus and 2 supervisors).
> Cluster Summary:
> - Supervisors: 2
> - Used Slots: 2
> - Available Slots: 38
> - Total Slots: 40
> - Executors: 50
> - Tasks: 50
> the topology has 2 workers and 50 executors/tasks (threads).
>
> ***Investigation so far:***
> Apart from being able to reliably re-create the issue, I have identified, for
> the affected topology JVM process, the threads using the most CPU. There are
> 102 threads total in the process, 97 blocked, 5 IN_NATIVE. The threads using
> the most CPU are identical and there are 23 of them (all in BLOCKED state):
>
> Thread 28558: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.parkNanos(long) @bci=11, line=338 (Compiled frame)
>  - com.lmax.disruptor.MultiProducerSequencer.next(int) @bci=82, line=136 (Compiled frame)
>  - com.lmax.disruptor.RingBuffer.next(int) @bci=5, line=260 (Interpreted frame)
>  - org.apache.storm.utils.DisruptorQueue.publishDirect(java.util.ArrayList, boolean) @bci=18, line=517 (Interpreted frame)
>  - org.apache.storm.utils.DisruptorQueue.access$1000(org.apache.storm.utils.DisruptorQueue, java.util.ArrayList, boolean) @bci=3, line=61 (Interpreted frame)
>  - org.apache.storm.utils.DisruptorQueue$ThreadLocalBatcher.flush(boolean) @bci=50, line=280 (Interpreted frame)
>  - org.apache.storm.utils.DisruptorQueue$Flusher.run() @bci=55, line=303 (Interpreted frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
>  - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
>
> I identified this thread by using `jstack` to get a thread dump for the
> process:
>
>     jstack -F <pid> > jstack<pid>.txt
>
> and `top` to identify the threads within the process using the most CPU:
>
>     top -H -p <pid>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
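
A note on matching `top -H` output to a thread dump: in the `jstack -F` dump quoted above, the thread appears under a decimal ID (`Thread 28558`), which seems to line up directly with the thread IDs that `top -H` prints. A regular `jstack <pid>` dump (without `-F`) instead labels each thread with a hexadecimal `nid=0x...` field, so the decimal ID from `top` has to be converted first. The helper below is a minimal sketch of that conversion; the function name `tid_to_nid` is hypothetical and not part of Storm or the JDK.

```shell
#!/bin/sh
# Hypothetical helper (not part of Storm or the JDK): convert a decimal
# thread ID, as printed by `top -H -p <pid>`, into the hexadecimal form
# used by the `nid=0x...` field of a regular `jstack <pid>` dump.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Example using the thread ID from the dump quoted above.
tid_to_nid 28558    # prints 0x6f8e
```

Assuming a dump taken without `-F`, something like `jstack <pid> | grep -A 20 "nid=$(tid_to_nid 28558)"` should then show the stack of the thread that `top` reported as busy.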