For example, this stack trace:
"lifecycleSupervisor-1-2" prio=10 tid=0x00007f89141d8800 nid=0x5ac8 runnable [0x00007f89501ad000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.Integer.valueOf(Integer.java:642)
        at org.apache.flume.channel.file.EventQueueBackingStoreFile.get(EventQueueBackingStoreFile.java:310)
        at org.apache.flume.channel.file.FlumeEventQueue.get(FlumeEventQueue.java:225)
        at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:195)
        - locked <0x00000006890f68f0> (a org.apache.flume.channel.file.FlumeEventQueue)
        at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
        at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
        at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
        at org.apache.flume.channel.file.Log.replay(Log.java:430)
        at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
        - locked <0x00000006890ea360> (a org.apache.flume.channel.file.FileChannel)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        - locked <0x00000006890ea360> (a org.apache.flume.channel.file.FileChannel)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

On Tue, Sep 24, 2013 at 4:10 PM, Anat Rozenzon <a...@viber.com> wrote:

> After some deeper dive, it seems that the problem is with HashMap usage in
> EventQueueBackingStoreFile.
>
> Almost every time I run jstack, the JVM is inside
> EventQueueBackingStoreFile.get() doing either HashMap.containsKey() or
> Integer.valueOf().
> This is because overwriteMap is defined as a regular
> HashMap<Integer, Long>().
>
> Does your fix solve this issue?
>
> I think maybe using a Long[] would be better.
>
>
> On Tue, Sep 24, 2013 at 2:34 PM, Anat Rozenzon <a...@viber.com> wrote:
>
>> Thanks Hari, great news, I'll be glad to test it.
>>
>> However, I don't have an environment with trunk; is there any way I can
>> get it packaged somehow?
>>
>>
>> On Mon, Sep 23, 2013 at 8:50 PM, Hari Shreedharan <
>> hshreedha...@cloudera.com> wrote:
>>
>>> How many events does the File Channel get every 30 seconds, and how many
>>> get taken out? This is one of the edge cases of the File Channel I have
>>> been working on ironing out. There is a patch on
>>> https://issues.apache.org/jira/browse/FLUME-2155 (the
>>> FLUME-2155-initial.patch file). If you have data that takes an hour to
>>> start, and don't mind testing out this patch (it might be buggy and cause
>>> data loss, hangs, etc., so testing in prod is not recommended), apply it
>>> to trunk and see if it improves the startup time.
>>>
>>>
>>> Thanks,
>>> Hari
>>>
>>> On Monday, September 23, 2013 at 9:16 AM, Anat Rozenzon wrote:
>>>
>>> Hi,
>>>
>>> I have a Flume instance that is collecting logs from several Flume
>>> agents using an Avro source and a file channel.
>>> Recently, when I restart the collector, it takes about an hour to
>>> start listening on the Avro port.
>>> PSB a jstack entry; any idea why the startup is slow?
>>>
>>> Thanks
>>> Anat
>>>
>>> "lifecycleSupervisor-1-0" prio=10 tid=0x00007f01505e4800 nid=0x4c78
>>> runnable [0x00007f01441d6000]
>>>    java.lang.Thread.State: RUNNABLE
>>>         at org.apache.flume.channel.file.FlumeEventQueue.get(FlumeEventQueue.java:225)
>>>         at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:195)
>>>         - locked <0x0000000689149c30> (a org.apache.flume.channel.file.FlumeEventQueue)
>>>         at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
>>>         at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
>>>         at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
>>>         at org.apache.flume.channel.file.Log.replay(Log.java:430)
>>>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
>>>         - locked <0x0000000689145ca8> (a org.apache.flume.channel.file.FileChannel)
>>>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>         - locked <0x0000000689145ca8> (a org.apache.flume.channel.file.FileChannel)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:724)
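[Editor's note] A minimal sketch of the boxing issue discussed in this thread. The class and method names below are illustrative, not Flume's actual code; it only demonstrates the general point that a `HashMap<Integer, Long>` boxes its key on every `get()`/`containsKey()` call (and `Integer.valueOf` only caches values in -128..127, so most keys allocate a fresh object), while a primitive `long[]` indexed by position avoids boxed keys entirely, which is the spirit of the `Long[]` suggestion above:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class BoxingSketch {
    // HashMap<Integer, Long> path: every lookup autoboxes the int key via
    // Integer.valueOf, which caches only -128..127, so larger keys allocate
    // a new Integer object per call.
    static long sumViaMap(int n) {
        Map<Integer, Long> overwriteMap = new HashMap<>();
        for (int i = 0; i < n; i++) overwriteMap.put(i, (long) i);
        long sum = 0;
        for (int i = 0; i < n; i++) sum += overwriteMap.get(i); // boxes i each call
        return sum;
    }

    // Primitive long[] path: indexing by position needs no boxing at all;
    // a sentinel value (-1 here) can stand in for "no entry".
    static long sumViaArray(int n) {
        long[] overwriteArray = new long[n];
        Arrays.fill(overwriteArray, -1L);
        for (int i = 0; i < n; i++) overwriteArray[i] = i;
        long sum = 0;
        for (int i = 0; i < n; i++) sum += overwriteArray[i]; // no allocation
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(sumViaMap(n) == sumViaArray(n)); // prints "true"
    }
}
```

Both paths produce the same result; the difference under a profiler is that the map version spends time in `Integer.valueOf` and `HashMap` probing, which matches the frames seen in the jstack output above.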