Well, I could be wrong :) The whole process takes hardly 2 minutes, and from personal experience I prefer to gather data and work by a process of elimination.
Leave it to Mike to decide how he wants to proceed further.

thanks
ashish

On Thu, Oct 16, 2014 at 2:47 PM, Ahmed Vila <[email protected]> wrote:
> Hi Ashish,
>
> Sorry, but I disagree.
>
> I would agree with you if Mike had developed some custom implementation
> for Flume and needed to pinpoint it in a stack trace. But he didn't. He's
> using a fairly common setup, so in my opinion it's either a hardware
> failure, a kernel-level malfunction, a Flume component malfunction, or
> simply too many incoming events for the given setup. Looking up a stack
> trace would be overkill at this point.
>
> Regards,
> Ahmed
>
> On Thu, Oct 16, 2014 at 10:37 AM, Ashish <[email protected]> wrote:
>> I would start by finding which thread is consuming the most CPU. The
>> stack trace should give you a good hint on the direction to proceed.
>>
>> Blogged about the process here:
>> http://www.ashishpaliwal.com/blog/2011/08/finding-java-thread-consuming-high-cpu/
>>
>> Hope it helps
>> ashish
>>
>> On Wed, Oct 15, 2014 at 9:02 PM, Mike Zupan <[email protected]> wrote:
>>> I'm seeing issues with the Flume server using very high amounts of
>>> CPU. Just wondering if this is a common issue with a file channel. I'm
>>> pretty new to Flume, so sorry if this isn't enough to debug the issue.
>>>
>>> Current top looks like:
>>>
>>>   PID USER PR NI  VIRT  RES  SHR S   %CPU %MEM   TIME+ COMMAND
>>>  8509 root 20  0 22.0g 8.6g 675m S 1109.4 13.7 1682:45 java
>>>  8251 root 20  0 21.9g 8.3g 647m S 1083.5 13.2 1476:27 java
>>>  7593 root 20  0 12.4g 8.4g  18m S 1007.5 13.4 1866:18 java
>>>
>>> As you can see, we have 3 out of 4 Flume servers using ~1000% CPU.
>>>
>>> Details:
>>>
>>> OS: CentOS 6.5
>>> Java: Oracle "1.7.0_45"
>>> Flume: flume-1.4.0.2.1.1.0-385.el6.noarch
>>>
>>> Our config for the server looks like this:
>>>
>>> ###############################################
>>> # Agent configuration for transactional data
>>> ###############################################
>>> nontx_host07_agent01.sources = avro
>>> nontx_host07_agent01.channels = fc
>>> nontx_host07_agent01.sinks = hdfs_sink_01 hdfs_sink_02 hdfs_sink_03 hdfs_sink_04
>>>
>>> ##################################################
>>> # info is published to port 9991
>>> ##################################################
>>> nontx_host07_agent01.sources.avro.type = avro
>>> nontx_host07_agent01.sources.avro.bind = 0.0.0.0
>>> nontx_host07_agent01.sources.avro.port = 9991
>>> nontx_host07_agent01.sources.avro.threads = 100
>>> nontx_host07_agent01.sources.avro.compression-type = deflate
>>> nontx_host07_agent01.sources.avro.interceptors = ts id
>>> nontx_host07_agent01.sources.avro.interceptors.ts.type = timestamp
>>> nontx_host07_agent01.sources.avro.interceptors.ts.preserveExisting = false
>>> nontx_host07_agent01.sources.avro.interceptors.id.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
>>> nontx_host07_agent01.sources.avro.interceptors.id.preserveExisting = true
>>>
>>> ##################################################
>>> # The Channels
>>> ##################################################
>>> nontx_host07_agent01.channels.fc.type = file
>>> nontx_host07_agent01.channels.fc.checkpointDir = /flume/channels/checkpoint/nontx_host07_agent01
>>> nontx_host07_agent01.channels.fc.dataDirs = /flume/channels/data/nontx_host07_agent01
>>> nontx_host07_agent01.channels.fc.capacity = 140000000
>>> nontx_host07_agent01.channels.fc.transactionCapacity = 240000
>>>
>>> ##################################################
>>> # Sinks
>>> ##################################################
>>> nontx_host07_agent01.sinks.hdfs_sink_01.type = hdfs
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.path = hdfs://cluster01:8020/flume/%{log_type}
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.filePrefix = flume_nontx_host07_agent01_sink01_%Y%m%d%H
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUsePrefix = _
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUseSuffix = .tmp
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.fileType = CompressedStream
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.codeC = snappy
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollSize = 0
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollCount = 0
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollInterval = 300
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.idleTimeout = 30
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.timeZone = America/Los_Angeles
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.callTimeout = 30000
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.batchSize = 50000
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.round = true
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundUnit = minute
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundValue = 5
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.threadsPoolSize = 2
>>> nontx_host07_agent01.sinks.hdfs_sink_01.serializer = com.manage.flume.serialization.HeaderAndBodyJsonEventSerializer$Builder
>>>
>>> --
>>> Mike Zupan
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
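For reference, the thread-level diagnosis Ashish describes comes down to a few commands. This is only a sketch: PID 8509 is taken from Mike's top output above, the thread ID 8530 is a made-up example, and jstack must come from the same JDK that runs the agent and be run as the process owner (root here):

    # 1. List the threads of the hot JVM with per-thread CPU usage.
    top -H -p 8509

    # 2. Convert the busiest native thread ID to hex; jstack reports
    #    native thread IDs in hex in the "nid=" field.
    printf '0x%x\n' 8530        # -> 0x2152

    # 3. Take a thread dump and pull out that thread's stack.
    jstack 8509 > /tmp/flume-8509.jstack
    grep -A 20 'nid=0x2152' /tmp/flume-8509.jstack

To test Ahmed's "too many incoming events" hypothesis by elimination, Flume can also expose its counters over HTTP, assuming the agent is restarted with -Dflume.monitoring.type=http -Dflume.monitoring.port=34545 (the port is arbitrary):

    curl http://localhost:34545/metrics

A steadily climbing ChannelFillPercentage for CHANNEL.fc in the returned JSON would suggest the sinks cannot keep up with the source.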
