Not knowing much about Flume anymore, this smells to me like a case of only a
subset of CPU cores being utilized. top should show this.
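(In Linux top, pressing '1' toggles the per-core breakdown; mpstat -P ALL 1
shows the same thing.)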

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 13, 2013 9:22 PM, "Roshan Naik" <[email protected]> wrote:

> Some of the folks on this dev list may be aware that I am doing some Flume
> performance measurements.
>
> Here is some preliminary data:
>
>
> I initially started with an Avro source + FC + 4 HDFS sinks. Measurements
> indicated the agent was only able to reach around 20k events per second. I
> tried with event sizes of 1kB and 500 bytes.
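>
> (For reference, that topology is just the standard Flume wiring; a minimal
> sketch in properties form, where the agent/component names, bind address,
> port and HDFS path are illustrative rather than my exact config:
>
> agent1.sources = avro-src
> agent1.channels = fc
> agent1.sinks = hdfs1 hdfs2 hdfs3 hdfs4
>
> agent1.sources.avro-src.type = avro
> agent1.sources.avro-src.bind = 0.0.0.0
> agent1.sources.avro-src.port = 4141
> agent1.sources.avro-src.channels = fc
>
> agent1.channels.fc.type = file
>
> agent1.sinks.hdfs1.type = hdfs
> agent1.sinks.hdfs1.hdfs.path = hdfs://namenode/flume/events
> agent1.sinks.hdfs1.channel = fc
> # ...and likewise for hdfs2, hdfs3 and hdfs4 )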
>
>
> I replaced the HDFS sinks with null sinks just to narrow down the source of
> the bottleneck. For the same reason I replaced the source with an exec
> source that basically cats the same 1GB input file in a loop, over and over.
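>
> (A sketch of how such an exec source can be set up; the file path and
> component names are illustrative, and the shell property is what allows the
> while loop:
>
> nontx_agent01.sources.exec-src.type = exec
> nontx_agent01.sources.exec-src.shell = /bin/bash -c
> nontx_agent01.sources.exec-src.command = while true; do cat /data/input-1GB.log; done
> nontx_agent01.sources.exec-src.channels = fc
>
> Each of the four null sinks is simply a sink with type = null.)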
>
> *SYSTEM STATS:*
> There is a single disk on the machine, but its utilization is very low, as
> can be seen from the *iostat* output below:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            2.37    0.00    0.44    0.04    0.00   97.16
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> sda              95.98       655.31      6603.58 1348373762 13587517606
>
>
> Top output also shows that CPU & memory are not the bottleneck:
>
> top - 17:21:57 up 23 days, 19:34,  2 users,  load average: 3.44, 3.17, 2.72
> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
> Cpu(s):  *5.9%us,*  3.3%sy,  0.0%ni, 90.7%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  65937984k total, 22648200k used, 43289784k free,   198448k buffers
> Swap:  1048568k total,    14268k used,  1034300k free, 19619416k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  6255 root      20   0 12.3g 1.4g 125m S *219.4*  2.2  19:57.64 java
>
>
>
> *FLUME MEASUREMENTS*
>
> Since there was spare CPU, memory & disk available, I ran a 2nd agent and
> noticed that it was able to independently deliver approx. 20k events/sec.
> With a third agent the same performance was observed, so the system does not
> seem to be the bottleneck.
>
> The channel size remains small and steady, so the ingestion rate is the
> bottleneck, not the drain rate.
>
> Varying the batch size on the exec source between 20, 100, 500 & 1000
> yielded the following numbers for ingestion rate with an event size of 1024
> bytes:
>
> FC + exec (batch size   20) + 4 null sinks = 18.0k eps
> FC + exec (batch size  100) + 4 null sinks = 24.2k eps
> FC + exec (batch size  500) + 4 null sinks = 24.0k eps
> FC + exec (batch size 1000) + 4 null sinks = 23.2k eps
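>
> (The batch size above is just the exec source's batchSize property, e.g.
> nontx_agent01.sources.exec-src.batchSize = 1000, where "exec-src" is an
> illustrative component name.)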
>
> Just for the heck of it, I replaced the FC with a MemCh (memory channel):
>
> MemCh + exec (batch size 1000) + 4 null sinks = 123.4k eps
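>
> (The swap is just a channel type change; roughly the following, with the
> capacity values below being illustrative rather than what I actually used:
>
> nontx_agent01.channels.mc.type = memory
> nontx_agent01.channels.mc.capacity = 1000000
> nontx_agent01.channels.mc.transactionCapacity = 1000
>
> plus repointing the source and sinks at the new channel.)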
>
>
> A few runs with an event size of 500 bytes also gave me numbers in the same
> ballpark.
>
> Here is my FC config:
>
> nontx_agent01.channels.fc.checkpointDir = /flume/checkpoint/agent1
> nontx_agent01.channels.fc.dataDirs = /flume/data/agent1
> nontx_agent01.channels.fc.capacity = 140000000
> nontx_agent01.channels.fc.transactionCapacity = 240000
>
>
> In this setup, these numbers appear to indicate that the event rate
> (events/sec) through the FC is the primary bottleneck, rather than event
> size, batch size, or CPU/disk capacity.
>
>
> -Roshan
>
