Some of the folks on this dev list may be aware that I am doing some Flume
performance measurements.
Here is some preliminary data:
I initially started with an Avro source + file channel (FC) + 4 HDFS sinks.
Measurements indicated the agent was only able to reach around 20k events
per second. I tried event sizes of 1kB and 500 bytes.
To narrow down the source of the bottleneck, I replaced the HDFS sinks with
null sinks. For the same reason, I replaced the source with an exec source
that simply cats the same 1GB input file in a loop, many times over.
*SYSTEM STATS:*
There is a single disk on the machine, but its utilization is very low, as
can be seen from the *iostat* output below:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.37    0.00    0.44    0.04    0.00   97.16

Device:    tps   Blk_read/s   Blk_wrtn/s     Blk_read     Blk_wrtn
sda      95.98       655.31      6603.58   1348373762  13587517606
The *top* output also shows that CPU and memory are not the bottleneck:
top - 17:21:57 up 23 days, 19:34, 2 users, load average: 3.44, 3.17, 2.72
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): *5.9%us,* 3.3%sy, 0.0%ni, 90.7%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65937984k total, 22648200k used, 43289784k free, 198448k buffers
Swap: 1048568k total, 14268k used, 1034300k free, 19619416k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6255 root 20 0 12.3g 1.4g 125m S *219.4* 2.2 19:57.64 java
*FLUME MEASUREMENTS*
Since there was spare CPU, memory and disk capacity available, I ran a 2nd
agent and noticed that it was able to independently deliver approx. 20k
events/sec. A third agent showed the same performance.
So the system does not seem to be the bottleneck.
The channel size remains small and steady, so the ingestion rate is the
bottleneck, not the drain rate.
Varying the batch size on the exec source between 20, 100, 500 & 1000
yielded the following ingestion rates with an event size of 1024 bytes:
FC + exec (batch size 20) + 4 null sinks = 18k events/sec (eps)
FC + exec (batch size 100) + 4 null sinks = 24.2k eps
FC + exec (batch size 500) + 4 null sinks = 24k eps
FC + exec (batch size 1000) + 4 null sinks = 23.2k eps
Just for the heck of it, I replaced FC with the memory channel (MemCh):
MemCh + exec (batch size 1000) + 4 null sinks = 123.4k eps
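To put the runs above side by side, here is a small Python sketch that
computes each run's throughput relative to the batch-size-20 FC run. The
rates are the ones reported above; the dictionary layout and the helper name
relative_throughput are only for illustration and are not part of Flume.

```python
# Ingestion rates reported above, in events per second (1024-byte events).
RATES_EPS = {
    "FC, batch 20": 18_000,
    "FC, batch 100": 24_200,
    "FC, batch 500": 24_000,
    "FC, batch 1000": 23_200,
    "MemCh, batch 1000": 123_400,
}


def relative_throughput(rates, baseline):
    """Return each run's rate divided by the baseline run's rate."""
    base = rates[baseline]
    return {name: rate / base for name, rate in rates.items()}


if __name__ == "__main__":
    ratios = relative_throughput(RATES_EPS, "FC, batch 20")
    for name, ratio in ratios.items():
        print(f"{name:<18} {RATES_EPS[name]:>8,} eps  {ratio:5.2f}x")
```

Going from batch size 20 to 100 gives roughly a third more throughput,
larger batches add nothing further, and MemCh at batch size 1000 is about
5.3x the FC rate at the same batch size.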
A few runs with an event size of 500 bytes also gave me numbers in the same
ballpark.
Here is my FC config:
nontx_agent01.channels.fc.checkpointDir = /flume/checkpoint/agent1
nontx_agent01.channels.fc.dataDirs = /flume/data/agent1
nontx_agent01.channels.fc.capacity = 140000000
nontx_agent01.channels.fc.transactionCapacity = 240000
In this setup, these numbers suggest that the primary limit in FC is the
number of events per second, rather than event size, batch size, or
CPU/disk capacity.
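One way to see why event size does not look like the limiting factor is to
convert the event rate into a byte rate. The sketch below assumes roughly
24k events/sec for both event sizes (the 500-byte runs were only described
as being in the same ballpark, so that figure is an assumption), and it
counts payload bytes only, ignoring event headers and any FC on-disk
overhead. The function name payload_mb_per_sec is made up for this example.

```python
def payload_mb_per_sec(events_per_sec, event_bytes):
    """Payload throughput in MB/s (1 MB = 10**6 bytes), ignoring overhead."""
    return events_per_sec * event_bytes / 1_000_000


if __name__ == "__main__":
    ASSUMED_EPS = 24_000  # assumption: same ballpark rate for both sizes
    for size in (1024, 500):
        rate = payload_mb_per_sec(ASSUMED_EPS, size)
        print(f"{size:>5} B events: {rate:.1f} MB/s")
```

Under that assumption the payload rate drops from about 24.6 MB/s to 12 MB/s
when the event size is halved while the event rate stays put, which is
consistent with a per-event cost dominating rather than a per-byte cost.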
-Roshan