Hi,

We have been trying to validate and benchmark Flume's performance for 
production use.

We have configured Flume with an HTTP source, a File channel, and a Kafka sink.
Hardware: 8 cores, 32 GB RAM, CentOS 6.5, 500 GB HDD.
Flume configuration:
svcagent.sources = http-source
svcagent.sinks = kafka-sink1
svcagent.channels = file-channel1

# HTTP source to receive events on port 5005
svcagent.sources.http-source.type = http
svcagent.sources.http-source.channels = file-channel1
svcagent.sources.http-source.port = 5005
svcagent.sources.http-source.bind = 10.15.1.31

svcagent.sources.http-source.selector.type = multiplexing
svcagent.sources.http-source.selector.header = archival
svcagent.sources.http-source.selector.mapping.true = file-channel1
svcagent.sources.http-source.selector.default = file-channel1
#svcagent.sources.http-source.handler = org.eiq.flume.JSONHandler.HTTPSourceJSONHandler

svcagent.sinks.kafka-sink1.topic = flume-sink1
svcagent.sinks.kafka-sink1.brokerList = 10.15.1.32:9092
svcagent.sinks.kafka-sink1.channel = file-channel1
svcagent.sinks.kafka-sink1.batchSize = 5000

svcagent.channels.file-channel1.type = file
svcagent.channels.file-channel1.checkpointDir=/etc/flume-kafka/checkpoint
svcagent.channels.file-channel1.dataDirs=/etc/flume-kafka/data
svcagent.channels.file-channel1.transactionCapacity=10000
svcagent.channels.file-channel1.capacity=50000
svcagent.channels.file-channel1.checkpointInterval=120000
svcagent.channels.file-channel1.checkpointOnClose=true
svcagent.channels.file-channel1.maxFileSize=536870912
svcagent.channels.file-channel1.use-fast-replay=false

When we streamed HTTP data from multiple clients (around 40 HTTP 
clients), throughput peaked at about 600 requests/sec and would not go beyond 
that. We have also increased Flume's Xmx setting to 4096.

We also tried a Null Sink instead of the Kafka sink, but did not see much 
performance improvement. So we assume the bottleneck is the HTTP source and 
the File channel.
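
A Null Sink swap of this kind would look roughly like the fragment below. It is 
only a sketch: the sink name null-sink1 is an assumed placeholder, and the exact 
lines used in the test are not shown in this message. The source and channel 
definitions stay as in the configuration above.

```properties
# Sketch only: replace the Kafka sink with Flume's built-in Null Sink,
# which discards every event it takes from the channel.
# "null-sink1" is a hypothetical name chosen for illustration.
svcagent.sinks = null-sink1
svcagent.sinks.null-sink1.type = null
svcagent.sinks.null-sink1.channel = file-channel1
```

With the sink discarding events, any remaining throughput limit has to come 
from the source or the channel, which is the reasoning behind the assumption 
above.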

Could you please suggest any tuning to improve the performance of this 
setup?

--regards
Hemanth
