Did you try with a different source? Is your sender multithreaded? Sending from a single thread would obviously be slow. How many messages per batch? The bigger your batches, the better your throughput will be.
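The batching advice above might look like the following sketch. It assumes the source bind/port from the config posted below (10.15.1.31:5005) and Flume's default JSONHandler, which accepts a JSON array of `{"headers": ..., "body": ...}` events per POST; batch size and thread count are illustrative.

```python
# Sketch of a multithreaded, batched sender for the Flume HTTP source.
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

FLUME_URL = "http://10.15.1.31:5005"  # assumption: bind/port from the config below

def build_batch(lines, archival="true"):
    """Wrap raw log lines in the event envelope the default JSONHandler expects."""
    events = [{"headers": {"archival": archival}, "body": line} for line in lines]
    return json.dumps(events).encode("utf-8")

def send_batch(payload):
    """POST one batch of events; returns the HTTP status code."""
    req = urllib.request.Request(
        FLUME_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status

def send_all(lines, batch_size=500, threads=8):
    """Split lines into batches and send them concurrently."""
    batches = [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(send_batch, map(build_batch, batches)))
```

This turns one POST per event into one POST per 500 events across 8 threads; tuning batch_size and threads per client is usually where most of the throughput comes from.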
On Saturday, November 14, 2015, Hemanth Abbina <[email protected]> wrote:

> Thanks Gonzalo.
>
> Yes, it's a single server. First we would like to confirm the maximum
> throughput of a single server with this configuration. The size of each
> message is around 512 bytes.
>
> I have tried with the in-memory channel and null sink too. Performance
> increased by 50 requests/sec or so, not beyond that.
>
> In some forums, I have seen Flume benchmarks of 30K/40K per single node
> (I'm not sure about the configurations). So I am trying to check the
> maximum throughput of one server.
>
> From: Gonzalo Herreros [mailto:[email protected]]
> Sent: Saturday, November 14, 2015 2:02 PM
> To: user <[email protected]>
> Subject: Re: Flume benchmarking with HTTP source & File channel
>
> If that is just a single server, 600 messages per sec doesn't sound bad
> to me. Depending on the size of each message, the network could be the
> limiting factor.
>
> I would try with the null sink and in-memory channel. If that doesn't
> improve things, I would say you need more nodes to go beyond that.
>
> Regards,
> Gonzalo
>
> On Nov 14, 2015 7:40 AM, "Hemanth Abbina" <[email protected]> wrote:
>
> Hi,
>
> We have been trying to validate and benchmark Flume performance for our
> production use.
>
> We have configured Flume with an HTTP source, file channel and Kafka sink.
>
> Hardware: 8 cores, 32 GB RAM, CentOS 6.5, 500 GB HDD.
> Flume configuration:
>
> svcagent.sources = http-source
> svcagent.sinks = kafka-sink1
> svcagent.channels = file-channel1
>
> # HTTP source to receive events on port 5005
> svcagent.sources.http-source.type = http
> svcagent.sources.http-source.channels = file-channel1
> svcagent.sources.http-source.port = 5005
> svcagent.sources.http-source.bind = 10.15.1.31
>
> svcagent.sources.http-source.selector.type = multiplexing
> svcagent.sources.http-source.selector.header = archival
> svcagent.sources.http-source.selector.mapping.true = file-channel1
> svcagent.sources.http-source.selector.default = file-channel1
> #svcagent.sources.http-source.handler = org.eiq.flume.JSONHandler.HTTPSourceJSONHandler
>
> svcagent.sinks.kafka-sink1.topic = flume-sink1
> svcagent.sinks.kafka-sink1.brokerList = 10.15.1.32:9092
> svcagent.sinks.kafka-sink1.channel = file-channel1
> svcagent.sinks.kafka-sink1.batchSize = 5000
>
> svcagent.channels.file-channel1.type = file
> svcagent.channels.file-channel1.checkpointDir = /etc/flume-kafka/checkpoint
> svcagent.channels.file-channel1.dataDirs = /etc/flume-kafka/data
> svcagent.channels.file-channel1.transactionCapacity = 10000
> svcagent.channels.file-channel1.capacity = 50000
> svcagent.channels.file-channel1.checkpointInterval = 120000
> svcagent.channels.file-channel1.checkpointOnClose = true
> svcagent.channels.file-channel1.maxFileSize = 536870912
> svcagent.channels.file-channel1.use-fast-replay = false
>
> When we tried to stream HTTP data from multiple clients (around 40 HTTP
> clients), we could get a maximum of 600 requests/sec, and not beyond that.
> We increased Flume's Xmx setting to 4096 MB.
>
> We have also tried with a null sink (instead of the Kafka sink) and did
> not see much performance improvement.
> So we assume the bottleneck is the HTTP source and/or the file channel.
>
> Could you please suggest any tuning to improve the performance of this
> setup?
>
> --regards
> Hemanth

--
Thanks,
Hari
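For reference, Gonzalo's suggestion to isolate the source (null sink plus memory channel) might look like the following sketch. It reuses the agent, source, bind, and port names from the config above and drops the multiplexing selector for simplicity; the capacity values are illustrative assumptions, not recommendations.

```properties
# Sketch: isolate the HTTP source with a memory channel and a null sink.
svcagent.sources = http-source
svcagent.sinks = null-sink1
svcagent.channels = mem-channel1

svcagent.sources.http-source.type = http
svcagent.sources.http-source.port = 5005
svcagent.sources.http-source.bind = 10.15.1.31
svcagent.sources.http-source.channels = mem-channel1

# Memory channel: no disk writes, so the source's own ceiling shows up.
svcagent.channels.mem-channel1.type = memory
svcagent.channels.mem-channel1.capacity = 100000
svcagent.channels.mem-channel1.transactionCapacity = 10000

# Null sink discards events as fast as it can drain the channel.
svcagent.sinks.null-sink1.type = null
svcagent.sinks.null-sink1.channel = mem-channel1
```

If throughput barely moves under this setup (as reported above), the limit is on the HTTP source or client side rather than in the channel or sink.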
