Hello, I'm using Flume with Kafka and I don't understand some performance results that I'm getting.
I have a topic on a 3-node Kafka cluster, with 6 partitions and a replication factor of 2. I'm ingesting messages of 1100 bytes each with a spooling directory (spoolDir) source.

With Source-MemoryChannel-KafkaSink I get about 50K messages/second (54 MB/s) into Kafka. With Source-KafkaChannel I only get about 1K messages/second (1.2 MB/s) into Kafka.

I expected better performance from the KafkaChannel, but the KafkaSink setup is about 50 times faster.

The first configuration is:

agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = kafkaSink

#Source configuration
...
agent.sources.seqGenSrc.channels = memoryChannel

agent.sinks.kafkaSink.channel = memoryChannel
agent.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafkaSink.batchSize = 10000
agent.sinks.kafkaSink.brokerList = ose10kafkaelk:9092,ose11kafkaelk:9092,ose12kafkaelk:9092
agent.sinks.kafkaSink.topic = kafka-topic
agent.sinks.kafkaSink.requiredAcks = -1
agent.sinks.kafkaSink.channel = memoryChannel

agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100000
agent.channels.memoryChannel.transactionCapacity = 10000

The second is:

agent.sources = seqGenSrc
agent.channels = kafkaChannel

# Describe/configure the source
###Configuration spoolDir source...
...

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = kafkaChannel

agent.channels.kafkaChannel.type = org.apache.flume.channel.kafka.KafkaChannel
agent.channels.kafkaChannel.brokerList=ose10kafkaelk:9092,ose11kafkaelk:9092,ose12kafkaelk:9092
agent.channels.kafkaChannel.topic=kafka-topic3
agent.channels.kafkaChannel.zookeeperConnect=ose10kafkaelk:2181
