In your config, the partitioner class name is missing a T ("Paritioner"
instead of "Partitioner"). You should be getting an exception, and the
sink may be falling back to partitioning by key:
relay_agent.sinks.activity_kafka_sink.kafka_partitioner.class =
org.apache.kafka.clients.producer.internals.RandomParitioner

Gonzalo

On 7 June 2016 at 00:32, Jason J. W. Williams <[email protected]> wrote:
> Hi,
>
> New to Flume and I'm trying to relay log messages received over a netcat
> source to a Kafka sink.
>
> Everything seems to be fine, except that Flume is acting like it IS
> assigning a partition key to the produced messages though none is assigned.
> I'd like the messages to be assigned to a random partition, so that
> consumers are load balanced.
>
> * Flume 1.6.0
> * Kafka 0.9.0.1
>
> Flume config:
> https://gist.github.com/williamsjj/8ae025906955fbc4b5f990e162b75d7c
>
> Kafka topic config: kafka-topics --zookeeper localhost/kafka --create
> --topic activity.history --partitions 20 --replication-factor 1
>
> Python consumer program:
> https://gist.github.com/williamsjj/9e67287f0154816c3a733a39ad008437
>
> Test program (publishes to Flume):
> https://gist.github.com/williamsjj/1eb097a187a3edb17ec1a3913e47e58b
>
> The Flume agent listens on 3132/tcp for connections, and publishes messages
> received to the Kafka activity.history topic. I'm running two instances of
> the Python consumer.
>
> What happens however, is all log messages get sent to a single Kafka
> consumer... if I restart Flume (leave consumers running) and re-run the
> test, all messages get published to the other consumer. So it feels like
> Flume is assigning a permanent partition key even though one is not defined
> (and should therefore be random).
>
> Any advice is greatly appreciated.
>
> -J
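
For what it's worth, even with the missing T restored I don't believe the
stock Kafka 0.9 Java client ships a RandomPartitioner class (its built-in
DefaultPartitioner already round-robins records that carry no key), so if
you want strictly random assignment you may need to roll your own. Below
is a minimal sketch against the 0.9 Java producer's
org.apache.kafka.clients.producer.Partitioner interface; the
RandomPartitioner name and com.example package are made up for the
example, and note that Flume 1.6's Kafka sink may still be built on the
older Scala producer, whose kafka.producer.Partitioner interface is
different, so adapt accordingly:

package com.example;

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

// Hypothetical partitioner that ignores the message key and spreads
// records uniformly at random across the topic's partitions.
public class RandomPartitioner implements Partitioner {

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed
    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // pick a partition uniformly at random, regardless of key
        int numPartitions = cluster.partitionsForTopic(topic).size();
        return ThreadLocalRandom.current().nextInt(numPartitions);
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}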

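If you go that route, the jar would need to be on the Flume agent's
classpath, and the same sink property from your config would point at the
custom class, e.g.:

relay_agent.sinks.activity_kafka_sink.kafka_partitioner.class = com.example.RandomPartitioner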