As Joey hinted, if the "key" header is not set on the event, we produce messages with null keys, and the Kafka producer sends all null-keyed messages to a single partition until its next metadata refresh. Reducing the metadata refresh interval (topic.metadata.refresh.interval.ms) is a good workaround. Another trick I use is the UUID interceptor, which gives each message a random, unique key.
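To make the difference concrete, here is a toy Python model of partition selection. It is only a sketch under stated assumptions, not Kafka's actual code: the names ToyProducer, partition_for, refresh_metadata and distribution are made up for illustration, and CRC32 stands in for whatever hash the real partitioner applies to the key.

```python
import random
import uuid
import zlib
from collections import Counter

NUM_PARTITIONS = 3  # matches the three partitions in the question


class ToyProducer:
    """Toy model of partition selection; NOT Kafka's real implementation."""

    def __init__(self, num_partitions, seed=0):
        self.num_partitions = num_partitions
        self._rng = random.Random(seed)
        self._sticky = None  # partition reused for null keys until a refresh

    def refresh_metadata(self):
        # Stand-in for a metadata refresh: forget the cached partition.
        self._sticky = None

    def partition_for(self, key):
        if key is None:
            # Null key: pick a random partition once, then keep using it.
            if self._sticky is None:
                self._sticky = self._rng.randrange(self.num_partitions)
            return self._sticky
        # Non-null key: hash the key, modulo the partition count.
        # (CRC32 is an assumption made for this sketch.)
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


def distribution(keys, num_partitions=NUM_PARTITIONS):
    """Count how many messages land on each partition."""
    producer = ToyProducer(num_partitions)
    return Counter(producer.partition_for(k) for k in keys)


null_keyed = distribution([None] * 3000)
uuid_keyed = distribution([str(uuid.uuid4()) for _ in range(3000)])
print("null keys:", dict(null_keyed))
print("uuid keys:", dict(uuid_keyed))
```

In this model every null-keyed message lands on one partition until refresh_metadata() is called, while UUID keys spread across all three. In Flume, the interceptor settings that attach such a key to each event are: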
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
a1.sources.r1.interceptors.i1.headerName = key

Gwen

On Mon, Oct 27, 2014 at 8:28 AM, Joey Echeverria <[email protected]> wrote:
> Did you set the 'key' header in the FlumeEvent?
>
> --
> Joey Echeverria
>
>
> On Mon, Oct 27, 2014 at 1:36 AM, Mongeol Heo <[email protected]> wrote:
>>
>> Hello, Gwen.
>>
>> I have tried flafka, and used it as a sink for producing logs to kafka cluster.
>> It produced 1M logs in few seconds which is pretty fast.
>> However, I have a problem here.
>> It produced all logs into one partition, even if there are three partitions.
>> What I want is that 1M logs are well distributed through the three
>> partitions, so I can have the benefit of using three consumers to improve
>> the consume speed.
>> Is there any way to get what I want?
>> I am going to try 'topic.metadata.refresh.interval.ms' option, but I
>> believe it will not work as I expect.
>>
>> Thanks.
>>
>> On Sat, Oct 25, 2014 at 2:50 AM, Gwen Shapira <[email protected]> wrote:
>>>
>>> Its not in the user guide because the user guide refers to releases
>>> and Flafka is only on the unreleased trunk right now.
>>>
>>> On Fri, Oct 24, 2014 at 1:08 AM, Mongeol Heo <[email protected]> wrote:
>>> > I wonder why flume 1.5.0.1 user guide, which shows below, does not include
>>> > it although the link you give is for 1.5.0, and I think this is the one
>>> > which is mentioned by Gwen Shapira above.
>>> > Thank you.
>>> >
>>> > http://flume.apache.org/FlumeUserGuide.html
>>> >
>>> > On Fri, Oct 24, 2014 at 4:55 PM, Hari Shreedharan
>>> > <[email protected]> wrote:
>>> >>
>>> >> http://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#kafka-source
>>> >>
>>> >> This will come in the next Apache release.
>>> >>
>>> >>
>>> >> On Fri, Oct 24, 2014 at 12:28 AM, Mongeol Heo <[email protected]>
>>> >> wrote:
>>> >>>
>>> >>> Hi, Hari.
>>> >>> Could you point out one of the kafka source which supports version 0.8+?
>>> >>> since all I found are for 0.7+.
>>> >>> Thanks,
>>> >>>
>>> >>> Mungeol
>>> >>>
>>> >>> On Fri, Oct 24, 2014 at 4:15 PM, Hari Shreedharan
>>> >>> <[email protected]> wrote:
>>> >>>>
>>> >>>> There is a Kafka Source too, but both are not yet released in a Flume
>>> >>>> release, though one will hopefully be.
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Hari
>>> >>>>
>>> >>>>
>>> >>>> On Thu, Oct 23, 2014 at 11:53 PM, Mangtani, Kushal
>>> >>>> <[email protected]> wrote:
>>> >>>>>
>>> >>>>> Yes, Flume has support for Kafka 0.8.1. Refer the below link for Flume
>>> >>>>> Kafka sink,source. I have been using this in my prod and it is
>>> >>>>> relatively stable
>>> >>>>> https://github.com/thilinamb/flume-ng-kafka-sink
>>> >>>>>
>>> >>>>> -Kushal Mangtani
>>> >>>>> ________________________________
>>> >>>>> From: Mongeol Heo [[email protected]]
>>> >>>>> Sent: Thursday, October 23, 2014 11:38 PM
>>> >>>>> To: [email protected]
>>> >>>>> Subject: Kafka version
>>> >>>>>
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> My question is that will flume sink and source for apache kafka support
>>> >>>>> latest version of it?
>>> >>>>> As I know, even sink/source for kafka 0.7.2 is not released yet.
>>> >>>>> It will be really great if it does.
>>> >>>>> Thanks.
>>> >>>>>
>>> >>>>> Best regards,
>>> >>>>>
>>> >>>>> - Mungeol
