>From the code you pasted, that is old producer. The new producer class is org.apache.kafka.clients.producer.KafkaProducer.
The new producer does not have sticky partition behavior. The default partitioner use round-robin like way to send non-keyed messages to partitions. Jiangjie (Becket) Qin On 6/3/15, 11:35 PM, "Sebastien Falquier" <[email protected]> wrote: >I am using this code (from "org.apache.kafka" % "kafka_2.10" % "0.8.2.0"), >no idea if it is the old producer or the new one.... > >import kafka.producer.Produced >import kafka.producer.ProducerConfig >val prodConfig : ProducerConfig = new ProducerConfig(properties) >val producer : Producer[Integer,String] = new >Producer[Integer,String](prodConfig) > >How can I know which producer I am using? And what is the behavior of the >new producer? > >Thanks, >Sébastien > > >2015-06-03 20:04 GMT+02:00 Jiangjie Qin <[email protected]>: > >> >> Are you using new producer or old producer? >> The old producer has 10 min sticky partition behavior while the new >> producer does not. >> >> Thanks, >> >> Jiangjie (Becket) Qin >> >> On 6/2/15, 11:58 PM, "Sebastien Falquier" <[email protected]> >> wrote: >> >> >Hi Jason, >> > >> >The default partitioner does not make the job since my producers >>haven't a >> >smooth traffic. What I mean is that they can deliver lots of messages >> >during 10 minutes and less during the next 10 minutes, that is too say >>the >> >first partition will have stacked most of the messages of the last 20 >> >minutes. >> > >> >By the way, I don't understand your point about breaking batch into 2 >> >separate partitions. With that code, I jump to a new partition on >>message >> >201, 401, 601, ... with batch size = 200, where is my mistake? >> > >> >Thanks for your help, >> >Sébastien >> > >> >2015-06-02 16:55 GMT+02:00 Jason Rosenberg <[email protected]>: >> > >> >> Hi Sebastien, >> >> >> >> You might just try using the default partitioner (which is random). >>It >> >> works by choosing a random partition each time it re-polls the >>meta-data >> >> for the topic. By default, this happens every 10 minutes for each >>topic >> >> you produce to (so it evenly distributes load at a granularity of 10 >> >> minutes). This is based on 'topic.metadata.refresh.interval.ms'. >> >> >> >> I suspect your code is causing double requests for each batch, if >>your >> >> partitioning is actually breaking up your batches into 2 separate >> >> partitions. Could be an off by 1 error, with your modulo >>calculation? >> >> Perhaps you need to use '% 0' instead of '% 1' there? >> >> >> >> Jason >> >> >> >> >> >> >> >> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier < >> >> [email protected]> wrote: >> >> >> >> > Hi guys, >> >> > >> >> > I am new to Kafka and I am facing a problem I am not able to sort >>out. >> >> > >> >> > To smooth traffic over all my brokers' partitions, I have coded a >> >>custom >> >> > Paritioner for my producers, using a simple round robin algorithm >>that >> >> > jumps from a partition to another on every batch of messages >> >> (corresponding >> >> > to batch.num.messages value). It looks like that : >> >> > https://gist.github.com/sfalquier/4c0c7f36dd96d642b416 >> >> > >> >> > With that fix, every partitions are used equally, but the amount of >> >> > requests from the producers to the brokers have been multiplied by >>2. >> >>I >> >> do >> >> > not understand since all producers are async with >> >>batch.num.messages=200 >> >> > and the amount of messages processed is still the same as before. >>Why >> >>do >> >> > producers need more requests to do the job? As internal traffic is >>a >> >>bit >> >> > critical on our platform, I would really like to reduce producers' >> >> requests >> >> > volume if possible. >> >> > >> >> > Any idea? Any suggestion? >> >> > >> >> > Regards, >> >> > Sébastien >> >> > >> >> >> >>
