Sebastien, I think you may have an off by 1 error (e.g. batch should be 0-199, not 1-200). Thus you are sending 2 batches each time (one for 0, another for 1-199).
Jason On Thu, Jun 4, 2015 at 1:33 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote: > From the code you pasted, that is old producer. > The new producer class is org.apache.kafka.clients.producer.KafkaProducer. > > The new producer does not have sticky partition behavior. The default > partitioner use round-robin like way to send non-keyed messages to > partitions. > > Jiangjie (Becket) Qin > > On 6/3/15, 11:35 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv> > wrote: > > >I am using this code (from "org.apache.kafka" % "kafka_2.10" % "0.8.2.0"), > >no idea if it is the old producer or the new one.... > > > >import kafka.producer.Produced > >import kafka.producer.ProducerConfig > >val prodConfig : ProducerConfig = new ProducerConfig(properties) > >val producer : Producer[Integer,String] = new > >Producer[Integer,String](prodConfig) > > > >How can I know which producer I am using? And what is the behavior of the > >new producer? > > > >Thanks, > >Sébastien > > > > > >2015-06-03 20:04 GMT+02:00 Jiangjie Qin <j...@linkedin.com.invalid>: > > > >> > >> Are you using new producer or old producer? > >> The old producer has 10 min sticky partition behavior while the new > >> producer does not. > >> > >> Thanks, > >> > >> Jiangjie (Becket) Qin > >> > >> On 6/2/15, 11:58 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv> > >> wrote: > >> > >> >Hi Jason, > >> > > >> >The default partitioner does not make the job since my producers > >>haven't a > >> >smooth traffic. What I mean is that they can deliver lots of messages > >> >during 10 minutes and less during the next 10 minutes, that is too say > >>the > >> >first partition will have stacked most of the messages of the last 20 > >> >minutes. > >> > > >> >By the way, I don't understand your point about breaking batch into 2 > >> >separate partitions. With that code, I jump to a new partition on > >>message > >> >201, 401, 601, ... with batch size = 200, where is my mistake? > >> > > >> >Thanks for your help, > >> >Sébastien > >> > > >> >2015-06-02 16:55 GMT+02:00 Jason Rosenberg <j...@squareup.com>: > >> > > >> >> Hi Sebastien, > >> >> > >> >> You might just try using the default partitioner (which is random). > >>It > >> >> works by choosing a random partition each time it re-polls the > >>meta-data > >> >> for the topic. By default, this happens every 10 minutes for each > >>topic > >> >> you produce to (so it evenly distributes load at a granularity of 10 > >> >> minutes). This is based on 'topic.metadata.refresh.interval.ms'. > >> >> > >> >> I suspect your code is causing double requests for each batch, if > >>your > >> >> partitioning is actually breaking up your batches into 2 separate > >> >> partitions. Could be an off by 1 error, with your modulo > >>calculation? > >> >> Perhaps you need to use '% 0' instead of '% 1' there? > >> >> > >> >> Jason > >> >> > >> >> > >> >> > >> >> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier < > >> >> sebastien.falqu...@teads.tv> wrote: > >> >> > >> >> > Hi guys, > >> >> > > >> >> > I am new to Kafka and I am facing a problem I am not able to sort > >>out. > >> >> > > >> >> > To smooth traffic over all my brokers' partitions, I have coded a > >> >>custom > >> >> > Paritioner for my producers, using a simple round robin algorithm > >>that > >> >> > jumps from a partition to another on every batch of messages > >> >> (corresponding > >> >> > to batch.num.messages value). It looks like that : > >> >> > https://gist.github.com/sfalquier/4c0c7f36dd96d642b416 > >> >> > > >> >> > With that fix, every partitions are used equally, but the amount of > >> >> > requests from the producers to the brokers have been multiplied by > >>2. > >> >>I > >> >> do > >> >> > not understand since all producers are async with > >> >>batch.num.messages=200 > >> >> > and the amount of messages processed is still the same as before. > >>Why > >> >>do > >> >> > producers need more requests to do the job? As internal traffic is > >>a > >> >>bit > >> >> > critical on our platform, I would really like to reduce producers' > >> >> requests > >> >> > volume if possible. > >> >> > > >> >> > Any idea? Any suggestion? > >> >> > > >> >> > Regards, > >> >> > Sébastien > >> >> > > >> >> > >> > >> > >