Sebastien, I think you may have an off-by-one error (e.g. each batch should
cover messages 0-199, not 1-200).  Thus you are sending 2 batches each time
(one for message 0, another for messages 1-199).
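
To illustrate, here is a minimal sketch in plain Scala with no Kafka
dependency. It is not the code from the gist: CountingPartitioner,
requestsPerBatch and the switchAt parameter are hypothetical names, and it
assumes a zero-based message counter and one request per distinct partition
in a batch (i.e. each partition has a different leader broker).

```scala
object BatchBoundaryDemo {
  // Hypothetical stand-in for a counter-based round-robin partitioner.
  // It moves to the next partition whenever counter % batchSize == switchAt.
  final class CountingPartitioner(batchSize: Int, numPartitions: Int, switchAt: Int) {
    private var counter = 0 // zero-based index of the next message
    private var current = 0 // partition currently in use

    def partition(): Int = {
      if (counter > 0 && counter % batchSize == switchAt) {
        current = (current + 1) % numPartitions
      }
      counter += 1
      current
    }
  }

  // For each consecutive batch of batchSize messages, count how many distinct
  // partitions it touches (assumed here to equal the number of requests).
  def requestsPerBatch(batchSize: Int, numPartitions: Int, switchAt: Int, batches: Int): List[Int] = {
    val partitioner = new CountingPartitioner(batchSize, numPartitions, switchAt)
    val partitions = List.fill(batchSize * batches)(partitioner.partition())
    partitions.grouped(batchSize).map(_.distinct.size).toList
  }
}
```

With batchSize = 200, switching when the remainder is 1 makes every batch
touch two partitions, while switching when the remainder is 0 keeps each
batch on a single partition.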

Jason

On Thu, Jun 4, 2015 at 1:33 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

> From the code you pasted, that is the old producer.
> The new producer class is org.apache.kafka.clients.producer.KafkaProducer.
>
> The new producer does not have sticky partition behavior. The default
> partitioner uses a round-robin-like approach to send non-keyed messages to
> partitions.
>
> Jiangjie (Becket) Qin
>
> On 6/3/15, 11:35 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv>
> wrote:
>
> >I am using this code (from "org.apache.kafka" % "kafka_2.10" % "0.8.2.0"),
> >no idea if it is the old producer or the new one....
> >
> >import kafka.producer.Producer
> >import kafka.producer.ProducerConfig
> >val prodConfig : ProducerConfig = new ProducerConfig(properties)
> >val producer : Producer[Integer,String] = new
> >Producer[Integer,String](prodConfig)
> >
> >How can I know which producer I am using? And what is the behavior of the
> >new producer?
> >
> >Thanks,
> >Sébastien
> >
> >
> >2015-06-03 20:04 GMT+02:00 Jiangjie Qin <j...@linkedin.com.invalid>:
> >
> >>
> >> Are you using the new producer or the old producer?
> >> The old producer has 10 min sticky partition behavior while the new
> >> producer does not.
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On 6/2/15, 11:58 PM, "Sebastien Falquier" <sebastien.falqu...@teads.tv>
> >> wrote:
> >>
> >> >Hi Jason,
> >> >
> >> >The default partitioner does not do the job since my producers do not
> >> >have smooth traffic. What I mean is that they can deliver lots of
> >> >messages during 10 minutes and fewer during the next 10 minutes, that is
> >> >to say the first partition will have accumulated most of the messages of
> >> >the last 20 minutes.
> >> >
> >> >By the way, I don't understand your point about breaking a batch into 2
> >> >separate partitions. With that code, I jump to a new partition on
> >> >message 201, 401, 601, ... with batch size = 200. Where is my mistake?
> >> >
> >> >Thanks for your help,
> >> >Sébastien
> >> >
> >> >2015-06-02 16:55 GMT+02:00 Jason Rosenberg <j...@squareup.com>:
> >> >
> >> >> Hi Sebastien,
> >> >>
> >> >> You might just try using the default partitioner (which is random).
> >> >> It works by choosing a random partition each time it re-polls the
> >> >> metadata for the topic.  By default, this happens every 10 minutes for
> >> >> each topic you produce to (so it evenly distributes load at a
> >> >> granularity of 10 minutes).  This is based on
> >> >> 'topic.metadata.refresh.interval.ms'.
> >> >>
> >> >> I suspect your code is causing double requests for each batch, if your
> >> >> partitioning is actually breaking up your batches into 2 separate
> >> >> partitions.  Could it be an off-by-one error in your modulo
> >> >> calculation?  Perhaps you need to compare the remainder to 0 instead
> >> >> of 1 there?
> >> >>
> >> >> Jason
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Jun 2, 2015 at 3:35 AM, Sebastien Falquier <
> >> >> sebastien.falqu...@teads.tv> wrote:
> >> >>
> >> >> > Hi guys,
> >> >> >
> >> >> > I am new to Kafka and I am facing a problem I am not able to sort
> >> >> > out.
> >> >> >
> >> >> > To smooth traffic over all my brokers' partitions, I have coded a
> >> >> > custom Partitioner for my producers, using a simple round-robin
> >> >> > algorithm that jumps from one partition to another on every batch of
> >> >> > messages (corresponding to the batch.num.messages value). It looks
> >> >> > like this:
> >> >> > https://gist.github.com/sfalquier/4c0c7f36dd96d642b416
> >> >> >
> >> >> > With that fix, all partitions are used equally, but the number of
> >> >> > requests from the producers to the brokers has doubled. I do not
> >> >> > understand why, since all producers are async with
> >> >> > batch.num.messages=200 and the number of messages processed is still
> >> >> > the same as before. Why do producers need more requests to do the
> >> >> > job? As internal traffic is a bit critical on our platform, I would
> >> >> > really like to reduce the producers' request volume if possible.
> >> >> >
> >> >> > Any idea? Any suggestion?
> >> >> >
> >> >> > Regards,
> >> >> > Sébastien
> >> >> >
> >> >>
> >>
> >>
>
>
