Re: Kafka Scaling Ideas

Haruki Okada Mon, 21 Dec 2020 00:10:27 -0800

About load test:
I think it'd be better to monitor per-message process latency and estimate
required partition count based on it because it determines the max
throughput per single partition.
- Say you have to process 12 million messages/hour = 3333 messages/sec .
- If you have 7 partitions (thus 7 parallel consumers at maximum), single
consumer should process 3333 / 7 = 476 messages/sec
- It means, process latency per single message should be lower than 2.1
milliseconds (1000 / 476)
  => If you have 14 partitions, it becomes 4.2 milliseconds


So required partition count can be calculated by per-message process
latency. (I think Spring-Kafka can be easily integrated with prometheus so
you can use it to measure that)

About increasing instance count:
- It depends on current system resource usage.
  * If the system resource is not so busy (likely because the consumer just
almost waits DB-write to return), you don't need to increase consumer
instances
  * But I think you should make sure that single consumer instance isn't
assigned multiple partitions to fully parallelize consumption across
partitions. (If I remember correctly, ConcurrentMessageListenerContainer
has a property to configure the concurrency)

2020年12月21日(月) 15:51 Yana K <yanak1...@gmail.com>:

> So as the next step I see to increase the partition of the 2nd topic - do I
> increase the instances of the consumer from that or keep it at 7?
> Anything else (besides researching those libs)?
>
> Are there any good tools for load testing kafka?
>
> On Sun, Dec 20, 2020 at 7:23 PM Haruki Okada <ocadar...@gmail.com> wrote:
>
> > It depends on how you manually commit offsets.
> > Auto-commit does commits offsets in async manner basically, so as long as
> > you do manual-commit in the same way,  there should be no much
> difference.
> >
> > And, generally offset-commit mode doesn't make much difference in
> > performance regardless manual/auto or async/sync unless offset-commit
> > latency takes significant amount in processing time (e.g. you commit
> > offsets synchronously in every poll() loop).
> >
> > 2020年12月21日(月) 11:08 Yana K <yanak1...@gmail.com>:
> >
> > > Thank you so much Marina and Haruka.
> > >
> > > Marina's response:
> > > - When you say " if you are sure there is no room for perf optimization
> > of
> > > the processing itself :" - do you mean code level optimizations? Can
> you
> > > please explain?
> > > - On the second topic you say " I'd say at least 40" - is this based on
> > 12
> > > million records / hour?
> > > -  "if you can change the incoming topic" - I don't think it is
> possible
> > :(
> > > -  "you could artificially achieve the same by adding one more step
> > > (service) in your pipeline" - this is the next thing - but I want to be
> > > sure this will help, given we've to maintain one more layer
> > >
> > > Haruka's response:
> > > - "One possible solution is creating an intermediate topic" - I already
> > did
> > > it
> > > - I'll look at Decaton - thx
> > >
> > > Is there any thoughts on the auto commit vs manual commit - if it can
> > > better the performance while consuming?
> > >
> > > Yana
> > >
> > >
> > >
> > > On Sat, Dec 19, 2020 at 7:01 PM Haruki Okada <ocadar...@gmail.com>
> > wrote:
> > >
> > > > Hi.
> > > >
> > > > Yeah, Spring-Kafka does processing messages sequentially, so the
> > consumer
> > > > throughput would be capped by database latency per single process.
> > > > One possible solution is creating an intermediate topic (or altering
> > > source
> > > > topic) with much more partitions as Marina suggested.
> > > >
> > > > I'd like to suggest another solution, that is multi-threaded
> processing
> > > per
> > > > single partition.
> > > > Decaton (https://github.com/line/decaton) is a library to achieve
> it.
> > > >
> > > > Also confluent has published a blog post about parallel-consumer (
> > > >
> > > >
> > >
> >
> https://www.confluent.io/blog/introducing-confluent-parallel-message-processing-client/
> > > > )
> > > > for that purpose, but it seems it's still in the BETA stage.
> > > >
> > > > 2020年12月20日(日) 11:41 Marina Popova <ppine7...@protonmail.com
> .invalid>:
> > > >
> > > > > The way I see it - you can only do a few things - if you are sure
> > there
> > > > is
> > > > > no room for perf optimization of the processing itself :
> > > > > 1. speed up your processing per consumer thread: which you already
> > > tried
> > > > > by splitting your logic into a 2-step pipeline instead of 1-step,
> and
> > > > > delegating the work of writing to a DB to the second step ( make
> sure
> > > > your
> > > > > second intermediate Kafka topic is created with much more
> partitions
> > to
> > > > be
> > > > > able to parallelize your work much higher - I'd say at least 40)
> > > > > 2. if you can change the incoming topic - I would create it with
> many
> > > > more
> > > > > partitions as well - say at least 40 or so - to parallelize your
> > first
> > > > step
> > > > > service processing more
> > > > > 3. and if you can't increase partitions for the original topic ) -
> > you
> > > > > could artificially achieve the same by adding one more step
> (service)
> > > in
> > > > > your pipeline that would just read data from the original
> 7-partition
> > > > > topic1 and just push it unchanged into a new topic2 with , say 40
> > > > > partitions - and then have your other services pick up from this
> > topic2
> > > > >
> > > > >
> > > > > good luck,
> > > > > Marina
> > > > >
> > > > > Sent with ProtonMail Secure Email.
> > > > >
> > > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > > > On Saturday, December 19, 2020 6:46 PM, Yana K <
> yanak1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > I am new to the Kafka world and running into this scale problem.
> I
> > > > > thought
> > > > > > of reaching out to the community if someone can help.
> > > > > > So the problem is I am trying to consume from a Kafka topic that
> > can
> > > > > have a
> > > > > > peak of 12 million messages/hour. That topic is not under my
> > control
> > > -
> > > > it
> > > > > > has 7 partitions and sending json payload.
> > > > > > I have written a consumer (I've used Java and Spring-Kafka lib)
> > that
> > > > will
> > > > > > read that data, filter it and then load it into a database. I ran
> > > into
> > > > a
> > > > > > huge consumer lag that would take 10-12hours to catch up. I have
> 7
> > > > > > instances of my application running to match the 7 partitions
> and I
> > > am
> > > > > > using auto commit. Then I thought of splitting the write logic
> to a
> > > > > > separate layer. So now my architecture has a component that reads
> > and
> > > > > > filters and produces the data to an internal topic (I've done 7
> > > > > partitions
> > > > > > but as you see it's under my control). Then a consumer picks up
> > data
> > > > from
> > > > > > that topic and writes it to the database. It's better but still
> it
> > > > takes
> > > > > > 3-5hours for the consumer lag to catch up.
> > > > > > Am I missing something fundamentally? Are there any other ideas
> for
> > > > > > optimization that can help overcome this scale challenge. Any
> > pointer
> > > > and
> > > > > > article will help too.
> > > > > >
> > > > > > Appreciate your help with this.
> > > > > >
> > > > > > Thanks
> > > > > > Yana
> > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > ========================
> > > > Okada Haruki
> > > > ocadar...@gmail.com
> > > > ========================
> > > >
> > >
> >
> >
> > --
> > ========================
> > Okada Haruki
> > ocadar...@gmail.com
> > ========================
> >
>


-- 
========================
Okada Haruki
ocadar...@gmail.com
========================

Re: Kafka Scaling Ideas

Reply via email to