Dear Rodric,

We've just started benchmarking the scheduler with a variety of
scenarios.
I will follow up with more results.

If you are asking because of concerns about Kafka's scalability, I
consider Kafka a very scalable solution.

LinkedIn has announced that they run a huge Kafka cluster:
https://events.static.linuxfound.org/sites/events/files/slides/Kafka%20At%20Scale.pdf

According to that deck, they run 1,100+ Kafka brokers, over 31,000
topics, and 350,000+ partitions.

As far as I understand, only a limited number of brokers participate
in message delivery for a given topic at any time.
Since partitions are the unit of parallelism, if a topic has 5 partitions,
at most 5 brokers can serve its messages at the same time.
(Strictly speaking, consumers fetch the messages rather than brokers
delivering them, but the point stands.)
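
As a minimal illustration (using the standard Kafka Java admin client; the
broker address, topic name, and replication factor below are just placeholders),
describing a 5-partition topic shows that each partition has exactly one leader
broker, so at most 5 distinct brokers serve that topic's traffic at a time:

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;
    import org.apache.kafka.clients.admin.TopicDescription;
    import org.apache.kafka.common.TopicPartitionInfo;

    import java.util.Collections;
    import java.util.Properties;

    public class TopicLeaderSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder address

            try (AdminClient admin = AdminClient.create(props)) {
                // Create a topic with 5 partitions (replication factor 1 for simplicity).
                admin.createTopics(Collections.singletonList(
                        new NewTopic("example-topic", 5, (short) 1))).all().get();

                // Each partition has exactly one leader broker, so at most 5 brokers
                // handle this topic's traffic at any given time.
                TopicDescription desc = admin.describeTopics(
                                Collections.singletonList("example-topic"))
                        .values().get("example-topic").get();
                for (TopicPartitionInfo p : desc.partitions()) {
                    System.out.printf("partition %d -> leader broker %s%n",
                            p.partition(), p.leader());
                }
            }
        }
    }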

Since not every broker has to participate in processing every message, we
can scale Kafka out simply by adding more brokers.

For example, suppose we have 5 topics with 5 partitions each, spread over
5 brokers. Each broker then holds 1 partition of each topic, i.e. 5
partitions in total. As more topics are created, each broker holds more
and more partitions.
Say the maximum number of partitions one broker can handle in time is 10;
once there are 10 topics, message processing starts to slow down.
At that point we just add brokers. One additional broker can handle 10
more partitions, which corresponds to 2 more topics in this example.
So for every 2 new topics we can add one broker and keep the number of
partitions per broker the same.
As long as the partition count per broker stays at the same level, we can
sustain Kafka's performance, because each broker only ever participates in
processing a limited number of partitions.
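
To make the arithmetic concrete, here is a tiny sketch; the per-broker limit
of 10 and the 5 partitions per topic are just the assumed numbers from the
example above, not real Kafka constants:

    // Hypothetical capacity arithmetic for the example above; not a Kafka API.
    public class BrokerCapacitySketch {
        public static void main(String[] args) {
            int partitionsPerTopic = 5;       // each topic has 5 partitions
            int maxPartitionsPerBroker = 10;  // assumed limit a broker handles in time

            for (int topics = 10; topics <= 20; topics += 2) {
                int totalPartitions = topics * partitionsPerTopic;
                // Minimum brokers so that no broker exceeds the limit (ceiling division).
                int brokersNeeded =
                        (totalPartitions + maxPartitionsPerBroker - 1) / maxPartitionsPerBroker;
                System.out.printf("topics=%d -> partitions=%d -> brokers needed=%d%n",
                        topics, totalPartitions, brokersNeeded);
            }
        }
    }

Every 2 additional topics add 10 partitions and therefore require one more
broker, which keeps each broker at 10 partitions.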

In this scheme, metadata sharing for partitions might be considered a
bottleneck.
But once partitions are assigned to brokers, the assignment does not change
over time (unless we explicitly rebalance them).
Consumers can therefore just read the partition locations once and fetch
data directly from the target brokers.
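
For reference, this is roughly how it looks from the consumer side with the
standard Kafka Java client (the topic name and bootstrap address are
placeholders): the consumer looks up the partition leaders once and then
fetches only from the brokers leading the partitions it is assigned.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.TopicPartition;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class PartitionMetadataSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder address
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Metadata lookup: which broker leads each partition of the topic.
                for (PartitionInfo p : consumer.partitionsFor("example-topic")) {
                    System.out.printf("partition %d is led by broker %s%n",
                            p.partition(), p.leader());
                }

                // Manually assign a single partition; fetches then go only to that
                // partition's leader, not to every broker in the cluster.
                consumer.assign(Collections.singletonList(
                        new TopicPartition("example-topic", 0)));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.println(r.value());
                }
            }
        }
    }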

Since partitions are evenly distributed among brokers and each consumer
only accesses a few of them at a time, we can achieve massive scalability
with Kafka.

Since this is based on my understanding of Kafka rather than hands-on
operational experience, I will also share any issues I observe during
benchmarking.



Thanks
Regards
Dominic






On Thu, Nov 29, 2018 at 9:54 AM Rodric Rabbah <rod...@gmail.com> wrote:

> Thanks for sharing Dominic. I'll need to read this slowly to understand it.
> It does look like a bit of the proposal you've discussed in the past with 1
> topic per action name (per namespace). Have you seen this scale in your
> production to thousands of topics?
>
> -r
>
> On Wed, Nov 28, 2018 at 5:59 AM Dominic Kim <style9...@gmail.com> wrote:
>
> > Dear whiskers.
> >
> > I`ve just submitted the proposal for the next version of Autonomous
> > Container Scheduler.
> >
> >
> > https://cwiki.apache.org/confluence/display/OPENWHISK/Autonomous+Container+Scheduler+v2
> >
> > We (Naver) have already implemented the scheduler based on that proposal,
> > and it is almost finished.
> > It's not an all-round scheduler, but we observed many performance
> > benefits in a few cases.
> >
> > We (Naver) are willing to contribute it and are preparing for the
> > contribution.
> > I hope the proposal helps you understand the implementation.
> >
> > To contribute the scheduler, the SPIs for ContainerPool and
> > ContainerProxy need to be contributed first.
> > I will start working on this.
> > And I will describe the details about each component in each subpage.
> > Currently, they are blank.
> >
> > Please note again that the scheduler is not for all cases,
> > but it should be useful in many cases until a new implementation based on
> > the future architecture of OW lands.
> >
> >
> > https://cwiki.apache.org/confluence/display/OPENWHISK/OpenWhisk+future+architecture?src=contextnavpagetreemode
> >
> > Any comments and feedback would be greatly appreciated.
> >
> > Thanks
> > Regards
> > Dominic
> >
>
