1. How many ZK nodes are in your ensemble?
2. Do you have metrics on how many requests ZK is handling?
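
If you don't have those metrics yet, one possible way to get them is to poll ZooKeeper's `mntr` four-letter command on each ensemble member and watch `zk_outstanding_requests` and `zk_avg_latency` over time. Below is a minimal sketch, not a definitive tool. The function names `parse_mntr` and `fetch_mntr` and the default client port 2181 are assumptions for illustration. On ZooKeeper 3.5 and later, `mntr` may also need to be enabled through `4lw.commands.whitelist`.

```python
import socket
import sys


def parse_mntr(text):
    """Parse the tab-separated key/value lines that `mntr` returns into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            # Skip anything that is not a "key<TAB>value" line.
            continue
        stats[key.strip()] = value.strip()
    return stats


def fetch_mntr(host, port=2181, timeout=5.0):
    """Send `mntr` to one ZooKeeper server and return its parsed stats.

    host and port are placeholders; use your own ensemble members.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                # The server closes the connection after replying.
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", errors="replace"))


if __name__ == "__main__":
    # Usage: python zk_mntr.py zk1.example.com zk2.example.com ...
    for host in sys.argv[1:]:
        stats = fetch_mntr(host)
        print(
            host,
            stats.get("zk_server_state"),
            stats.get("zk_outstanding_requests"),
            stats.get("zk_avg_latency"),
        )
```

Running this on a schedule against every node while topics are being created should show whether outstanding requests climb on the leader only or across the whole ensemble.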

On Wed, Jan 3, 2018 at 1:48 PM, Andrey Falko <afa...@salesforce.com> wrote:

> Hi everyone,
>
> We are seeing more and more push from our Kafka users to support well
> more than 10k replicated partitions. We'd ideally like to avoid running
> multiple clusters to keep our cluster management and monitoring simple. We
> started testing Kafka to see how many replicated partitions it could handle.
>
> We found that, to maintain SLAs of under 50ms for produce latency,
> Kafka starts going downhill at around 9k topics with 5 brokers. Each topic
> is replicated 3x in our test. The bottleneck appears to be ZooKeeper:
> after a certain period of time, the number of outstanding requests in ZK
> spikes up at a linear rate. Slowing down the rate at which we create and
> produce to topics improves things, but doing that makes the system tougher
> to manage and use. We are happy to publish our detailed results with
> reproduction steps if anyone is interested.
>
> Has anyone overcome this problem and scaled beyond 9k replicated
> partitions? Does anyone have ZooKeeper tuning suggestions? Is it even the
> bottleneck?
>
> According to this, we should have at most 300 3x-replicated partitions per
> broker:
> https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
> Is anyone doing work to have Kafka support more than that?
>
> Best regards,
> Andrey Falko
> Salesforce.com
>



-- 
Ben Wood
Software Engineer - Data Agility
Mesosphere
