Hi Brian,

We have 5 brokers and ~80 topics, and the total number of partitions is around 7k, not counting replicas (so it's close to the limit that Netflix recommends). Most topics have a replication factor of 2. CPU usage is only around 25%, and each topic has around 3-4 consumers on average. Disk space is our current bottleneck: some topics produce relatively large messages, so we have had to lower retention for some of them to only 1 hour. When adding our 5th broker, we had trouble migrating the __consumer_offsets topic because of https://issues.apache.org/jira/browse/KAFKA-4362, so __consumer_offsets has to stay on the first 4 brokers even as we keep adding brokers.
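For what it's worth, the per-broker load those numbers imply is easy to sketch. This is back-of-envelope arithmetic using the figures above (7k partitions, RF 2, 5 brokers, all taken from this thread); recommended per-broker limits vary by source, so treat the comparison as rough:

```shell
# Back-of-envelope partition-replica load per broker,
# using the figures quoted in this thread (assumptions, not measurements).
partitions=7000        # leader partitions across the cluster
replication_factor=2   # most of our topics use RF 2
brokers=5

total_replicas=$((partitions * replication_factor))
per_broker=$((total_replicas / brokers))

echo "total partition replicas: ${total_replicas}"   # 14000
echo "replicas per broker:      ${per_broker}"       # 2800
```

The 1-hour retention we applied is a per-topic override rather than a cluster-wide default; with the stock tooling it would look something like `kafka-configs.sh --zookeeper <zk> --alter --entity-type topics --entity-name <topic> --add-config retention.ms=3600000` (3,600,000 ms = 1 hour), though the exact flags depend on the Kafka version.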
We want to add a new cluster for a specific group of topics that serve large messages and need much longer retention; this would also reduce operational complexity. I'd welcome any suggestions on scaling the current cluster, but I'm also curious to learn how people do topic discovery.

On Tue, Dec 6, 2016 at 12:37 PM, Brian Krahmer <bkrah...@krahmer.com> wrote:

> You didn't mention anything about your current configuration, just that
> you are 'out of resources'. Perhaps you misunderstand how to size your
> partitions per topic, and how partition allocation works. If your brokers
> are maxed on cpu, and you double the number of brokers but keep the replica
> count the same, I would expect cpu usage to nearly get cut in half. How
> many brokers do you have, how many topics do you have and how many
> partitions per topic do you have? What is your resource utilization for
> bandwidth, CPU, and memory? How many average consumers do you have for
> each topic?
>
> brian
>
> On 06.12.2016 21:23, Yifan Ying wrote:
>
>> Hi Aseem, the concern is to create too many partitions in total in one
>> cluster no matter how many brokers I have in this cluster. I think the two
>> articles that I mentioned explain why too many partitions in one cluster
>> could cause issues.

--
Yifan