Re: Partitioning and scale

2013-05-24 Thread Neha Narkhede
Timothy, Kafka is not designed to support millions of topics. Zookeeper will become a bottleneck, even if you deploy more brokers to get around the number-of-files issue. In normal cases, it might work just fine with a right-sized cluster. However, when there are failures, the time to recovery could

Re: Partitioning and scale

2013-05-23 Thread Timothy Chen
Hi Neha, Not sure if this sounds crazy, but if we'd like the events for the same session id to go to the same partition, one way could be for each session key to create its own topic with a single partition, so there could be millions of topics, each with a single partition. I wonder what would

Re: Partitioning and scale

2013-05-23 Thread Milind Parikh
The number of files the OS has to manage, I suppose. Why wouldn't you use consistent hashing with deliberately engineered collisions to generate a limited number of topics / partitions and filter at the consumer level? Regards Milind On May 23, 2013 4:22 PM, Timothy Chen tnac...@gmail.com wrote: Hi
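
A minimal sketch of the approach Milind suggests, assuming a fixed partition count and a plain hash-modulo mapping (simpler than a true consistent-hash ring). The names `NUM_PARTITIONS`, `partition_for`, and `filter_session` are hypothetical and are not part of any Kafka API; the dictionary stands in for broker partitions.

```python
import zlib

NUM_PARTITIONS = 16  # assumed fixed partition count, chosen for illustration


def partition_for(session_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a session id to one of a limited number of partitions.

    Many session ids deliberately collide on the same partition.
    crc32 is used because it is stable across processes and runs.
    """
    return zlib.crc32(session_id.encode("utf-8")) % num_partitions


def filter_session(messages, wanted_session_id):
    """Consumer-side filter: keep only the payloads for one session id."""
    return [payload for sid, payload in messages if sid == wanted_session_id]


# Simulate producing events into in-memory "partitions".
events = [("session-a", "login"), ("session-b", "login"), ("session-a", "click")]
partitions = {}
for sid, payload in events:
    partitions.setdefault(partition_for(sid), []).append((sid, payload))

# A consumer reads the partition that holds session-a and filters out the rest.
session_a_events = filter_session(partitions[partition_for("session-a")], "session-a")
```

All events for one session id land in the same partition and stay in order, while the number of partitions remains small and independent of the number of sessions.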

Partitioning and scale

2013-05-22 Thread Timothy Chen
Hi, I'm currently trying to understand how Kafka (0.8) can scale with our usage pattern and how to set up the partitioning. We want to route all messages belonging to the same id to the same queue, so its consumer will be able to consume all the messages of that id. My questions: - From my

Re: Partitioning and scale

2013-05-22 Thread Chris Curtin
Hi Tim, On Wed, May 22, 2013 at 3:25 PM, Timothy Chen tnac...@gmail.com wrote: Hi, I'm currently trying to understand how Kafka (0.8) can scale with our usage pattern and how to set up the partitioning. We want to route all messages belonging to the same id to the same queue, so its

Re: Partitioning and scale

2013-05-22 Thread Neha Narkhede
- I see that Kafka server.properties allows one to specify the number of partitions it supports. However, when we want to scale I wonder if we add # of partitions or # of brokers, will the same partitioner start distributing the messages to different partitions? And if it does, how can that same
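
To illustrate the concern in the question above: with a hash-modulo partitioner, changing the partition count changes where many keys land. This is a sketch under that assumption only; `partition_for` is a hypothetical helper, and whether a given Kafka partitioner behaves this way depends on its implementation.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hypothetical hash-modulo partitioner (not Kafka's actual code)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"id-{i}" for i in range(1000)]

# Keys whose partition changes when the partition count grows from 8 to 12.
moved = [k for k in keys if partition_for(k, 8) != partition_for(k, 12)]

# With an unchanged partition count the mapping is stable.
stable = all(partition_for(k, 8) == partition_for(k, 8) for k in keys)
```

Under this assumption, adding brokers without changing the partition count leaves the key-to-partition mapping untouched, whereas adding partitions remaps a share of the keys.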

Re: Partitioning and scale

2013-05-22 Thread Timothy Chen
Hi Neha/Chris, Thanks for the reply. So if I set a fixed number of partitions and just add brokers to the broker pool, does it rebalance the load to the new brokers (along with the data)? Tim On Wed, May 22, 2013 at 1:15 PM, Neha Narkhede neha.narkh...@gmail.com wrote: - I see that Kafka

Re: Partitioning and scale

2013-05-22 Thread Neha Narkhede
Not automatically as of today. You have to run the reassign-partitions tool and explicitly move selected partitions to the new brokers. If you use this tool, you can move partitions to the new broker without any downtime. Thanks, Neha On Wed, May 22, 2013 at 2:20 PM, Timothy Chen