Hi Jun,

Reply to 1: I have configured a broker with only one partition and have two consumers. Each consumer has only one thread for consuming messages. When I start the consumers, only the first one gets a partition and processes messages. For the second consumer I get the following warning:

"consumer.ZookeeperConsumerConnector: No broker partions consumed by consumer thread group1_xxx-xxx-0 for topic test"

After the first consumer dies, the remaining consumers are rebalanced and the second consumer gets the partition. This is exactly what we want: a fail-over mechanism.
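To illustrate the behaviour described above, here is a minimal Python sketch of a range-style assignment of partitions to consumer threads. It is only a simplified model under stated assumptions, not Kafka's actual rebalancing code: the function name `assign`, the partition label "0-0", and the thread ids are hypothetical and chosen for illustration.

```python
def assign(partitions, consumer_threads):
    """Split sorted partitions into contiguous ranges over sorted threads.

    Simplified model (an assumption, not Kafka's implementation): every
    thread gets len(partitions) // len(threads) partitions, and the first
    len(partitions) % len(threads) threads get one extra.
    """
    partitions = sorted(partitions)
    threads = sorted(consumer_threads)
    if not threads:
        return {}
    per_thread, extra = divmod(len(partitions), len(threads))
    assignment = {}
    for i, thread in enumerate(threads):
        start = per_thread * i + min(i, extra)
        count = per_thread + (1 if i < extra else 0)
        # A thread with count == 0 owns nothing; this is the situation
        # in which the warning quoted above would appear.
        assignment[thread] = partitions[start:start + count]
    return assignment


partitions = ["0-0"]  # hypothetical label: broker 0, partition 0

# Both consumers alive: only the first thread owns the single partition.
both = assign(partitions, ["group1_a-0", "group1_b-0"])

# First consumer gone: a rebalance hands the partition to the survivor.
after = assign(partitions, ["group1_b-0"])

print(both)
print(after)
```

With one partition and two single-threaded consumers in the same group, the second thread ends up with an empty list until the first consumer disappears, which matches the fail-over behaviour observed here.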
Reply to 2: So we need ZK servers on the broker servers and on the server where the producer runs, and they run as a cluster, right?

Makes sense to me, thanks.

Ismail.

2011/8/17 Jun Rao <[email protected]>

> Ismail,
>
> Most of what you described are reasonable. A few comments:
> 1. If you use 2 consumers in the same group, each of them will only get
> about half of the data from the brokers. So, in 1), if you want to process
> all data, the second consumer has to process messages too.
>
> 2. Typically, you can overlay ZK server on Kafka brokers. However, you need
> at least 3 ZK servers.
>
> 3. The minimal number of partitions (in total) is the number of consumer
> threads (in total).
>
> Jun
>
> On Wed, Aug 17, 2011 at 4:52 AM, Ismail Dev <[email protected]> wrote:
>
> > Hi all,
> >
> > we are working a project which should collect traces, journal and audit
> > entries produced by an application running on a tomcat server in a central
> > data-store.
> > We are expecting about one million entries per day, about 150 MB data.
> >
> > 1.)
> > The trace entries must be collected in the same order like produced from
> > the application and we need a failover mechanism.
> > The aimed configuration trace collection would be:
> > - exact one producer on tomcat server creating trace entries
> > - the producer sends the messages always to the same partition/broker
> > - 2 brokers on different physical servers
> > - 2 consumers running on both broker server
> > - the consumers belong to the same group (e.g. 'trace')
> > - just one consumer is processing the messages and the second one is for
> >   failover
> >
> > How should be the zookeeper configuration ?
> > - one zookeeper server for each brokers running on the server where tomcat
> >   server runs
> > - or 2 clustered zookeeper server each running on the brokers physical
> >   server
> >
> > Is it a good idea to run the consumers on the same physical server as the
> > brokers ?
> >
> > Makes this configuration sense ?
> >
> > 2.)
> > For the journal and audit the order of the entries are not important. So
> > the aimed configuration for these would be:
> > - n producers running on the tomcat server
> > - the producers send the messages randomly to available brokers
> > - at least 2 brokers with m partitions on different physical server
> > - at least 2 consumers running on both broker server with m threads
> > - the consumers belong to different groups (e.g. 'journal' and 'audit')
> >
> > My question here is how to figure out the number of partitions. Are there
> > any measure values or hints ?
> >
> > Many thanks,
> > Ismail.
