Ismail,

#1 seems fine if there is only 1 partition.
#2, the ZK server can be overlaid on any server since its load is typically low.

Jun

On Wed, Aug 17, 2011 at 9:15 AM, Ismail Dev <[email protected]> wrote:

> Hi Jun,
>
> Reply to 1.
> I have configured a broker with only one partition and have two consumers.
> Each consumer has only one thread for consuming messages.
> If I start both consumers, only the first one gets the partition and
> processes messages.
> For the second consumer I get the following warning:
> "consumer.ZookeeperConsumerConnector: No broker partions consumed by
> consumer thread group1_xxx-xxx-0 for topic test"
> After the first consumer dies, the remaining consumers are rebalanced and
> the second consumer gets the partition.
> Actually this is exactly what we want, a fail-over mechanism.
>
> Reply to 2.
> We need ZK servers on the broker servers and on the server where the
> producer runs, and they run as a cluster, right?
> Makes sense to me, thanks.
>
> Ismail.
>
>
> 2011/8/17 Jun Rao <[email protected]>
>
> > Ismail,
> >
> > Most of what you described is reasonable. A few comments:
> > 1. If you use 2 consumers in the same group, each of them will only get
> > about half of the data from the brokers. So, in 1), if you want to
> > process all data, the second consumer has to process messages too.
> >
> > 2. Typically, you can overlay the ZK servers on the Kafka brokers.
> > However, you need at least 3 ZK servers.
> >
> > 3. The minimal number of partitions (in total) is the number of consumer
> > threads (in total).
> >
> > Jun
> >
> > On Wed, Aug 17, 2011 at 4:52 AM, Ismail Dev <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > we are working on a project which should collect trace, journal and
> > > audit entries produced by an application running on a Tomcat server in
> > > a central data store.
> > > We are expecting about one million entries per day, about 150 MB of data.
> > >
> > > 1.)
> > > The trace entries must be collected in the same order as they are
> > > produced by the application, and we need a failover mechanism.
> > > The intended configuration for trace collection would be:
> > > - exactly one producer on the Tomcat server creating trace entries
> > > - the producer always sends the messages to the same partition/broker
> > > - 2 brokers on different physical servers
> > > - 2 consumers, one running on each broker server
> > > - the consumers belong to the same group (e.g. 'trace')
> > > - only one consumer processes the messages and the second one is for
> > > failover
> > >
> > > What should the ZooKeeper configuration look like?
> > > - one ZooKeeper server per broker, running on the server where the
> > > Tomcat server runs
> > > - or 2 clustered ZooKeeper servers, each running on a broker's physical
> > > server
> > >
> > > Is it a good idea to run the consumers on the same physical server as
> > > the brokers?
> > >
> > > Does this configuration make sense?
> > >
> > > 2.)
> > > For the journal and audit entries the order is not important. So the
> > > intended configuration for these would be:
> > > - n producers running on the Tomcat server
> > > - the producers send the messages randomly to the available brokers
> > > - at least 2 brokers with m partitions on different physical servers
> > > - at least 2 consumers, one running on each broker server, with m
> > > threads each
> > > - the consumers belong to different groups (e.g. 'journal' and 'audit')
> > >
> > > My question here is how to figure out the number of partitions. Are
> > > there any guideline values or hints?
> > >
> > > Many thanks,
> > > Ismail.
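
For illustration, a rough sketch of what each of the two 'trace' consumers in setup 1.)
could look like against the high-level Java consumer API of that Kafka generation. The
topic name "test", the ZooKeeper host names, the handle() method and the property names
(zk.connect, groupid) are assumptions and may vary between versions, so treat this as a
sketch rather than a drop-in example. Both processes run the same code; only the one that
currently owns the single partition receives messages, and the other takes over after a
rebalance when the owner dies.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.Message;
import kafka.message.MessageAndMetadata;

public class TraceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // ZooKeeper ensemble shared by brokers and consumers (at least 3 servers,
        // per Jun's comment 2); host names are placeholders.
        props.put("zk.connect", "zk1:2181,zk2:2181,zk3:2181");
        // Both consumers use the same group so that only one of them owns the single
        // partition at a time; the other sits idle until a rebalance hands it over.
        props.put("groupid", "trace");

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream/thread per consumer, matching the single partition of the topic.
        Map<String, List<KafkaStream<Message>>> streams =
            connector.createMessageStreams(Collections.singletonMap("test", 1));
        KafkaStream<Message> stream = streams.get("test").get(0);

        // On the standby consumer this loop simply blocks until a rebalance
        // assigns it the partition.
        for (MessageAndMetadata<Message> record : stream) {
            handle(record.message());
        }
    }

    private static void handle(Message message) {
        // Hypothetical handler: write the trace entry to the central data store.
    }
}

Because both consumers share the group id 'trace' and the topic has a single partition,
exactly one of them consumes at any time, which gives the desired active/standby behaviour.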

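On the producing side, a similarly rough sketch of the single trace producer on the Tomcat
server, again against the 0.7-era Java producer API; the property names, the StringEncoder
serializer, the ZooKeeper hosts and the topic name "test" are assumptions. With only one
partition for the topic, all entries land on the same partition/broker, so the consumption
order matches the production order without any custom partitioning.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.producer.ProducerConfig;

public class TraceProducer {
    private final Producer<String, String> producer;

    public TraceProducer() {
        Properties props = new Properties();
        // Discover the brokers through the same ZooKeeper ensemble the consumers use.
        props.put("zk.connect", "zk1:2181,zk2:2181,zk3:2181");
        // Trace entries are plain strings in this sketch.
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        this.producer = new Producer<String, String>(new ProducerConfig(props));
    }

    public void trace(String entry) {
        // With a single partition, every entry goes to the same partition/broker,
        // so ordering is preserved end to end.
        producer.send(new ProducerData<String, String>("test", entry));
    }

    public void close() {
        producer.close();
    }
}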