We are planning to use Apache Kafka to replace Apache Flume, primarily as a
log transport layer. Please see the attached image, which shows a similar
use case (and deployment architecture) at LinkedIn (according to
http://sites.computer.org/debull/A12june/pipeline.pdf). I have the
following questions:

1) We will be creating dynamic topics to publish messages from front-end
and back-end server producers. How can we discover new topics so that
consumers can pull the data from the Kafka broker cluster into HDFS?

2) Is there a topic priority available when the system is under heavy
load? For example, during the holidays we might get more traffic, which
will cause more events to be published. Is there any way to configure a
topic to have higher priority, so that throughput for that particular
topic does not suffer?

3) When using Kafka MirrorMaker to replicate messages from a local
datacenter to a centralized Kafka broker cluster, does it also replicate
the offsets consumed by a particular consumer? Basically, from the
centralized Kafka brokers, we want to re-read the messages from the
beginning to feed into Hadoop.

4) Also, I would like to contribute to Kafka development, so please let me
know which features or bugs we can fix to get started. I have already
joined the Kafka dev mailing list.
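Regarding question 1, what we had in mind is subscribing by a whitelist
regex rather than by explicit topic names, so that dynamically created
topics are picked up automatically. Kafka's consumer-side topic filters
accept a regex; below is a minimal stdlib-only sketch of the matching we
are assuming (the topic names and the pattern are hypothetical, just to
illustrate the naming convention we would use):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TopicWhitelistSketch {
    public static void main(String[] args) {
        // Hypothetical topic names created dynamically by producers.
        List<String> topics = List.of(
            "web-logs-checkout", "web-logs-search",
            "backend-logs-auth", "metrics-internal");

        // A whitelist regex of the kind Kafka's consumer-side topic
        // filter accepts; newly created topics matching the pattern
        // would be consumed without reconfiguring the consumer.
        Pattern whitelist = Pattern.compile("(web|backend)-logs-.*");

        List<String> subscribed = topics.stream()
            .filter(t -> whitelist.matcher(t).matches())
            .collect(Collectors.toList());

        // → [web-logs-checkout, web-logs-search, backend-logs-auth]
        System.out.println(subscribed);
    }
}
```

If producers follow a naming convention like this, the HDFS consumer would
only need the pattern, not a list of topics.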
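Regarding question 3, in case it clarifies what we are after: on the
centralized cluster we are assuming a consumer configuration along these
lines (0.8-era high-level consumer; the group name and ZooKeeper address
are hypothetical placeholders):

```
# Consumer config sketch for re-reading from the beginning.
# A fresh group.id has no committed offsets in ZooKeeper, so
# auto.offset.reset=smallest makes the consumer start from the
# earliest message still retained on the brokers.
group.id=hadoop-etl-reprocess
auto.offset.reset=smallest
zookeeper.connect=central-zk:2181
```

Our question is whether MirrorMaker interferes with this, i.e. whether it
carries over any consumer offsets from the local datacenter.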
Thanks,
Bhavesh
