Set the same groupId on every Kafka source that reads the topic. Kafka then treats those sources as a single consumer group, so each message is read by only one of them.
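
As a rough sketch, the source section on each agent could look something like the fragment below. The agent and component names (a1, r1, c1), the ZooKeeper host, the topic name, and the group name are all made up for illustration, and the property names assume the Kafka source shipped with Flume 1.6; newer releases spell the setting kafka.consumer.group.id instead. The sink is left out.

```properties
# Hypothetical names: a1 (agent), r1 (source), c1 (channel).
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
# Placeholder ZooKeeper address and topic; replace with your own.
a1.sources.r1.zookeeperConnect = zk1.example.com:2181
a1.sources.r1.topic = user-activity
# The important line: use the identical groupId on every agent.
a1.sources.r1.groupId = flume-ha
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
```

Two things to keep in mind. Kafka assigns each partition to one consumer in the group, so the topic needs at least as many partitions as you have sources, otherwise the extra sources sit idle. Also, as far as I know, consumption is at-least-once, so a few events can still be replayed after an agent failure or a rebalance.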

Regards,
Gonzalo

On Sep 24, 2015 9:59 PM, "Carlos Rojas Matas" <[email protected]> wrote:
> Hi Guys!
>
> Thanks for accepting my request. We're using flume to ingest a massive
> amount of data from a kafka source and we're not sure how to
> configure a flume cluster with HA. Here is a brief summary:
>
> 1 - we use kafka to hold intermediate data about our users' activity.
> 2 - we use flume to ingest all that data and send it to avro files in hdfs.
> 3 - we want to have high availability, that is, not a single agent but a
> cluster of agents.
> 4 - the thing is that we cannot have duplicates in the target files. If we
> start several agents consuming from the same topic, each one of them
> could potentially receive the same events, which breaks that
> constraint.
>
> Is there a way to configure multiple sources such that Kafka sees them as a
> single one?
>
> Thanks in advance,
> -carlos.
>
