Hi guys! Thanks for accepting my request. We're using Flume to ingest massive amounts of data from a Kafka source, and we're not sure how to configure a Flume cluster for high availability. In brief:
1. We use Kafka to hold intermediate data about our users' activity.
2. We use Flume to ingest all that data and write it as Avro files in HDFS.
3. We want high availability, that is, not a single agent but a cluster of agents.
4. The problem is that we cannot have duplicates in the target files. If we start several agents consuming from the same topic, each of them could potentially receive the same events, which violates that constraint.

Is there a way to configure multiple sources so that Kafka sees them as a single consumer? Thanks in advance, -carlos.
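For reference, here is a rough sketch of the kind of per-agent config we have in mind, using Flume's KafkaSource and HDFS sink (broker addresses, topic, group id, and paths below are placeholders, not our real values; properties follow the Flume 1.7+ Kafka source naming):

```properties
# One Flume agent: Kafka source -> memory channel -> HDFS sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = broker1:9092,broker2:9092
a1.sources.r1.kafka.topics = user-activity
# All agents in the cluster would share this consumer group id,
# so Kafka assigns each partition to only one agent at a time.
a1.sources.r1.kafka.consumer.group.id = flume-ha-ingest
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/user-activity/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.channel = c1
```

Our reading of the Kafka docs is that consumers sharing a `group.id` split the topic's partitions among themselves, so identical configs on every agent might already avoid duplicates, but we'd appreciate confirmation that this is the intended way to do it with Flume.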
