Hi,

We are considering Flume for our pipeline. Our data source can be a filer, HBase, or Couchbase, and the sink is either a downstream filer or a downstream HBase cluster.
I need some help with the following:

1) To handle multiple sources and sinks, do I need to write custom Flume sources and sinks, or should I use the corresponding community-provided ones?

2) We cannot afford to lose any data. Does Flume have a mechanism for handling failed messages? For example, if Flume fails to write records into HBase, how exactly does it handle that? Alternatively, I am thinking of maintaining the state of each record and handling failed messages based on that state. I am trying to use ZooKeeper for the state management. Is that the right approach?

Thanks & Regards,
B Anil Kumar
