Hi,

1) If the sources and sinks provided by the community are good enough for you, then don't invent your own. I think a lot of work has already been done there, so try those before writing a custom source or sink.
2) For reliability you should use a file-based channel, which you already plan to do. For message failures, I guess you need to handle them yourself. You will probably need to write some code to catch the failing messages and route them to some other channel, a more lenient one like file, or, in the case of HBase, you could write the whole failing message to a different table. You will also need to deal with duplicate messages if you are planning to use a reliable channel (like the file-based channel).

/Ehsan

On Wed, Jan 15, 2014 at 10:43 AM, AnilKumar B <[email protected]> wrote:

> I am planning to use file based channels.
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Wed, Jan 15, 2014 at 3:12 PM, AnilKumar B <[email protected]> wrote:
>
>> Hi,
>>
>> In our pipeline we are thinking of using flume, our data source can be
>> either filer or hbase or it can be couchbase also and sink is either
>> filer(down stream's) or another hbase cluster(down stream's).
>>
>> So I need some help in following.
>> 1) To handle multiple sources and sinks, do I need to write custom flume
>> sink and source? or I should use community's respective source and sinks?
>>
>> 2) For us, we cannot miss any data, Is there any mechanism in flume to
>> handle failed messages, I mean suppose flume failed write the records into
>> hbase, how exactly it will takes care? Or should I maintain state of each
>> record and based on it's state I am thinking of handling failed messages,
>> Is that correct way? I am trying to use zookeeper for state management. So
>> just want to know, whether my approach is correct or not.
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>
>

--
*Muhammad Ehsan ul Haque*
Klarna AB
Norra Stationsgatan 61
SE-113 43 Stockholm

Tel: +46 (0)8- 120 120 00
Fax: +46 (0)8- 120 120 99
Web: www.klarna.com
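PS: To make point 2 of the reply a bit more concrete, here is a rough, language-neutral sketch of the idea: try the main store, send a failing message to a separate "error" store, and skip messages that a reliable channel replays. It is written in Python only to keep it short and is not Flume API; a real custom sink would be written in Java. All names here (`deliver`, `DeliveryError`, `write_primary`, `write_fallback`, `seen_ids`, and the `"id"` field) are made up for illustration, and the in-memory set of seen ids is an assumption that would have to be replaced by something durable.

```python
from typing import Callable, Dict, Iterable, List, Set

Record = Dict[str, str]


class DeliveryError(Exception):
    """Hypothetical error raised by a writer that could not store a record."""


def deliver(
    records: Iterable[Record],
    write_primary: Callable[[Record], None],
    write_fallback: Callable[[Record], None],
    seen_ids: Set[str],
) -> List[str]:
    """Write each record at most once; route failures to a fallback store.

    Returns the ids that reached the primary store in this call.
    """
    delivered: List[str] = []
    for record in records:
        record_id = record["id"]  # assumes every record carries a unique id
        if record_id in seen_ids:
            continue  # duplicate, e.g. replayed after a retry
        try:
            write_primary(record)
            delivered.append(record_id)
        except DeliveryError:
            # Keep the whole failing message somewhere else for later handling.
            # If this also fails, the exception propagates and the id is not
            # marked as seen, so the record can be retried.
            write_fallback(record)
        seen_ids.add(record_id)
    return delivered
```

For example, with `write_primary` appending to a list that stands in for the main table and `write_fallback` appending to a second list that stands in for the error table, a batch containing a repeated id is written only once, and a record whose primary write raises `DeliveryError` ends up only in the error list.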
