Re: Kafka Channel vs Kafka Sink?
Yes, you can. I came up with two use cases where the Kafka channel is useful (in addition to the HA aspect of the channel):

1. Receive data from various sources (even Kafka itself), modify it using interceptors, and write it out to Kafka. This would be lower latency than using a channel + sink, and it can be HA if you have multiple Flume agents receiving the data, so a dead Flume agent would not delay your data.

2. Send data from Kafka to HDFS/HBase at low latency. This again gives the advantage of dead Flume agents not delaying data delivery: if one agent dies, another picks up the slack sending data to HDFS/HBase etc.

I think the Storm Spout is really not required to write to HDFS unless you need more complex processing on the events.

Thanks,
Hari

On Fri, Nov 7, 2014 at 8:20 PM, Ashish paliwalash...@gmail.com wrote:
> Just wondering, can I use Kafka Channel instead of Kafka Sink?
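The two use cases described above can be sketched as Flume agent configurations. This is only an illustration, not a tested setup: the agent names (a1, a2), host names, port, topic, consumer group and HDFS path are all made up, and the Kafka channel property names follow the Flume 1.7+ user guide (the 1.6 release used brokerList and zookeeperConnect instead). Setting parseAsFlumeEvent to false is an assumption that downstream consumers want raw event bodies rather than Avro-wrapped Flume events.

```properties
# --- Use case 1: source + interceptor, Kafka channel, no sink ---
# Hypothetical agent "a1". There is no a1.sinks line on purpose:
# the Kafka topic behind the channel is the destination.
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1
# Modify events in flight with an interceptor.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp

a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
# Hypothetical broker addresses and topic name.
a1.channels.c1.kafka.bootstrap.servers = kafka-1:9092,kafka-2:9092
a1.channels.c1.kafka.topic = flume-events
# Assumption: non-Flume consumers read this topic.
a1.channels.c1.parseAsFlumeEvent = false

# --- Use case 2: Kafka channel + HDFS sink, no source ---
# Hypothetical agent "a2". There is no a2.sources line on purpose:
# the channel consumes directly from the Kafka topic.
a2.channels = c1
a2.sinks = k1

a2.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a2.channels.c1.kafka.bootstrap.servers = kafka-1:9092,kafka-2:9092
a2.channels.c1.kafka.topic = flume-events
# Agents sharing this (hypothetical) group id split the topic between them.
a2.channels.c1.kafka.consumer.group.id = flume-hdfs
a2.channels.c1.parseAsFlumeEvent = false

a2.sinks.k1.type = hdfs
a2.sinks.k1.channel = c1
# Hypothetical HDFS location.
a2.sinks.k1.hdfs.path = hdfs://namenode/flume/events
a2.sinks.k1.hdfs.fileType = DataStream
```

Running several copies of a2 with the same consumer group id is what should give the "another agent picks up the slack" behaviour: when one agent dies, Kafka reassigns its partitions to the remaining members of the group.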
Re: Kafka Channel vs Kafka Sink?
Thanks! The Storm Spout was meant to connect Flume to Storm, rather than to write to HDFS. What I meant was that maybe a Storm Sink is no longer needed.

On Tue, Nov 11, 2014 at 4:45 AM, Hari Shreedharan hshreedha...@cloudera.com wrote:
> I think the Storm Spout is really not required to write to HDFS unless you have more complex processing required on the events.

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
Kafka Channel vs Kafka Sink?
Just wondering, can I use Kafka Channel instead of Kafka Sink? Essentially the flow is as follows (this comes from working on https://issues.apache.org/jira/browse/FLUME-1286):

Source - Channel - Kafka Sink - Kafka - Kafka-Storm spout

To me it seems like we can use an agent with a Kafka Channel and without a sink. Just trying to find out the pros and cons of this. I am not using it, just curious after reviewing the patch for the Kafka Channel documentation.

One drawback I could think of is not being able to use multiple sinks to drain events faster.

Comments/suggestions?

thanks
ashish
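For comparison, the sink-based flow from the question (Source - Channel - Kafka Sink - Kafka) might look roughly like the sketch below. It is an assumption-laden illustration rather than a tested configuration: the agent name a3, the netcat source, the memory channel, the broker addresses and the topic are all hypothetical, and the Kafka sink property names follow the Flume 1.7+ user guide. It also shows the option mentioned above of attaching more than one sink to the same channel to drain events faster.

```properties
# Hypothetical agent "a3": source -> memory channel -> two Kafka sinks.
a3.sources = r1
a3.channels = c1
a3.sinks = k1 k2

a3.sources.r1.type = netcat
a3.sources.r1.bind = localhost
a3.sources.r1.port = 44444
a3.sources.r1.channels = c1

# A plain in-memory channel; events are lost if the agent dies.
a3.channels.c1.type = memory
a3.channels.c1.capacity = 10000

# Both sinks drain the same channel; each runs in its own thread.
a3.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k1.channel = c1
a3.sinks.k1.kafka.bootstrap.servers = kafka-1:9092,kafka-2:9092
a3.sinks.k1.kafka.topic = flume-events

a3.sinks.k2.type = org.apache.flume.sink.kafka.KafkaSink
a3.sinks.k2.channel = c1
a3.sinks.k2.kafka.bootstrap.servers = kafka-1:9092,kafka-2:9092
a3.sinks.k2.kafka.topic = flume-events
```

With a Kafka channel and no sink, the k1/k2 sections disappear entirely, which is the trade-off raised in the question: fewer moving parts, but no sink-side parallelism to tune.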