Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Guillermo Ortiz
Thank you, but I'm only considering free options.

2014-11-20 7:53 GMT+01:00 Akhil Das:
> You can also look at Amazon's Kinesis if you don't want to handle the
> pain of maintaining Kafka/Flume infra.
>
> Thanks
> Best Regards
>
> On Thu, Nov 20, 2014 at 3:32 AM, Guillermo Ortiz wrote:
>

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Akhil Das
You can also look at Amazon's Kinesis if you don't want to handle the pain of maintaining Kafka/Flume infra.

Thanks
Best Regards

On Thu, Nov 20, 2014 at 3:32 AM, Guillermo Ortiz wrote:
> Thank you for your answer, I don't know if I typed the question
> correctly. But your answer helps me.
>

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Guillermo Ortiz
Thank you for your answer. I don't know if I phrased the question correctly, but your answer helps me. Let me ask it again to check whether you understood me. I have this topology:

DataSource1, ..., DataSourceN --> Kafka --> SparkStreaming --> HDFS

DataSource1, ..., DataSourceN

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Hari Shreedharan
Btw, if you want to write to Spark Streaming from Flume -- there is a sink (it is a part of Spark, not Flume). See Approach 2 here:
http://spark.apache.org/docs/latest/streaming-flume-integration.html

On Wed, Nov 19, 2014 at 12:41 PM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
> As of
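
For reference, here is a minimal sketch of what Approach 2 (the pull-based approach with a custom sink) looks like on the Flume side, based on the linked integration guide. The agent name `agent`, the sink name `spark`, the channel name `memoryChannel`, and the host and port values are placeholders chosen for illustration, not settings from this thread. The Spark sink jar and its dependencies are also assumed to be on the Flume agent's classpath.

```properties
# Hypothetical agent, sink, and channel names; adjust to your own Flume setup.
agent.sinks = spark
# The sink class ships with Spark, not with Flume.
agent.sinks.spark.type = org.apache.spark.streaming.flume.sink.SparkSink
# Placeholder host and port on which the sink waits for Spark to connect.
agent.sinks.spark.hostname = localhost
agent.sinks.spark.port = 9999
# The sink buffers events from this channel until Spark pulls them.
agent.sinks.spark.channel = memoryChannel
```

On the Spark side, the stream would then be read with `FlumeUtils.createPollingStream`, pointed at the same hostname and port.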

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Hari Shreedharan
As of now, you can feed Spark Streaming from both Kafka and Flume. Currently, though, there is no API to write data back to either of the two directly. I sent a PR which should eventually add something like this:
https://github.com/harishreedharan/spark/blob/Kafka-output/external/kafka/src/main/scal