I have been using a Logstash alternative, Fluentd, to ingest the data into HDFS.

I had to configure Fluentd not to append to existing files, so that Spark 
Streaming's file-based source would be able to pick up the new logs as new files.
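For reference, the relevant setting was along these lines — a sketch only, assuming the webhdfs output plugin; the host, port, and path values are placeholders:

```
# Hypothetical Fluentd sink config (fluent-plugin-webhdfs assumed).
# "append false" makes Fluentd roll new files rather than appending,
# so a file-based Spark Streaming source can detect each new file.
<match logs.**>
  @type webhdfs
  host namenode.example.com     # placeholder NameNode host
  port 50070
  path /logs/%Y%m%d/access.log.${chunk_id}
  append false
</match>
```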

-Liming


On 2 Feb, 2015, at 6:05 am, NORD SC <jan.algermis...@nordsc.com> wrote:

> Hi,
> 
> I plan to have logstash send log events (as key value pairs) to spark 
> streaming using Spark on Cassandra.
> 
> Being completely fresh to Spark, I have a couple of questions:
> 
> - is that a good idea at all, or would it be better to put e.g. Kafka in 
> between to handle traffic peaks?
>  (IOW: how and how well would Spark Streaming handle peaks?)
> 
> - Is there already a logstash-source implementation for Spark Streaming 
> 
> - assuming there is none yet and assuming it is a good idea: I’d dive into 
> writing it myself - what would the core advice be to avoid beginner traps?
> 
> Jan
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
