Re: Spark streaming alerting

2015-03-24 Thread Helena Edelson
Streaming _from_ Cassandra, CassandraInputDStream, is coming BTW https://issues.apache.org/jira/browse/SPARK-6283 I am working on it now. Helena @helenaedelson On Mar 23, 2015, at 5:22 AM, Khanderao Kand Gmail khanderao.k...@gmail.com wrote:

Re: Spark streaming alerting

2015-03-24 Thread Anwar Rizal
Helena, The CassandraInputDStream sounds interesting. I don't find many details in the JIRA though. Do you have more details on what it tries to achieve? Thanks, Anwar. On Tue, Mar 24, 2015 at 2:39 PM, Helena Edelson helena.edel...@datastax.com wrote: Streaming _from_ cassandra,

Re: Spark streaming alerting

2015-03-24 Thread Helena Edelson
I created a JIRA ticket for my work in both the Spark and spark-cassandra-connector JIRAs; I don’t know why you cannot see them. Users can stream from any Cassandra table, just as one can stream from a Kafka topic; same principle. Helena @helenaedelson On Mar 24, 2015, at 11:29 AM, Anwar

Re: Spark streaming alerting

2015-03-23 Thread Mohit Anchlia
I think I didn't explain myself properly :) What I meant to say was that generally a Spark worker runs either on HDFS data nodes or on Cassandra nodes, which are typically in a private (protected) network. When a condition is matched it's difficult to send out the alerts directly from the worker

Re: Spark streaming alerting

2015-03-23 Thread Jeffrey Jedele
What exactly do you mean by alerts? Something specific to your data, or general events of the Spark cluster? For the first, something like what Akhil suggested should work. For the latter, I would suggest having a log consolidation system like logstash in place and using it to generate alerts. Regards, Jeff

Re: Spark streaming alerting

2015-03-23 Thread Akhil Das
What do you mean you can't send it directly from Spark workers? Here's a simple approach which you could do:

    val data = ssc.textFileStream("sigmoid/")
    val dist = data.filter(_.contains("ERROR")).foreachRDD(rdd => alert("Errors : " + rdd.count()))

And the alert() function could be anything
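The approach in this message can be filled out into a small standalone program. The following is only a sketch under stated assumptions: the object name `ErrorAlerting`, the helper `isError`, the 10-second batch interval, the `local[2]` master, and the `println`-based `alert` body are all illustrative and not from the thread. One point relevant to the private-network concern raised elsewhere in the thread: the function passed to `foreachRDD` runs in the driver process, and `rdd.count()` returns its result to the driver, so `alert` is called from the driver rather than from a worker.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ErrorAlerting {
  // Hypothetical alert sink: replace the body with email, HTTP, a queue, etc.
  def alert(message: String): Unit = println(s"ALERT: $message")

  // Pure predicate, kept separate so it can be checked without a cluster.
  def isError(line: String): Boolean = line.contains("ERROR")

  def main(args: Array[String]): Unit = {
    // Assumed settings for a local run; adjust the master and interval as needed.
    val conf = new SparkConf().setAppName("ErrorAlerting").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Watch the directory named in the message for newly created text files.
    val data = ssc.textFileStream("sigmoid/")

    // The filter runs on the executors; the foreachRDD body runs on the driver.
    data.filter(isError).foreachRDD { rdd =>
      val errors = rdd.count()
      if (errors > 0) alert("Errors : " + errors)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Unlike the one-liner in the message, this variant only alerts when a batch actually contains matching lines, which avoids an "Errors : 0" alert on every interval.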

Re: Spark streaming alerting

2015-03-23 Thread Khanderao Kand Gmail
Akhil, you are right in your answer to what Mohit wrote. However, what Mohit seems to be alluding to, but did not write properly, might be different. Mohit, you are wrong in saying that streaming generally works in HDFS and Cassandra. Streaming typically works with a streaming or queuing source like Kafka,

Spark streaming alerting

2015-03-21 Thread Mohit Anchlia
Is there a module in Spark Streaming that lets you listen to the alerts/conditions as they happen in the streaming module? Generally Spark Streaming components will execute on a large set of clusters like HDFS or Cassandra; however, when it comes to alerting you generally can't send it directly from