Streaming _from_ Cassandra, CassandraInputDStream, is coming, BTW: https://issues.apache.org/jira/browse/SPARK-6283
I am working on it now.
Helena
@helenaedelson

> On Mar 23, 2015, at 5:22 AM, Khanderao Kand Gmail <khanderao.k...@gmail.com> wrote:
>
> Akhil
>
> You are right in your answer to what Mohit wrote. However, what Mohit seems to
> be alluding to, but did not state clearly, might be different.
>
> Mohit
>
> You are wrong in saying that streaming "generally" works on HDFS and Cassandra.
> Streaming typically works with a streaming or queuing source like Kafka,
> Kinesis, Twitter, Flume, ZeroMQ, etc. (but it can also read from HDFS and S3).
> However, the streaming context (the "receiver" within the streaming context)
> gets events/messages/records and forms a time-window-based batch (RDD).
>
> So there is a maximum gap of one window interval between when the alert
> message became available to Spark and when the processing happens. I think
> this is what you meant.
>
> As per the Spark programming model, an RDD is the right way to deal with data.
> If you are fine with a minimum delay of, say, a second (based on the minimum
> time window that DStreams can support), then what Rohit gave is the right model.
>
> Khanderao
>
> On Mar 22, 2015, at 11:39 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> What do you mean you can't send it directly from Spark workers? Here's a
>> simple approach you could use:
>>
>> val data = ssc.textFileStream("sigmoid/")
>> val dist = data.filter(_.contains("ERROR")).foreachRDD(rdd =>
>>   alert("Errors :" + rdd.count()))
>>
>> The alert() function could be anything that triggers an email or sends an
>> SMS alert.
>>
>> Thanks
>> Best Regards
>>
>> On Sun, Mar 22, 2015 at 1:52 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>> Is there a module in Spark Streaming that lets you listen to the
>> alerts/conditions as they happen in the streaming module? Generally, Spark
>> Streaming components execute on a large set of clusters like HDFS or
>> Cassandra; however, when it comes to alerting, you generally can't send
>> alerts directly from the Spark workers, which means you need a way to
>> listen to the alerts.
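
For anyone finding this thread later, here is a minimal sketch of how the filter-and-alert approach quoted above might look as a complete program. It is only an illustration under stated assumptions: the object name ErrorAlertSketch, the alertMessage helper, the local[2] master and the 1-second batch interval are all hypothetical choices, and alert() just prints where a real job would send an email or SMS. It assumes Spark Streaming is on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ErrorAlertSketch {

  // Hypothetical helper: builds the alert text, or None when a batch had no errors.
  def alertMessage(errorCount: Long): Option[String] =
    if (errorCount > 0) Some(s"Errors: $errorCount") else None

  // Hypothetical stand-in for an email/SMS call; here it only prints.
  def alert(message: String): Unit = println(message)

  def main(args: Array[String]): Unit = {
    // Assumed settings for a local test run.
    val conf = new SparkConf().setAppName("ErrorAlertSketch").setMaster("local[2]")

    // Records are grouped into one RDD per batch interval (1 second here),
    // so an alert can trail the data by roughly that interval.
    val ssc = new StreamingContext(conf, Seconds(1))

    // Same source directory and filter as in the quoted snippet.
    val lines = ssc.textFileStream("sigmoid/")
    val errors = lines.filter(_.contains("ERROR"))

    // The function passed to foreachRDD runs on the driver once per batch;
    // only the rdd.count() job is computed on the workers.
    errors.foreachRDD { rdd =>
      alertMessage(rdd.count()).foreach(alert)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since the body of foreachRDD executes on the driver, alert() is called from a single place rather than from each worker, which may already cover the "listening" concern in the original question.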