I created JIRA tickets for my work in both the Spark and spark-cassandra-connector JIRAs; I don't know why you cannot see them. Users can stream from any Cassandra table, just as one can stream from a Kafka topic; same principle.
Helena
@helenaedelson

> On Mar 24, 2015, at 11:29 AM, Anwar Rizal <anriza...@gmail.com> wrote:
>
> Helena,
>
> The CassandraInputDStream sounds interesting. I don't find many things in
> the JIRA though. Do you have more details on what it tries to achieve?
>
> Thanks,
> Anwar.
>
> On Tue, Mar 24, 2015 at 2:39 PM, Helena Edelson
> <helena.edel...@datastax.com> wrote:
> Streaming _from_ Cassandra, CassandraInputDStream, is coming, BTW:
> https://issues.apache.org/jira/browse/SPARK-6283
> I am working on it now.
>
> Helena
> @helenaedelson
>
>> On Mar 23, 2015, at 5:22 AM, Khanderao Kand Gmail
>> <khanderao.k...@gmail.com> wrote:
>>
>> Akhil
>>
>> You are right in your answer to what Mohit wrote. However, what Mohit
>> seems to be alluding to, but did not state clearly, might be different.
>>
>> Mohit
>>
>> You are wrong in saying that streaming "generally" works with HDFS and
>> Cassandra. Streaming typically works with a streaming or queuing source
>> like Kafka, Kinesis, Twitter, Flume, ZeroMQ, etc. (though it can also
>> read from HDFS and S3). The streaming context (the "receiver" within the
>> streaming context) gets events/messages/records and forms a batch (RDD)
>> per time window.
>>
>> So there is a maximum gap of one window interval between when an alert
>> message becomes available to Spark and when the processing happens. I
>> think this is what you meant.
>>
>> As per the Spark programming model, the RDD is the right way to deal with
>> data. If you are fine with a minimum delay of, say, a second (based on
>> the minimum window duration that a DStream can support), then what Rohit
>> gave is the right model.
>>
>> Khanderao
>>
>> On Mar 22, 2015, at 11:39 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> What do you mean you can't send it directly from Spark workers?
>>> Here's a simple approach you could take:
>>>
>>>     val data = ssc.textFileStream("sigmoid/")
>>>     data.filter(_.contains("ERROR")).foreachRDD(rdd =>
>>>       alert("Errors: " + rdd.count()))
>>>
>>> And the alert() function could be anything triggering an email or
>>> sending an SMS alert.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Sun, Mar 22, 2015 at 1:52 AM, Mohit Anchlia
>>> <mohitanch...@gmail.com> wrote:
>>> Is there a module in Spark Streaming that lets you listen to the
>>> alerts/conditions as they happen in the streaming job? Generally, Spark
>>> Streaming components will execute on a large cluster alongside HDFS or
>>> Cassandra; however, when it comes to alerting, you generally can't send
>>> it directly from the Spark workers, which means you need a way to
>>> listen for the alerts.
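The filter-and-alert pattern Akhil describes can be sketched without a running Spark cluster. Below is a minimal, Spark-free stand-in: a plain `Seq[String]` plays the role of one micro-batch's contents, and `ErrorAlert`, `countErrors`, and `processBatch` are hypothetical names introduced here for illustration, not part of any Spark API. In a real job, `processBatch` would be the body of the `foreachRDD` closure and `alert()` would send the email or SMS.

```scala
// A minimal, Spark-free sketch of the alert pattern from the thread above.
// In a real streaming job, `lines` would be the contents of one micro-batch
// RDD delivered by ssc.textFileStream; here a plain Seq stands in so the
// logic is runnable on its own.
object ErrorAlert {
  // Count the lines in one batch that contain "ERROR".
  def countErrors(lines: Seq[String]): Long =
    lines.count(_.contains("ERROR")).toLong

  // Placeholder sink: swap in an email, SMS, or message-queue call.
  def alert(msg: String): Unit =
    println(msg)

  // Mirrors the foreachRDD closure: alert only when errors are present.
  def processBatch(lines: Seq[String]): Unit = {
    val n = countErrors(lines)
    if (n > 0) alert(s"Errors: $n")
  }
}
```

The key design point from the thread survives the simplification: the alert is triggered from the driver-side `foreachRDD` action, so only an aggregate (the count) crosses from the workers back to wherever the notification is sent.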
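Khanderao's point about the maximum window-time gap can be made concrete with a little arithmetic: an event that arrives just after a batch boundary waits almost a full batch interval before its RDD is even formed. The sketch below is an illustration only; `BatchLatency` and `waitUntilBatchCloses` are made-up names, and real Spark Streaming latency also includes scheduling and processing time on top of this wait.

```scala
// Illustrates the micro-batch latency gap discussed in the thread: with a
// fixed batch interval, an event waits until the next batch boundary before
// Spark can see it as part of an RDD.
object BatchLatency {
  // Milliseconds from an event's arrival until its batch closes, assuming
  // batch boundaries fall at integer multiples of the batch interval.
  def waitUntilBatchCloses(arrivalMs: Long, batchIntervalMs: Long): Long = {
    val nextBoundary = ((arrivalMs / batchIntervalMs) + 1) * batchIntervalMs
    nextBoundary - arrivalMs
  }
}
```

With a 1-second batch interval, an event arriving 1 ms after a boundary waits 999 ms; one arriving exactly on a boundary waits the full interval. This is the "maximum gap of one window interval" before processing can begin.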