Thanks Lalit and Helena. What I'd like to do is manipulate the values within a DStream like this:

DStream.foreachRDD( rdd => { val arr = rdd.toArray } )

I'd then like to be able to insert results from the arr array back into Cassandra, after I've manipulated it. However, in all the examples I've seen, inserting into Cassandra looks something like:

val collection = sc.parallelize(Seq("foo", "bar"))

where "foo" and "bar" could be elements of the arr array. So I would like to know how to insert into Cassandra at the worker level.

Best wishes,

Harold

On Thu, Oct 30, 2014 at 11:48 PM, lalit1303 <la...@sigmoidanalytics.com> wrote:
> Hi,
>
> Since the Cassandra connection object is not serializable, you can't open the
> connection at the driver level and then access it inside foreachRDD (i.e. at
> the worker level).
> You have to open the connection inside foreachRDD, perform the operation,
> and then close the connection.
>
> For example:
>
> wordCounts.foreachRDD( rdd => {
>
>   val arr = rdd.toArray
>
>   OPEN cassandra connection
>   store arr
>   CLOSE cassandra connection
>
> })
>
> Thanks
>
> -----
> Lalit Yadav
> la...@sigmoidanalytics.com
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Manipulating-RDDs-within-a-DStream-tp17740p17800.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
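To make Lalit's pseudocode concrete, here is a hedged Scala sketch of one common way to do the insert at the worker level: open the connection per partition (via foreachPartition) rather than per record, so each executor reuses one session for its slice of the RDD. It assumes the DataStax Java driver is on the classpath, that wordCounts is a DStream[(String, Long)], and that the contact point and the ks.tbl table are placeholders for your own setup.

```scala
import com.datastax.driver.core.Cluster

// Sketch only: error handling and connection pooling omitted for brevity.
wordCounts.foreachRDD { rdd =>
  rdd.foreachPartition { partition =>
    // This closure runs on the worker, so nothing Cassandra-related
    // needs to be serialized from the driver.
    val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
    val session = cluster.connect()
    val insert  = session.prepare("INSERT INTO ks.tbl (word, count) VALUES (?, ?)")
    partition.foreach { case (word, count) =>
      session.execute(insert.bind(word, Long.box(count)))
    }
    session.close()
    cluster.close()
  }
}
```

Alternatively, the spark-cassandra-connector hides this pattern entirely: after `import com.datastax.spark.connector._` you can call `rdd.saveToCassandra("ks", "tbl")` inside foreachRDD, and the connector manages per-worker connections for you.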