Hello, I have an issue. I would like to save some data to Cassandra using Spark.
First I load data from Elasticsearch into Spark, which gives me an org.elasticsearch.spark.rdd.ScalaEsRDD containing rows like this:

    (AU1rN9uN4PGB4YTCSXr7, Map(@timestamp -> 2015-05-19T08:08:41.541Z, @version -> 1, type -> test-xm, loglevel -> INFO, thread -> ajp-crmprod-fr-002%2F10.2.53.38-8009-44, ID_Echange -> 1432022921395, SessionID -> 2188abc692ad1e0b62cbb6de2b875f91, ProcessID -> 1432022920a560000f00000000009212, IP -> 54.72.65.68, proxy -> 54.72.65.68, ContactID -> 2221538663, Login -> 54509705, messageType -> <<)

I have many rows like this one. I can saveToCassandra into a table whose schema is (name text, map<text, text>), but then I cannot run queries against the map column, because Cassandra does not support that. So I did something like this:

    rddvalues.take(200000).foreach { a =>
      val collection = sc.parallelize(Seq((a.get("timestamp").get, a.getOrElse("proxy", null))))
      collection.saveToCassandra("test", "sparkes")
    }

It works, but it is VERY slow. And when I try to run the same thing over the whole RDD:

    rddvalues.foreach { a =>
      val collection = sc.parallelize(Seq((a.get("timestamp").get, a.getOrElse("proxy", null))))
      collection.saveToCassandra("test", "sparkes")
    }

I get this exception:

    org.apache.spark.SparkException: Task not serializable
        at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
        at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
        at org.apache.spark.SparkContext.clean(SparkContext.scala:1622)
        at org.apache.spark.rdd.RDD.foreach(RDD.scala:797)

Do you have any idea? To sum up, I would like to put my maps into a Cassandra table, starting from my rddvalues, which is an org.apache.spark.rdd.RDD[scala.collection.Map[String,Any]].

Best regards,
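P.S. From what I understand, the Task not serializable error comes from using sc inside the foreach closure, since the SparkContext only exists on the driver. What I think I need is something like the sketch below, where the pairs are built with a plain map and the connector writes the whole RDD in one distributed pass. This is only a guess on my part: I am reusing the "timestamp" and "proxy" keys from my snippets above, and I am not sure the connector accepts Any values without a cast to String.

    import com.datastax.spark.connector._

    // Build the (timestamp, proxy) pairs on the executors instead of
    // collecting rows to the driver, then write the whole RDD at once.
    val pairs = rddvalues.map { a =>
      (a.get("timestamp").get, a.getOrElse("proxy", null))
    }
    pairs.saveToCassandra("test", "sparkes")

Would that be the right way to do it?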