and my cached RDD is not small. If it were, maybe I could materialize it and broadcast it.
Thanks

On Tue, Feb 7, 2017 at 10:28 AM, shyla deshpande <deshpandesh...@gmail.com> wrote:
> I have a situation similar to the following and I get SPARK-13758
> <https://issues.apache.org/jira/browse/SPARK-13758>.
>
> I understand why I get this error, but I want to know what should be the
> approach in dealing with these situations.
>
> Thanks
>
> var cached = ssc.sparkContext.parallelize(Seq("apple, banana"))
>
> val words = ssc.socketTextStream(ip, port).flatMap(_.split(" "))
>
> words.foreachRDD((rdd: RDD[String]) => {
>   val res = rdd.map(word => (word, word.length)).collect()
>   println("words: " + res.mkString(", "))
>   cached = cached.union(rdd)
>   cached.checkpoint()
>   println("cached words: " + cached.collect.mkString(", "))
> })
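For what it's worth, the usual way to handle this kind of situation is to let Spark Streaming manage the accumulated state itself, rather than re-unioning a driver-side RDD on every batch (which is what triggers SPARK-13758 when the lineage is used across streaming restarts). A minimal sketch using `updateStateByKey` — assuming the same `ssc`, `ip`, and `port` as in your snippet, and that a checkpoint directory has been set on the context:

```scala
import org.apache.spark.streaming.StreamingContext

// Sketch only: assumes an existing StreamingContext `ssc` with
// ssc.checkpoint("<some-dir>") already configured, since stateful
// transformations require checkpointing.
val words = ssc.socketTextStream(ip, port).flatMap(_.split(" "))

// Keep a running count per word across batches. Spark checkpoints
// this state for you, so there is no growing union of RDDs to manage.
val counts = words
  .map(word => (word, 1))
  .updateStateByKey[Int] { (newValues: Seq[Int], state: Option[Int]) =>
    Some(newValues.sum + state.getOrElse(0))
  }

counts.foreachRDD { rdd =>
  println("cached words: " + rdd.collect.mkString(", "))
}
```

If you need richer semantics (timeouts, explicit state removal), `mapWithState` is the newer alternative, but the idea is the same: the state lives inside the streaming machinery instead of in a `var` on the driver.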