I am running the Spark shell (1.2.1) in local mode with a simple RDD[(String, String, Double)] of about 10,000 elements. I get a java.lang.StackOverflowError each time I run the following code (the code itself is just representative of other logic where I need to pass in a variable). I tried broadcasting the variable too, but no luck; I must be missing something basic here.
    val rdd = sc.makeRDD(List(<Data read from file>))
    val a = 10
    rdd.map(r => if (a == 10) 1 else 0)

This throws:

    java.lang.StackOverflowError
        at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
        at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
        ...

More experiments: this works -

    val lst = Range(0, 10000).map(i => ("10", "10", i: Double)).toList
    sc.makeRDD(lst).map(i => if (a == 10) 1 else 0)

But the following does not, and throws the StackOverflowError -

    val lst = MutableList[(String, String, Double)]()
    Range(0, 10000).foreach(i => lst += (("10", "10", i: Double)))
    sc.makeRDD(lst).map(i => if (a == 10) 1 else 0)

Any help appreciated!

Thanks,
Ashish

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
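[Editor's note - a likely explanation, offered as an assumption rather than a confirmed diagnosis: the stack trace shows repeated writeObject0/writeOrdinaryObject frames, i.e. Java serialization recursing. scala.collection.mutable.MutableList is a node-linked structure, so default serialization descends one stack frame per element, and ~10,000 elements can exceed the default thread stack. The immutable List produced by .toList is serialized without that deep recursion, which would explain why the first experiment works. A minimal sketch of a workaround is to convert the mutable list to an array-backed collection before handing it to Spark (sc and a are the shell's SparkContext and variable from the snippets above):]

    import scala.collection.mutable

    // Build the data the same way as the failing snippet.
    val lst = mutable.MutableList[(String, String, Double)]()
    Range(0, 10000).foreach(i => lst += (("10", "10", i.toDouble)))

    // Converting to an indexed (non-linked) collection means
    // serialization no longer recurses node-by-node:
    val data = lst.toIndexedSeq
    sc.makeRDD(data).map(r => if (a == 10) 1 else 0)

[If converting is not convenient, other common workarounds are enlarging the driver's thread stack (e.g. passing -Xss to the JVM) or switching to Kryo serialization; whether either helps in this exact case is untested here.]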