I am running the Spark shell (1.2.1) in local mode with a simple
RDD[(String,String,Double)] of about 10,000 elements. I get a
StackOverflowError every time I run the following code (the code
itself just stands in for other logic where I need to pass in a
variable). I tried broadcasting the variable too, but no luck. I must be
missing something basic here -

val rdd = sc.makeRDD(List(<Data read from file>))
val a = 10
rdd.map(r => if (a == 10) 1 else 0)
This throws -

java.lang.StackOverflowError
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    ...

More experiments... this works -

val lst = Range(0, 10000).map(i => ("10", "10", i: Double)).toList
sc.makeRDD(lst).map(i => if (a == 10) 1 else 0)

But the below doesn't, and throws the StackOverflowError -

import scala.collection.mutable.MutableList

val lst = MutableList[(String,String,Double)]()
Range(0, 10000).foreach(i => lst += (("10", "10", i: Double)))
sc.makeRDD(lst).map(i => if (a == 10) 1 else 0)
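
One more data point, in case it helps narrow things down: my working guess (just a guess) is that the mutable list's linked cons cells get serialized one stack frame per element, which the writeObject0/writeOrdinaryObject recursion in the trace would be consistent with. Converting the same mutable list to an immutable List right before makeRDD also works in my shell -

val lst = MutableList[(String,String,Double)]()
Range(0, 10000).foreach(i => lst += (("10", "10", i: Double)))
// Same data, but handed to Spark as an immutable List - no error
sc.makeRDD(lst.toList).map(i => if (a == 10) 1 else 0)

So the data itself seems fine; only the collection type handed to makeRDD appears to matter.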

Any help appreciated!

Thanks,
Ashish



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-shell-and-StackOverFlowError-tp24508.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.