Using the Spark shell:

scala> import scala.collection.mutable.MutableList
import scala.collection.mutable.MutableList

scala> val lst = MutableList[(String,String,Double)]()
lst: scala.collection.mutable.MutableList[(String, String, Double)] = MutableList()

scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))

scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
<console>:27: error: not found: value a
       val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
                                          ^

scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at <console>:27

scala> rdd.count()
...
15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at <console>:30, took 0.478350 s
res1: Long = 10000
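
As an aside, note that i._1 is a String, so i._1 == 10 compares a String
with an Int and always yields false (the compiler warns about this). That
doesn't affect count(), which only counts elements, but a faithful version
of the comparison would be:

scala> val rdd = sc.makeRDD(lst).map(i => if (i._1 == "10") 1 else 0)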

Ashish:
Please refine your example so that it more closely mimics what your code
actually did.
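
For reference, if the intent is to pass a driver-side variable into the
closure via a broadcast, a minimal sketch would look like the following
(names made up for illustration):

val a = sc.broadcast(10)
sc.makeRDD(lst).map(i => if (a.value == 10) 1 else 0)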

Thanks

On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <so...@cloudera.com> wrote:

> That can't cause any error, since there is no action in your first
> snippet. Even calling count on the result doesn't cause an error. You
> must be executing something different.
>
> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shro...@gmail.com>
> wrote:
> > I am running the Spark shell (1.2.1) in local mode and I have a simple
> > RDD[(String,String,Double)] with about 10,000 objects in it. I get a
> > StackOverflowError each time I try to run the following code (the code
> > itself is just representative of other logic where I need to pass in a
> > variable). I tried broadcasting the variable too, but no luck; I must be
> > missing something basic here -
> >
> > val rdd = sc.makeRDD(List(<Data read from file>))
> > val a=10
> > rdd.map(r => if (a==10) 1 else 0)
> > This throws -
> >
> > java.lang.StackOverflowError
> >     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> >     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> >     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> >     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> >     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> > ...
> > ...
> >
> > More experiments: this works -
> >
> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
> >
> > But the below doesn't, and throws the StackOverflowError -
> >
> > import scala.collection.mutable.MutableList
> > val lst = MutableList[(String,String,Double)]()
> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
> >
> > Any help appreciated!
> >
> > Thanks,
> > Ashish
> >
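One guess at the cause, assuming the default Java serializer is in play:
MutableList is backed by a linked list, and default Java serialization of
a long linked list recurses once per cell, which can overflow the stack;
immutable List serializes without the deep recursion, which would be
consistent with the first experiment working and the second failing. A
minimal non-Spark sketch of that hypothesis (whether it actually overflows
depends on the JVM stack size):

import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable.MutableList

val lst = MutableList[Int]()
Range(0, 100000).foreach(lst += _)
val oos = new ObjectOutputStream(new ByteArrayOutputStream())
// recurses once per linked-list cell; may throw java.lang.StackOverflowError
oos.writeObject(lst)

If that is the culprit, building the data as an immutable List up front
(as in the working experiment) sidesteps the problem.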
