I see.

What about using a broadcast variable in place of variable a?
http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
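
Something along these lines - just a rough, untested sketch typed into mail, reusing the value 10 and the list from your snippets (the names are mine):

val lst = Range(0, 10000).map(i => ("10", "10", i: Double)).toList
val aBroadcast = sc.broadcast(10)   // ship the small int once from the driver
val rdd = sc.makeRDD(lst).map(t => if (aBroadcast.value == 10) 1 else 0)
rdd.count()

The closure then only carries the Broadcast handle; each task reads the actual value through .value rather than capturing the local val directly.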

Cheers

On Sun, Aug 30, 2015 at 8:54 AM, Ashish Shrowty <ashish.shro...@gmail.com>
wrote:

> @Sean - Agree that there is no action, but I still get the
> StackOverflowError; it's very weird
>
> @Ted - Variable a is just an int - val a = 10 ... The error happens when
> I try to pass a variable into the closure. The example you have above works
> fine since there is no variable being passed into the closure from the
> shell.
>
> -Ashish
>
> On Sun, Aug 30, 2015 at 9:55 AM Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Using Spark shell :
>>
>> scala> import scala.collection.mutable.MutableList
>> import scala.collection.mutable.MutableList
>>
>> scala> val lst = MutableList[(String,String,Double)]()
>> lst: scala.collection.mutable.MutableList[(String, String, Double)] =
>> MutableList()
>>
>> scala> Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>
>> scala> val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>> <console>:27: error: not found: value a
>>        val rdd=sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>                                           ^
>>
>> scala> val rdd=sc.makeRDD(lst).map(i=> if(i._1==10) 1 else 0)
>> rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at
>> <console>:27
>>
>> scala> rdd.count()
>> ...
>> 15/08/30 06:53:40 INFO DAGScheduler: Job 0 finished: count at
>> <console>:30, took 0.478350 s
>> res1: Long = 10000
>>
>> Ashish:
>> Please refine your example to mimic more closely what your code actually
>> did.
>>
>> Thanks
>>
>> On Sun, Aug 30, 2015 at 12:24 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> That can't cause any error, since there is no action in your first
>>> snippet. Even calling count on the result doesn't cause an error. You
>>> must be executing something different.
>>>
>>> On Sun, Aug 30, 2015 at 4:21 AM, ashrowty <ashish.shro...@gmail.com>
>>> wrote:
>>> > I am running the Spark shell (1.2.1) in local mode and I have a simple
>>> > RDD[(String,String,Double)] with about 10,000 objects in it. I get a
>>> > StackOverflowError each time I try to run the following code (the code
>>> > itself is just representative of other logic where I need to pass in a
>>> > variable). I tried broadcasting the variable too, but no luck ..
>>> missing
>>> > something basic here -
>>> >
>>> > val rdd = sc.makeRDD(List(<Data read from file>))
>>> > val a=10
>>> > rdd.map(r => if (a==10) 1 else 0)
>>> > This throws -
>>> >
>>> > java.lang.StackOverflowError
>>> >     at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:318)
>>> >     at
>>> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>> >     at
>>> java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>> >     at
>>> >
>>> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>> > ...
>>> > ...
>>> >
>>> > More experiments ... this works -
>>> >
>>> > val lst = Range(0,10000).map(i=>("10","10",i:Double)).toList
>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>> >
>>> > But below doesn't and throws the StackoverflowError -
>>> >
>>> > val lst = MutableList[(String,String,Double)]()
>>> > Range(0,10000).foreach(i=>lst+=(("10","10",i:Double)))
>>> > sc.makeRDD(lst).map(i=> if(a==10) 1 else 0)
>>> >
>>> > Any help appreciated!
>>> >
>>> > Thanks,
>>> > Ashish
>>> >
>>> >
>>> >
>>> >
>>
