Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread aecc
Hello guys, I'm using Spark 1.0.0 and Kryo serialization. In the Spark shell, when I create a class that contains the SparkContext as an attribute, in this way: class AAA(val s: SparkContext) { } val aaa = new AAA(sc) and I execute any action using that attribute like: val myNumber = 5

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread aecc
Marcelo Vanzin wrote: Do you expect to be able to use the spark context on the remote task? Not at all. What I want to create is a wrapper of the SparkContext, to be used only on the driver node. I would like to have in this AAA wrapper several attributes, such as the SparkContext and other

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread Marcelo Vanzin
Hello, On Mon, Nov 24, 2014 at 12:07 PM, aecc alessandroa...@gmail.com wrote: This is the stacktrace: org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: $iwC$$iwC$$iwC$$iwC$AAA - field (class

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread aecc
Yes, I'm running this in the shell. In my compiled JAR it works perfectly; the issue is that I need to do this in the shell. Are there any workarounds? I checked sqlContext: they use it in the same way I would like to use my class, and they make the class Serializable with transient. Does this affect

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread Marcelo Vanzin
On Mon, Nov 24, 2014 at 1:56 PM, aecc alessandroa...@gmail.com wrote: I checked sqlContext: they use it in the same way I would like to use my class, and they make the class Serializable with transient. Does this somehow affect the whole pipeline of data moving? I mean, will I get performance
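For readers following the thread, here is a minimal sketch of what "Serializable with transient" means, written without Spark so it runs on its own. FakeContext is a hypothetical stand-in for SparkContext (it is simply a class that is not Serializable), and Plain, Wrapped, and TransientDemo are illustrative names, not anything from Spark. The sketch only shows how Java serialization treats a @transient field; it does not claim to reproduce the shell's closure handling.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for SparkContext: deliberately NOT Serializable.
class FakeContext

// Serializable wrapper with a plain field: Java serialization will try to
// write `s` and fail, because FakeContext is not Serializable.
class Plain(val s: FakeContext) extends Serializable

// Same wrapper, but the field is marked @transient, so Java serialization
// skips it. After deserialization the field would be null, which is fine
// only if it is never used outside the original (driver) JVM.
class Wrapped(@transient val s: FakeContext) extends Serializable

object TransientDemo {
  // Returns true if `o` can be written with Java serialization.
  def trySerialize(o: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(o); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    println(trySerialize(new Plain(new FakeContext)))   // false
    println(trySerialize(new Wrapped(new FakeContext))) // true
  }
}
```

Applied to the class from this thread, the same pattern would presumably look like class AAA(@transient val s: SparkContext) extends Serializable, with s used only on the driver.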

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread aecc
OK, great, I'm gonna do it that way, thanks :). However, I still don't understand why this object should be serialized and shipped. aaa.s and sc are both the same object, org.apache.spark.SparkContext@1f222881. However, this: aaa.s.parallelize(1 to 10).filter(_ == myNumber).count needs to be

Re: Using Spark Context as an attribute of a class cannot be used

2014-11-24 Thread Marcelo Vanzin
That's an interesting question for which I do not know the answer. Probably a question for someone with more knowledge of the internals of the shell interpreter... On Mon, Nov 24, 2014 at 2:19 PM, aecc alessandroa...@gmail.com wrote: OK, great, I'm gonna do it that way, thanks :). However I