On Mon, Nov 24, 2014 at 1:56 PM, aecc <alessandroa...@gmail.com> wrote:
> I checked sqlContext, they use it in the same way I would like to use my
> class, they make the class Serializable with transient. Does this affect
> somehow the whole pipeline of data moving? I mean, will I get performance
> issues when doing this because now the class will be Serialized for some
> reason that I still don't understand?

If you want to do the same thing, your "AAA" needs to be serializable
and you need to mark all non-serializable fields as "@transient". The
only performance penalty you'll be paying is the serialization /
deserialization of the "AAA" instance, which most probably will be
really small compared to the actual work the task will be doing.

That changes if your class is holding a whole lot of data, in which case
you should start thinking about using a broadcast variable instead.
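
To make the "@transient" approach concrete, here is a minimal sketch in
plain Scala, with no Spark dependency. The names "Resource",
"TransientDemo" and "roundTrip", and the fields of "AAA", are
hypothetical and only for illustration. The round trip through Java
serialization approximates what happens when a closure that references
the instance is shipped to an executor, assuming the default Java
serializer is in use.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for something that cannot be serialized
// (a connection, a context object, and so on).
class Resource(val name: String)

// Hypothetical "AAA": Serializable, with its non-serializable field
// marked @transient so it is skipped when the instance is serialized.
class AAA(val factor: Int) extends Serializable {
  // Not written to the stream; because it is a lazy val, it is
  // re-created on first access after deserialization.
  @transient lazy val resource: Resource = new Resource("local")

  def scale(x: Int): Int = x * factor
}

object TransientDemo {
  // Serializes an object to bytes and reads it back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = new AAA(3)
    original.resource                 // force the lazy field before serializing
    val copy = roundTrip(original)    // would throw NotSerializableException without @transient
    println(copy.scale(14))           // serializable state survives the round trip
    println(copy.resource.name)       // transient field is rebuilt on the receiving side
  }
}
```

Only "factor" travels with the instance here, which is why the
serialization cost stays small as long as the serializable state is
small.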

-- 
Marcelo
