If the object is something like an utility object (say a DB connection handler), I often use:
@transient lazy val someObj = MyFactory.getObj(...) So basically `@transient` tell the closure cleaner don't serialize this, and the `lazy val` allows it to be initiated on each executor upon its first usage (since the class is in your jar so executor should be able to instantiate it). 2015-08-07 17:20 GMT+02:00 Philip Weaver <philip.wea...@gmail.com>: > If the object cannot be serialized, then I don't think broadcast will make > it magically serializable. You can't transfer data structures between nodes > without serializing them somehow. > > On Fri, Aug 7, 2015 at 7:31 AM, Sujit Pal <sujitatgt...@gmail.com> wrote: > >> Hi Hao, >> >> I think sc.broadcast will allow you to broadcast non-serializable >> objects. According to the scaladocs the Broadcast class itself is >> Serializable and it wraps your object, allowing you to get it from the >> Broadcast object using value(). >> >> Not 100% sure though since I haven't tried broadcasting custom objects >> but maybe worth trying unless you have already and failed. >> >> -sujit >> >> >> On Fri, Aug 7, 2015 at 2:39 AM, Hao Ren <inv...@gmail.com> wrote: >> >>> Is there any workaround to distribute non-serializable object for RDD >>> transformation or broadcast variable ? >>> >>> Say I have an object of class C which is not serializable. Class C is in >>> a jar package, I have no control on it. Now I need to distribute it either >>> by rdd transformation or by broadcast. >>> >>> I tried to subclass the class C with Serializable interface. It works >>> for serialization, but deserialization does not work, since there are no >>> parameter-less constructor for the class C and deserialization is broken >>> with an invalid constructor exception. >>> >>> I think it's a common use case. Any help is appreciated. >>> >>> -- >>> Hao Ren >>> >>> Data Engineer @ leboncoin >>> >>> Paris, France >>> >> >> > -- *JU Han* Software Engineer @ Teads.tv +33 0619608888