Because it happens to reference something outside the closure's scope that in turn references other objects (that you don't need) and so on, resulting in a lot of unwanted things being serialized along with your task. But sure, it is debatable and more my personal opinion.
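To make this concrete, here is a minimal Java sketch of the accidental capture (the class names Driver and HeavyService are made up for illustration):

import java.io.Serializable;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

// Hypothetical helper that the tasks don't actually need;
// note it is NOT Serializable.
class HeavyService {
    String prefix() { return "x-"; }
}

public class Driver implements Serializable {
    private final HeavyService service = new HeavyService();

    public JavaRDD<String> bad(JavaRDD<String> input) {
        // The anonymous Function calls an instance method, so it captures
        // `this` -- the whole Driver, including `service`, is serialized
        // with the task. The default Java closure serializer fails fast
        // here with a NotSerializableException; a Kryo closure serializer
        // (spark.closure.serializer) might instead silently ship far more
        // than intended.
        return input.map(new Function<String, String>() {
            public String call(String s) { return tag(s); }
        });
    }

    public JavaRDD<String> good(JavaRDD<String> input) {
        // Copy just what the task needs into a local variable; now the
        // closure captures only a small String.
        final String p = service.prefix();
        return input.map(new Function<String, String>() {
            public String call(String s) { return p + s; }
        });
    }

    private String tag(String s) { return service.prefix() + s; }
}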
2014-04-17 23:28 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:

> Thanks again Eugen! I don't get the point... why do you prefer to avoid
> Kryo serialization for closures? Is there any problem with that?
>
> On Apr 17, 2014 11:10 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>
>> There are two kinds of serialization: data and closures. Both use Java
>> serialization by default. This means that when your function references
>> an object outside of it, that object gets serialized with your task. To
>> enable Kryo serialization for closures, set the
>> spark.closure.serializer property. But usually I don't, as Java
>> serialization lets me detect such unwanted references.
>>
>> On Apr 17, 2014 10:17 PM, "Flavio Pompermaier" <pomperma...@okkam.it>
>> wrote:
>>
>>> Now I have another problem... I have to pass one of these
>>> non-serializable objects to a PairFunction, and I got another
>>> non-serializable exception. It seems that Kryo doesn't work within
>>> functions. Am I wrong, or is this a limitation of Spark?
>>>
>>> On Apr 15, 2014 1:36 PM, "Flavio Pompermaier" <pomperma...@okkam.it>
>>> wrote:
>>>
>>>> Ok, thanks for the help!
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>> On Tue, Apr 15, 2014 at 12:43 AM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:
>>>>
>>>>> Nope, those operations are lazy, meaning they will create the RDDs
>>>>> but won't trigger any "action". The computation is launched by
>>>>> operations such as collect, count, save to HDFS, etc. And even if
>>>>> they were not lazy, no serialization would happen. Serialization
>>>>> occurs only when data is transferred (collect, shuffle, maybe
>>>>> persist to disk - but I am not sure about that last one).
>>>>>
>>>>> 2014-04-15 0:34 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>
>>>>>> Ok, that's fair enough. But why do things work up to the collect?
>>>>>> During map and filter, are objects not serialized?
>>>>>>
>>>>>> On Apr 15, 2014 12:31 AM, "Eugen Cepoi" <cepoi.eu...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sure. As you have pointed out, those classes don't implement
>>>>>>> Serializable, and Spark uses Java serialization by default (when
>>>>>>> you do collect, the data from the workers is serialized,
>>>>>>> "collected" by the driver, and then deserialized on the driver
>>>>>>> side). Kryo (like most other decent serialization libs) doesn't
>>>>>>> require you to implement Serializable.
>>>>>>>
>>>>>>> As for the missing attributes: Java serialization does not
>>>>>>> ser/deser attributes of classes that don't implement Serializable
>>>>>>> (in your case, the parent classes).
>>>>>>>
>>>>>>> 2014-04-14 23:17 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>>
>>>>>>>> Thanks Eugen for the reply. Could you explain why I have the
>>>>>>>> problem? Why doesn't my serialization work?
>>>>>>>>
>>>>>>>> On Apr 14, 2014 6:40 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> as an easy workaround you can enable Kryo serialization:
>>>>>>>>> http://spark.apache.org/docs/latest/configuration.html
>>>>>>>>>
>>>>>>>>> Eugen
>>>>>>>>>
>>>>>>>>> 2014-04-14 18:21 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>>>>
>>>>>>>>>> Hi to all,
>>>>>>>>>>
>>>>>>>>>> in my application I read objects that are not serializable
>>>>>>>>>> because I cannot modify their sources.
>>>>>>>>>> So I tried a workaround: creating a dummy class that extends
>>>>>>>>>> the unmodifiable one but implements Serializable.
>>>>>>>>>> All attributes of the parent class are Lists of objects (some
>>>>>>>>>> of them still not serializable and some of them serializable,
>>>>>>>>>> e.g. List<String>).
>>>>>>>>>>
>>>>>>>>>> While I do map and filter on the RDD, the objects are filled
>>>>>>>>>> correctly (I checked that via the Eclipse debugger), but when I
>>>>>>>>>> do collect, all the attributes of my objects are empty. Could
>>>>>>>>>> you help me please?
>>>>>>>>>> I'm using spark-core_2.10, version 0.9.0-incubating.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Flavio
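For completeness, a minimal sketch of the Kryo data-serialization setup suggested above, written against the 0.9.x-era API (KryoExample and ThirdPartyRecord are hypothetical stand-ins for the real application and the unmodifiable class):

import com.esotericsoftware.kryo.Kryo;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.serializer.KryoRegistrator;

public class KryoExample {

    // Stand-in for one of the third-party classes that cannot be modified.
    public static class ThirdPartyRecord {
        public java.util.List<String> names;
    }

    // Registers the third-party classes with Kryo so they round-trip
    // through collect() and shuffles without implementing Serializable.
    public static class MyRegistrator implements KryoRegistrator {
        @Override
        public void registerClasses(Kryo kryo) {
            kryo.register(ThirdPartyRecord.class);
        }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setMaster("local[2]")
            .setAppName("kryo-example")
            // Switch the *data* serializer to Kryo and point Spark at the
            // registrator (closures still use Java serialization unless
            // spark.closure.serializer is set as well).
            .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
            .set("spark.kryo.registrator", MyRegistrator.class.getName());
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build RDDs of ThirdPartyRecord, map/filter, collect() ...
        sc.stop();
    }
}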