Because it may reference something outside the closure's scope that in turn
references other objects (that you don't need) and so on, resulting in a lot
of things you don't want being serialized along with your task. But sure, it
is debatable and more my personal opinion.
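The capture problem described above can be reproduced in plain Java, with no Spark involved (the `Job`/`Helper` names are hypothetical): an anonymous inner class silently holds a reference to its enclosing instance, so serializing the function drags along everything the enclosing object references. Copying the needed value into a local variable first avoids this.

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Main {
    // A serializable function type, similar in spirit to Spark's Function interfaces.
    interface SerFn<A, B> extends Serializable { B apply(A a); }

    static class Helper { }  // NOT Serializable, and not needed by the function

    static class Job implements Serializable {
        Helper helper = new Helper();  // unwanted baggage
        int offset = 10;

        // Anonymous inner classes capture the enclosing Job instance,
        // so serializing this function also tries to serialize `helper`.
        SerFn<Integer, Integer> badFn() {
            return new SerFn<Integer, Integer>() {
                public Integer apply(Integer x) { return x + offset; }
            };
        }

        // Copy what you need into a local first: the lambda then captures
        // only the int, not the whole Job (and its non-serializable Helper).
        SerFn<Integer, Integer> goodFn() {
            final int local = offset;
            return x -> x + local;
        }
    }

    static boolean serializes(Object o) {
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o);
            return true;
        } catch (Exception e) {  // NotSerializableException lands here
            return false;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        System.out.println(serializes(job.badFn()));   // false: Helper gets dragged in
        System.out.println(serializes(job.goodFn()));  // true: only the int was captured
    }
}
```

With Java serialization for closures, the bad case fails loudly at submit time, which is exactly the "detection" benefit argued for above.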


2014-04-17 23:28 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:

> Thanks again Eugen! I don't get the point... why do you prefer to avoid Kryo
> serialization for closures? Is there any problem with that?
> On Apr 17, 2014 11:10 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>
>> You have two kinds of serialization: data and closures. By default they both
>> use Java serialization. This means that if your function references an object
>> outside of it, that object gets serialized with your task. To enable Kryo
>> serialization for closures, set the spark.closure.serializer property. But
>> usually I don't, as leaving it on Java serialization lets me detect such
>> unwanted references.
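For reference, in Spark versions of that era (0.9.x) the closure serializer could be switched with a property along these lines, set via `SparkConf` or system properties. Treat this as a historical sketch: later Spark releases removed this knob and always use Java serialization for closures.

```
# 0.9-era setting; removed in later Spark versions
spark.closure.serializer=org.apache.spark.serializer.KryoSerializer
```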
>> On Apr 17, 2014 10:17 PM, "Flavio Pompermaier" <pomperma...@okkam.it>
>> wrote:
>>
>>> Now I have another problem... I have to pass one of these non-serializable
>>> objects to a PairFunction and I received another non-serializable
>>> exception... it seems that Kryo doesn't work within Functions. Am I wrong,
>>> or is this a limit of Spark?
>>> On Apr 15, 2014 1:36 PM, "Flavio Pompermaier" <pomperma...@okkam.it>
>>> wrote:
>>>
>>>> Ok thanks for the help!
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>>
>>>> On Tue, Apr 15, 2014 at 12:43 AM, Eugen Cepoi <cepoi.eu...@gmail.com>wrote:
>>>>
>>>>> Nope, those operations are lazy, meaning they will create the RDDs but
>>>>> won't trigger any "action". The computation is launched by operations such
>>>>> as collect, count, save to HDFS, etc. And even if they were not lazy, no
>>>>> serialization would happen. Serialization occurs only when data is
>>>>> transferred (collect, shuffle, maybe persist to disk - but I am not sure
>>>>> about that one).
>>>>>
>>>>>
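The same lazy-then-trigger pattern can be seen with plain Java streams, used here purely as an analogy for RDD semantics (no Spark required): intermediate operations like map do nothing until a terminal operation runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        // map is lazy: the counter is not touched yet
        Stream<Integer> doubled = Stream.of(1, 2, 3)
                .map(x -> { calls.incrementAndGet(); return x * 2; });

        System.out.println(calls.get());  // 0 -- nothing has run

        // collect is the "action" that triggers the whole pipeline
        List<Integer> out = doubled.collect(Collectors.toList());
        System.out.println(calls.get());  // 3 -- map ran once per element
        System.out.println(out);          // [2, 4, 6]
    }
}
```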
>>>>> 2014-04-15 0:34 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>
>>>>>> Ok, that's fair enough. But why do things work up to the collect? During
>>>>>> map and filter, are objects not serialized?
>>>>>>  On Apr 15, 2014 12:31 AM, "Eugen Cepoi" <cepoi.eu...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sure. As you have pointed out, those classes don't implement
>>>>>>> Serializable, and Spark uses Java serialization by default (when you do
>>>>>>> collect, the data from the workers will be serialized, "collected" by
>>>>>>> the driver, and then deserialized on the driver side). Kryo (like most
>>>>>>> other decent serialization libs) doesn't require you to implement
>>>>>>> Serializable.
>>>>>>>
>>>>>>> As for the missing attributes: Java serialization does not
>>>>>>> serialize/deserialize attributes from classes that don't implement
>>>>>>> Serializable (in your case, the parent classes).
>>>>>>>
>>>>>>>
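This behavior is easy to reproduce in plain Java (the `Parent`/`Child` names are hypothetical stand-ins for the classes in the thread): during deserialization, a non-Serializable superclass is reinitialized through its no-arg constructor, so whatever state it held at write time is lost.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Main {
    static class Parent {                      // does NOT implement Serializable
        String parentField = "unset";
        Parent() { }                           // required: runs on deserialization
        Parent(String v) { parentField = v; }
    }

    static class Child extends Parent implements Serializable {
        String childField;
        Child(String p, String c) { super(p); childField = c; }
    }

    static Child roundTrip(Child in) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(in);
        return (Child) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
    }

    public static void main(String[] args) throws Exception {
        Child out = roundTrip(new Child("filled", "filled"));
        System.out.println(out.parentField);   // "unset"  -- parent state was lost
        System.out.println(out.childField);    // "filled" -- child state survived
    }
}
```

This matches the "attributes are empty after collect" symptom reported below: the Serializable subclass round-trips fine, but every field declared in the non-Serializable parent comes back at whatever its no-arg constructor sets.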
>>>>>>> 2014-04-14 23:17 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>
>>>>>>> :
>>>>>>>
>>>>>>>> Thanks Eugen for the reply. Could you explain why I have this
>>>>>>>> problem? Why doesn't my serialization work?
>>>>>>>> On Apr 14, 2014 6:40 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> as an easy workaround you can enable Kryo serialization:
>>>>>>>>> http://spark.apache.org/docs/latest/configuration.html
>>>>>>>>>
>>>>>>>>> Eugen
>>>>>>>>>
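The linked configuration page boils down to settings like these (0.9-era property names; the registrator class name below is a hypothetical placeholder for your own):

```
spark.serializer=org.apache.spark.serializer.KryoSerializer
# optional: register your classes with Kryo for more compact output
spark.kryo.registrator=com.example.MyKryoRegistrator
```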
>>>>>>>>>
>>>>>>>>> 2014-04-14 18:21 GMT+02:00 Flavio Pompermaier <
>>>>>>>>> pomperma...@okkam.it>:
>>>>>>>>>
>>>>>>>>>> Hi to all,
>>>>>>>>>>
>>>>>>>>>> in my application I read objects that are not serializable, and I
>>>>>>>>>> cannot modify their sources.
>>>>>>>>>> So I tried a workaround: creating a dummy class that extends the
>>>>>>>>>> unmodifiable one but implements Serializable.
>>>>>>>>>> All attributes of the parent class are Lists of objects (some of
>>>>>>>>>> them are still not serializable and some of them are, e.g.
>>>>>>>>>> List<String>).
>>>>>>>>>>
>>>>>>>>>> As long as I only do map and filter on the RDD, the objects are
>>>>>>>>>> filled correctly (I checked that via the Eclipse debugger), but when
>>>>>>>>>> I do collect, all the attributes of my objects are empty. Could you
>>>>>>>>>> help me please?
>>>>>>>>>> I'm using spark-core_2.10, version 0.9.0-incubating.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Flavio
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>
