You have two kinds of serialization: data and closures. Both use Java
serialization by default. This means that if your function references an
object defined outside of it, that object gets serialized along with your
task. To enable Kryo serialization for closures, set the
spark.closure.serializer property. But usually I don't, as the default
helps me detect such unwanted references.
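
For example (a minimal sketch, assuming Spark 0.9.x where both properties
are read from the SparkConf when the context is created):

  import org.apache.spark.SparkConf;
  import org.apache.spark.api.java.JavaSparkContext;

  public class ClosureKryoExample {
    public static void main(String[] args) {
      SparkConf conf = new SparkConf()
          .setAppName("closure-kryo")
          // Kryo for data (shuffle, collect, caching)
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
          // Kryo for the closures shipped with each task (default is Java serialization)
          .set("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer");
      JavaSparkContext sc = new JavaSparkContext(conf);
      // ... build and run the job as usual ...
      sc.stop();
    }
  }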
On Apr 17, 2014 10:17 PM, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:

> Now I have another problem: I have to pass one of these non-serializable
> objects to a PairFunction, and I get another non-serializable
> exception. It seems that Kryo doesn't work within Functions. Am I wrong, or
> is this a limitation of Spark?
> On Apr 15, 2014 1:36 PM, "Flavio Pompermaier" <pomperma...@okkam.it>
> wrote:
>
>> Ok thanks for the help!
>>
>> Best,
>> Flavio
>>
>>
>> On Tue, Apr 15, 2014 at 12:43 AM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:
>>
>>> Nope, those operations are lazy, meaning they create the RDDs but
>>> won't trigger any "action". The computation is launched by operations such
>>> as collect, count, save to HDFS, etc. And even if they were not lazy, no
>>> serialization would happen. Serialization occurs only when data is
>>> transferred (collect, shuffle, maybe persist to disk - but I am not sure
>>> about that one).
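>>>
>>> A rough sketch of the idea (hypothetical input path, Java API):
>>>
>>>   import java.util.List;
>>>   import org.apache.spark.SparkConf;
>>>   import org.apache.spark.api.java.JavaRDD;
>>>   import org.apache.spark.api.java.JavaSparkContext;
>>>   import org.apache.spark.api.java.function.Function;
>>>
>>>   public class LazyDemo {
>>>     public static void main(String[] args) {
>>>       JavaSparkContext sc = new JavaSparkContext(
>>>           new SparkConf().setMaster("local").setAppName("lazy-demo"));
>>>       JavaRDD<String> lines = sc.textFile("hdfs:///tmp/input"); // lazy: nothing runs yet
>>>       JavaRDD<String> longLines = lines.filter(
>>>           new Function<String, Boolean>() {
>>>             public Boolean call(String s) { return s.length() > 80; }
>>>           });                                                   // still lazy, nothing serialized
>>>       List<String> result = longLines.collect();                // action: tasks run and results
>>>                                                                 // come back serialized to the driver
>>>       System.out.println(result.size());
>>>       sc.stop();
>>>     }
>>>   }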
>>>
>>>
>>> 2014-04-15 0:34 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>
>>>> Ok, that's fair enough. But why do things work up to the collect? Are
>>>> objects not serialized during map and filter?
>>>>  On Apr 15, 2014 12:31 AM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>>
>>>>> Sure. As you have pointed out, those classes don't implement Serializable,
>>>>> and Spark uses Java serialization by default (when you do collect, the data
>>>>> from the workers is serialized, "collected" by the driver and then
>>>>> deserialized on the driver side). Kryo (like most other decent serialization
>>>>> libs) doesn't require you to implement Serializable.
>>>>>
>>>>> The missing attributes are due to the fact that Java serialization
>>>>> does not serialize/deserialize attributes declared in classes that don't
>>>>> implement Serializable (in your case, the parent classes).
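>>>>>
>>>>> A small self-contained demonstration of that behaviour (plain JDK, no
>>>>> Spark involved; the class names are made up):
>>>>>
>>>>>   import java.io.*;
>>>>>   import java.util.*;
>>>>>
>>>>>   class Parent {                        // does NOT implement Serializable
>>>>>     List<String> names = new ArrayList<String>();
>>>>>   }
>>>>>
>>>>>   class Child extends Parent implements Serializable {}
>>>>>
>>>>>   public class JavaSerDemo {
>>>>>     public static void main(String[] args) throws Exception {
>>>>>       Child c = new Child();
>>>>>       c.names.add("flavio");
>>>>>
>>>>>       ByteArrayOutputStream bos = new ByteArrayOutputStream();
>>>>>       ObjectOutputStream oos = new ObjectOutputStream(bos);
>>>>>       oos.writeObject(c);
>>>>>       oos.close();
>>>>>       Child copy = (Child) new ObjectInputStream(
>>>>>           new ByteArrayInputStream(bos.toByteArray())).readObject();
>>>>>
>>>>>       // prints [] -- the field declared in the non-Serializable parent
>>>>>       // was skipped; Parent's no-arg constructor re-initialized it
>>>>>       System.out.println(copy.names);
>>>>>     }
>>>>>   }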
>>>>>
>>>>>
>>>>> 2014-04-14 23:17 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>
>>>>>> Thanks Eugen for the reply. Could you explain why I have this
>>>>>> problem? Why doesn't my serialization work?
>>>>>> On Apr 14, 2014 6:40 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> as an easy workaround you can enable Kryo serialization:
>>>>>>> http://spark.apache.org/docs/latest/configuration.html
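>>>>>>>
>>>>>>> For example (a minimal sketch; the registrator and the domain class
>>>>>>> are hypothetical, the property names are the documented ones):
>>>>>>>
>>>>>>>   import com.esotericsoftware.kryo.Kryo;
>>>>>>>   import org.apache.spark.SparkConf;
>>>>>>>   import org.apache.spark.serializer.KryoRegistrator;
>>>>>>>
>>>>>>>   // registers your (non-Serializable) classes with Kryo
>>>>>>>   public class MyRegistrator implements KryoRegistrator {
>>>>>>>     public void registerClasses(Kryo kryo) {
>>>>>>>       kryo.register(MyEntity.class);  // hypothetical domain class
>>>>>>>     }
>>>>>>>   }
>>>>>>>
>>>>>>>   // then, when building the context:
>>>>>>>   SparkConf conf = new SparkConf()
>>>>>>>       .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>>>>>>>       .set("spark.kryo.registrator", "MyRegistrator");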
>>>>>>>
>>>>>>> Eugen
>>>>>>>
>>>>>>>
>>>>>>> 2014-04-14 18:21 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>>
>>>>>>>> Hi to all,
>>>>>>>>
>>>>>>>> in my application I read objects that are not serializable, and I
>>>>>>>> cannot modify their sources.
>>>>>>>> So I tried a workaround: I created a dummy class that extends the
>>>>>>>> unmodifiable one but implements Serializable.
>>>>>>>> All attributes of the parent class are Lists of objects (some of
>>>>>>>> them are still not serializable and some of them are, e.g.
>>>>>>>> List<String>).
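>>>>>>>>
>>>>>>>> Roughly like this (hypothetical names, just to show the idea):
>>>>>>>>
>>>>>>>>   import java.util.ArrayList;
>>>>>>>>   import java.util.List;
>>>>>>>>
>>>>>>>>   // third-party class I cannot touch; its attributes are Lists
>>>>>>>>   class UnmodifiableEntity {
>>>>>>>>     protected List<String> labels = new ArrayList<String>();
>>>>>>>>     // ... other List attributes, some of non-serializable types ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   // my dummy subclass
>>>>>>>>   class MySerializableEntity extends UnmodifiableEntity
>>>>>>>>       implements java.io.Serializable {}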
>>>>>>>>
>>>>>>>> As long as I only do map and filter on the RDD, the objects are filled
>>>>>>>> correctly (I checked that via the Eclipse debugger), but when I do
>>>>>>>> collect, all the attributes of my objects are empty. Could you help me
>>>>>>>> please?
>>>>>>>> I'm using spark-core_2.10, version 0.9.0-incubating.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Flavio
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>
