[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025161#comment-15025161
 ] 

Prasanth Jayachandran commented on HIVE-12175:
----------------------------------------------

>From what I understand, class registration in Kryo is optional. Registering a 
>class means a unique integer ID is assigned to the class. If the class is not 
>registered, then its FQCN is written out during serialization. The default 
>serializer for registered classes is FieldSerializer, which handles the 
>creation of objects. The default strategy is to invoke the zero-arg 
>constructor reflectively. If the zero-arg constructor is private, the 
>reflection trick (setAccessible) is used to create the instance. If that 
>fails, Objenesis's StdInstantiatorStrategy is used to create the object 
>without invoking a constructor. 
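
As a rough illustration of the first two steps above (zero-arg constructor, 
then setAccessible for a private one), here is a minimal plain-JDK sketch. It 
does not use Kryo itself; the class names ReflectiveInstantiation and 
PrivateCtorUdf are hypothetical stand-ins, and the Objenesis fallback is not 
shown.

```java
import java.lang.reflect.Constructor;

public class ReflectiveInstantiation {

    // Hypothetical stand-in for a UDF whose zero-arg constructor is private.
    static class PrivateCtorUdf {
        final String state;

        private PrivateCtorUdf() {
            this.state = "initialized";
        }
    }

    // Looks up the declared zero-arg constructor, makes it accessible,
    // and invokes it. Fails if the class has no zero-arg constructor.
    static <T> T newInstance(Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true); // the "reflection trick" for private constructors
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        PrivateCtorUdf udf = newInstance(PrivateCtorUdf.class);
        System.out.println(udf.state);
    }
}
```

Because the constructor is actually invoked here, field initializers run. An 
Objenesis-style fallback skips the constructor entirely, so fields may be left 
at their default values.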

My understanding is that, when a user adds custom jars and uses a custom UDF, 
the serialization of ExpressionNode will write out the FQCN of the user UDF. 
During deserialization, as long as the UDF is on the classpath (the jars will 
be localized on task nodes), the UDF instance can be created by the default 
FieldSerializer using the strategies mentioned above.
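
The "as long as the UDF is on the classpath" condition can be sketched with 
the plain JDK: a class written out by name can only be recreated if that name 
resolves on the reading side. This is an assumption-laden illustration, not 
what Kryo or Hive does internally; ClasspathLookup and the missing UDF name 
are hypothetical.

```java
public class ClasspathLookup {

    // Returns true if the fully qualified class name resolves with this
    // class's loader, false if the class is not on the classpath.
    static boolean isResolvable(String fqcn) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(fqcn, false, ClasspathLookup.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isResolvable("java.util.ArrayList"));
        // Hypothetical UDF whose jar was never localized on this node.
        System.out.println(isResolvable("com.example.udf.NotInstalledUdf"));
    }
}
```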

There are some cases where the serializers are not what we expect, such as 
sql.Date vs util.Date, and some cases where objects cannot be created using 
any of the above strategies. That's when we explicitly register a custom 
serializer for specific classes. If a user UDF hits any such case (e.g. 
Arrays.asList()) and we haven't provided a custom serializer, then we are in 
trouble. 
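
To make the Arrays.asList() case concrete, the plain-JDK sketch below checks 
that the list it returns is a private java.util.Arrays$ArrayList with no 
zero-arg constructor, so the default reflective strategy cannot build it. 
Copying into a java.util.ArrayList is one possible workaround on the UDF 
side. The class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListInstantiation {

    // True if the class declares a zero-arg constructor (of any visibility).
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Copies any list into a java.util.ArrayList, which has a public
    // zero-arg constructor and supports add().
    static <T> List<T> copyToArrayList(List<T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("a", "b");
        System.out.println(fixed.getClass().getName());
        System.out.println(hasNoArgConstructor(fixed.getClass()));
        System.out.println(hasNoArgConstructor(copyToArrayList(fixed).getClass()));
    }
}
```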

> Upgrade Kryo version to 3.0.x
> -----------------------------
>
>                 Key: HIVE-12175
>                 URL: https://issues.apache.org/jira/browse/HIVE-12175
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of Kryo (2.22) has an issue (see the exception below and 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and 
> the Kryo community recommends using DefaultInstantiatorStrategy with a 
> fallback to StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, Kryo 3.0 introduced unsafe-based serialization, which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>       at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>       at java.util.AbstractList.add(AbstractList.java:108)
>       at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>       at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>       at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>       at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>       ... 57 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
