[
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961231#comment-13961231
]
Tongjie Chen commented on HIVE-6785:
------------------------------------
While adding a qtest, I realized that this bug is resolved with this patch in
hive-trunk, but it is still a bug in Hive 0.11.
Digging a little further, I found that when the partition SerDe and the table
SerDe are different, Hive 0.11 tries to convert the object inspector whenever
the two are not equal. In hive-trunk (0.13 or 0.14), however, if all fields of
the output ObjectInspector are settable, no conversion happens, hence the bug
presented in this jira no longer shows up in hive-trunk.
However, I do think that ParquetStringInspector should be a subclass of
JavaStringObjectInspector, so that Hive 0.11 would have no problem either.
Related Hive JIRAs:
HIVE-5202
HIVE-5394
------------------- HIVE-trunk (0.13, 0.14 etc) code snippet for ObjectInspectorConverters -------------------------
// 1. If equalsCheck is true and the inputOI is the same as the outputOI OR
// 2. If the outputOI has all fields settable, return it
if ((equalsCheck && inputOI.equals(outputOI)) ||
    ObjectInspectorUtils.hasAllFieldsSettable(outputOI,
        oiSettableProperties) == true) {
  return outputOI;
}
------------------- HIVE-0.11 code snippet for ObjectInspectorConverters -------------------------
// If the inputOI is the same as the outputOI, just return it
if (inputOI.equals(outputOI)) {
  return outputOI;
}
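
To make the difference between the two snippets concrete, here is a minimal, self-contained sketch. FakeInspector, skipsConversionIn011 and skipsConversionInTrunk are hypothetical stand-ins written for this illustration, not Hive classes or methods. The sketch also assumes that the partition's inspector is the input, the table's inspector is the output, and the output reports all fields as settable.

```java
import java.util.Objects;

public class ConverterShortCircuitSketch {

    /** Hypothetical stand-in for an ObjectInspector; not the Hive interface. */
    static final class FakeInspector {
        final String serdeName;
        final boolean allFieldsSettable;

        FakeInspector(String serdeName, boolean allFieldsSettable) {
            this.serdeName = serdeName;
            this.allFieldsSettable = allFieldsSettable;
        }

        // Two fake inspectors are equal when they come from the same SerDe.
        @Override
        public boolean equals(Object other) {
            return other instanceof FakeInspector
                && ((FakeInspector) other).serdeName.equals(serdeName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(serdeName);
        }
    }

    // Models the Hive 0.11 check: conversion is skipped only for equal inspectors.
    static boolean skipsConversionIn011(FakeInspector inputOI, FakeInspector outputOI) {
        return inputOI.equals(outputOI);
    }

    // Models the trunk check: conversion is also skipped when the output
    // inspector has all fields settable.
    static boolean skipsConversionInTrunk(FakeInspector inputOI, FakeInspector outputOI,
                                          boolean equalsCheck) {
        return (equalsCheck && inputOI.equals(outputOI)) || outputOI.allFieldsSettable;
    }

    public static void main(String[] args) {
        // Assumption: partition and table use different SerDes, both settable.
        FakeInspector partitionOI = new FakeInspector("SomeOtherSerDe", true);
        FakeInspector tableOI = new FakeInspector("ParquetHiveSerDe", true);

        // prints "0.11 skips conversion:  false"
        System.out.println("0.11 skips conversion:  "
            + skipsConversionIn011(partitionOI, tableOI));
        // prints "trunk skips conversion: true"
        System.out.println("trunk skips conversion: "
            + skipsConversionInTrunk(partitionOI, tableOI, true));
    }
}
```

Under these assumptions only the 0.11-style check goes on to build a converter, which is where the failing cast would be reached.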
> query fails when partitioned table's table level serde is ParquetHiveSerDe
> and partition level serde is of different SerDe
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
> Issue Type: Bug
> Components: File Formats, Serializers/Deserializers
> Affects Versions: 0.13.0
> Reporter: Tongjie Chen
> Attachments: HIVE-6785.1.patch.txt
>
>
> When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a
> different SerDe, AND the table has string column[s], Hive generates a
> confusing error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException:
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because timestamp is mentioned even though the table does
> not use it. The reason is that when the table and partition SerDes differ,
> Hive tries to convert between the object inspectors of the two SerDes.
> ParquetHiveSerDe's object inspector for the string type is
> ParquetStringInspector (newly introduced), which is a subclass of neither
> WritableStringObjectInspector nor JavaStringObjectInspector, the types that
> ObjectInspectorConverters expects for a string-category object inspector.
> There is no break statement in the STRING case, so execution falls through
> to the following TIMESTAMP case, which generates the confusing error message.
> See also the following parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: make ParquetStringInspector a subclass of
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector.
> But because the constructor of JavaStringObjectInspector is package scope
> instead of public or protected, we would need to move ParquetStringInspector
> into the same package as JavaStringObjectInspector.
> ArrayWritableObjectInspector's setStructFieldData also needs to accept
> List data, since the corresponding setStructFieldData and create methods
> return a list. This is also needed when the table SerDe is ParquetHiveSerDe
> and the partition SerDe is something else.
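
The fall-through described in the issue text above can be reproduced in a small sketch. Every type and method here (Inspector, WritableStringLike, JavaStringLike, SettableTimestampLike, ParquetStringLike, FixedParquetStringLike, pickConverter) is a hypothetical stand-in; this is not the ObjectInspectorConverters source. It only illustrates how a STRING case with no break or throw can end in a timestamp-related ClassCastException, and why subclassing the Java string inspector avoids it.

```java
public class FallThroughSketch {

    enum Category { STRING, TIMESTAMP }

    // Hypothetical stand-ins for the inspector types named in the issue.
    interface Inspector {}
    interface SettableTimestampLike extends Inspector {}
    static class WritableStringLike implements Inspector {}
    static class JavaStringLike implements Inspector {}

    // Models the original ParquetStringInspector: a string inspector that is
    // a subclass of neither string type the converter lookup checks for.
    static class ParquetStringLike implements Inspector {}

    // Models the proposed fix: subclass the Java string inspector.
    static class FixedParquetStringLike extends JavaStringLike {}

    static String pickConverter(Category category, Inspector outputOI) {
        switch (category) {
            case STRING:
                if (outputOI instanceof WritableStringLike) {
                    return "text converter";
                } else if (outputOI instanceof JavaStringLike) {
                    return "string converter";
                }
                // No break or throw here, so control falls through to TIMESTAMP.
            case TIMESTAMP:
                // Throws ClassCastException for any non-timestamp inspector.
                SettableTimestampLike ts = (SettableTimestampLike) outputOI;
                return "timestamp converter for " + ts.getClass().getSimpleName();
            default:
                throw new IllegalArgumentException("unsupported category: " + category);
        }
    }

    public static void main(String[] args) {
        try {
            pickConverter(Category.STRING, new ParquetStringLike());
        } catch (ClassCastException e) {
            // The message names the timestamp type although only strings are involved.
            System.out.println("unfixed inspector: " + e.getMessage());
        }
        System.out.println("fixed inspector:   "
            + pickConverter(Category.STRING, new FixedParquetStringLike()));
    }
}
```

In this sketch FixedParquetStringLike can extend JavaStringLike freely because both live in the same file; the package-scope constructor mentioned above is what makes the equivalent change in Hive require moving the class.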