[
https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918739#action_12918739
]
Namit Jain commented on HIVE-537:
---------------------------------
I will review again. A few initial comments:
1. Isn't Constants.java a generated file? Can you change serde/if/serde.thrift instead?
2. The desc extended output for create_union does not seem detailed enough.
> Hive TypeInfo/ObjectInspector to support union (besides struct, array, and
> map)
> -------------------------------------------------------------------------------
>
> Key: HIVE-537
> URL: https://issues.apache.org/jira/browse/HIVE-537
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Zheng Shao
> Assignee: Amareshwari Sriramadasu
> Fix For: 0.7.0
>
> Attachments: HIVE-537.1.patch, patch-537-1.txt, patch-537-2.txt,
> patch-537-3.txt, patch-537-4.txt, patch-537.txt
>
>
> There are already some cases in the code where we use heterogeneous data:
> JoinOperator and UnionOperator (in the sense that different parents can pass
> in records with different ObjectInspectors).
> We currently use the Operator's parentID to distinguish them. However, that
> approach does not extend to the more complex plans that might be needed in the
> future.
> We will support the union type like this:
> {code}
> TypeDefinition:
> type: primitivetype | structtype | arraytype | maptype | uniontype
> uniontype: "union" "<" tag ":" type ("," tag ":" type)* ">"
> Example:
> union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>
> Example of serialized data format:
> We will first store the tag byte before we serialize the object. On
> deserialization, we will first read out the tag byte, then we know what is
> the current type of the following object, so we can deserialize it
> successfully.
> Interface for ObjectInspector:
> interface UnionObjectInspector {
> /** Returns the array of OIs that are for each of the tags
> */
> ObjectInspector[] getObjectInspectors();
> /** Return the tag of the object.
> */
> byte getTag(Object o);
> /** Return the field based on the tag value associated with the Object.
> */
> Object getField(Object o);
> };
> An example serialization format (using a delimited format, with ' ' as the
> first-level delimiter and '=' as the second-level delimiter):
> userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
> 123 1=login
> 123 0=243=helloworld
> 123 1=logout
> {code}
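To make the delimited example above concrete, here is a minimal sketch of how the log column of those three rows could be parsed into a tag plus a value and written back out. It is not taken from the patch: the class UnionSketch, the holder TaggedUnion, and the methods parseLog and serializeLog are hypothetical names chosen for illustration, and the sketch assumes, as the example rows do, that '=' separates both the tag from the value and the fields of the struct.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; none of these names come from the Hive patch.
final class UnionSketch {

    /** A value paired with the tag that says which union branch it belongs to. */
    static final class TaggedUnion {
        private final byte tag;
        private final Object field;

        TaggedUnion(byte tag, Object field) {
            this.tag = tag;
            this.field = field;
        }

        byte getTag() { return tag; }

        Object getField() { return field; }
    }

    /**
     * Parses the log column of the example schema
     * union<0:struct<touserid:int,message:string>,1:string>.
     */
    static TaggedUnion parseLog(String text) {
        // The tag comes first; a limit of 2 keeps the rest of the value intact.
        String[] tagAndBody = text.split("=", 2);
        byte tag = Byte.parseByte(tagAndBody[0]);
        switch (tag) {
            case 0: {
                // Branch 0 is the struct: touserid, then message.
                String[] parts = tagAndBody[1].split("=", 2);
                List<Object> struct =
                        Arrays.<Object>asList(Integer.parseInt(parts[0]), parts[1]);
                return new TaggedUnion(tag, struct);
            }
            case 1:
                // Branch 1 is a plain string.
                return new TaggedUnion(tag, tagAndBody[1]);
            default:
                throw new IllegalArgumentException("unknown tag: " + tag);
        }
    }

    /** Writes the tag first, then the value, mirroring parseLog. */
    static String serializeLog(TaggedUnion u) {
        if (u.getTag() == 0) {
            List<?> struct = (List<?>) u.getField();
            return "0=" + struct.get(0) + "=" + struct.get(1);
        }
        return u.getTag() + "=" + u.getField();
    }

    public static void main(String[] args) {
        String[] rows = {"123 1=login", "123 0=243=helloworld", "123 1=logout"};
        for (String row : rows) {
            // ' ' is the first-level delimiter between userid and log.
            String[] columns = row.split(" ", 2);
            TaggedUnion log = parseLog(columns[1]);
            System.out.println("userid=" + columns[0]
                    + " tag=" + log.getTag()
                    + " field=" + log.getField());
        }
    }
}
```

Reading the tag before the value is what lets the reader pick the right branch without any out-of-band information, which is the same idea as the tag byte described for the serialized format. A real SerDe would presumably use a distinct delimiter per nesting level; reusing '=' here only follows the example rows.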