David Maughan created HIVE-15434:
------------------------------------

             Summary: Hive GenericUDF to make uniontype more usable
                 Key: HIVE-15434
                 URL: https://issues.apache.org/jira/browse/HIVE-15434
             Project: Hive
          Issue Type: New Feature
          Components: UDF
    Affects Versions: 2.1.1
            Reporter: David Maughan
            Priority: Minor


h2. Overview

As stated in the documention:

{quote}
UNIONTYPE support is incomplete The UNIONTYPE datatype was introduced in Hive 
0.7.0 (HIVE-537), but full support for this type in Hive remains incomplete. 
Queries that reference UNIONTYPE fields in JOIN (HIVE-2508), WHERE, and GROUP 
BY clauses will fail, and Hive does not define syntax to extract the tag or 
value fields of a UNIONTYPE. This means that UNIONTYPEs are effectively 
look-at-only.
{quote}

It would be useful to have a UDF that allows extraction of values for further 
comparison or transformation.

h2. Proposal

I propose to add a GenericUDF that has 2 modes of operation. Consider the 
following schema and data that contains a union:

Schema:

{code}
struct<field1:uniontype<int,string>>
{code}

Query:

{code}
hive> select field1 from thing;
{0:0}
{1:"one"}
{code}

h4. Explode to Struct

This method will recursively convert all unions within the type to structs with 
fields named {{tag_n}}, {{n}} being the tag number. Only the {{tag_*}} field 
that matches the tag of the union will be populated with the value. In the case 
above the schema of field1 will be converted to:

{code}
struct<tag_0:int,tag_1:string>
{code}

{code}
hive> select extract_union(field1) from thing;
{"tag_0":0,"tag_1":null}
{"tag_0":null,"tag_1":one}
{code}

{code}
hive> select extract_union(field1).tag_0 from thing;
0
null
{code}

h4. Extract the specified tag

This method will simply extract the value of the specified tag. If the tag 
number matches then the value is returned, if it does not, then null is 
returned.

{code}
hive> select extract_union(field1, 0) from thing;
0
null
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to