Ratandeep Ratti created HIVE-19256:
--------------------------------------
Summary: UDF which shapes the input data according to the specified schema
Key: HIVE-19256
URL: https://issues.apache.org/jira/browse/HIVE-19256
Project: Hive
Issue Type: New Feature
Reporter: Ratandeep Ratti
Assignee: Ratandeep Ratti
We use this UDF a lot in our org. This UDF takes an object and a Hive schema
and makes sure the output object matches the schema completely. In some respects
it is similar to the {{named_struct}} UDF, which can be used to select columns
from a struct, but it is more general since it works not only on structs but on
all Hive data types (except union). The schema can also specify certain valid
type conversions (int -> double, etc.).
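To make the intended behavior concrete, here is a minimal Python sketch of the shaping logic. It is only a model, not the UDF's implementation: Hive structs and maps are represented as dicts and arrays as lists, the helper names ({{split_top_level}}, {{parse_type}}, {{shape}}) are hypothetical, fields missing from the input are assumed to come back as NULL (None), and the cast table covers just a few primitives without checking which conversions Hive would consider valid.

```python
# Illustrative model only. Hive values are plain Python objects here:
# struct -> dict, array -> list, map -> dict. Union is not handled.

# Assumed subset of primitive types; no validation of which casts are legal.
PRIMITIVE_CASTS = {
    "int": int, "bigint": int, "float": float, "double": float, "string": str,
}

def split_top_level(text, sep=","):
    """Split on `sep`, ignoring separators nested inside <...>."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts

def parse_type(type_str):
    """Parse a Hive type string into a nested tuple representation."""
    type_str = type_str.strip()
    if type_str.startswith("struct<") and type_str.endswith(">"):
        fields = []
        for field in split_top_level(type_str[len("struct<"):-1]):
            name, _, field_type = field.partition(":")
            fields.append((name.strip(), parse_type(field_type)))
        return ("struct", fields)
    if type_str.startswith("array<") and type_str.endswith(">"):
        return ("array", parse_type(type_str[len("array<"):-1]))
    if type_str.startswith("map<") and type_str.endswith(">"):
        key_type, value_type = split_top_level(type_str[len("map<"):-1])
        return ("map", parse_type(key_type), parse_type(value_type))
    if type_str not in PRIMITIVE_CASTS:
        raise ValueError(f"unsupported type in this sketch: {type_str}")
    return ("primitive", type_str)

def shape(value, schema):
    """Return `value` reshaped so that it matches `schema` exactly."""
    if value is None:
        return None
    kind = schema[0]
    if kind == "struct":
        # Keep only the fields named in the schema; extra fields are dropped.
        # Assumption: a field absent from the input becomes None (NULL).
        return {name: shape(value.get(name), t) for name, t in schema[1]}
    if kind == "array":
        return [shape(item, schema[1]) for item in value]
    if kind == "map":
        return {shape(k, schema[1]): shape(v, schema[2])
                for k, v in value.items()}
    return PRIMITIVE_CASTS[schema[1]](value)

def generic_project(value, type_str):
    return shape(value, parse_type(type_str))

# Field 'e' and column 'b' are not in the schema, so they are removed.
row = {"a": [{"c": 1, "d": "x", "e": True}], "b": 7}
pruned = generic_project(row, "struct<a:array<struct<c:int,d:string>>>")

# Field 'a' is an int in the input and is cast to double.
widened = generic_project({"a": 3}, "struct<a:double>")
```

The recursion follows the schema rather than the data, which is why anything the schema does not mention never reaches the output.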
One scenario where this is quite useful is making sure that the Hive view
created with a specific schema will have columns which will always match that
schema. In Hive today when a view is created, new nested columns from the
underlying table can leak out from the view, even though the user never wanted
this behavior. Note that this leaking of columns happens only for nested columns
and not for top-level columns, so in that regard this behavior of Hive is
inconsistent.
Sample usage of the UDF:
{code}
// Returns data that matches the specified schema. Extra columns that are not
// part of the schema are removed.
generic_project(col, "struct<a:array<struct<c:int,d:string>>>")
// If the input column is a struct whose field 'a' is an int, 'a' is cast to
// double.
generic_project(col, "struct<a:double>")
{code}