Skye Wanderman-Milne created HIVE-9303:
------------------------------------------
Summary: Parquet files are written with incorrect definition levels
Key: HIVE-9303
URL: https://issues.apache.org/jira/browse/HIVE-9303
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Skye Wanderman-Milne
The definition level, which records which level of nesting is NULL, appears
to always be written as n or n-1, where n is the maximum definition level. As a
result, only the innermost level of nesting can be NULL; a NULL at any outer
level is read back as a struct whose innermost field is NULL. Only tables
stored as Parquet are affected. For example:
{code:sql}
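-- Build a text-format table whose only row has a NULL top-level struct.
CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
STORED AS TEXTFILE;
-- IF(false, ...) always takes the NULL branch; tbl is any existing non-empty table.
INSERT OVERWRITE TABLE text_tbl
SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL)
FROM tbl LIMIT 1;
-- Copy the same row into a Parquet-backed table.
CREATE TABLE parq_tbl
STORED AS PARQUET
AS SELECT * FROM text_tbl;
SELECT * FROM text_tbl;
-- => NULL (right)
SELECT * FROM parq_tbl;
-- => {"b":{"c":null}} (wrong: the top-level NULL was lost)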
{code}
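To make the expected levels concrete, here is a minimal Python sketch of how a definition level could be computed for the column path a.b.c in the schema above, assuming all three fields are optional (so n = 3). The helpers `definition_level`, `reported_level`, and `decode_column_a` are hypothetical names introduced for illustration. `reported_level` only models the symptom described in this report; it is not Hive's actual writer code.

```python
from typing import Any, Optional

PATH = ("a", "b", "c")            # column path from the report's schema
MAX_DEFINITION_LEVEL = len(PATH)  # assumption: every field on the path is optional


def definition_level(row: dict, path=PATH) -> int:
    """Count how many optional fields along `path` are non-NULL."""
    level = 0
    current: Any = row
    for name in path:
        current = current.get(name)
        if current is None:
            break  # first NULL on the path fixes the level
        level += 1
    return level


def reported_level(row: dict) -> int:
    """Model of the reported symptom: the level is never below n-1."""
    return max(definition_level(row), MAX_DEFINITION_LEVEL - 1)


def decode_column_a(level: int, value: Optional[int] = None):
    """Rebuild the value of column `a` that a reader sees for a given level."""
    if level == 0:
        return None                 # a itself is NULL
    if level == 1:
        return {"b": None}          # a.b is NULL
    if level == 2:
        return {"b": {"c": None}}   # a.b.c is NULL
    return {"b": {"c": value}}      # fully defined


null_row = {"a": None}
print(decode_column_a(definition_level(null_row)))  # expected value: None
print(decode_column_a(reported_level(null_row)))    # {'b': {'c': None}}
```

Under these assumptions, a NULL top-level struct should be written with level 0. Clamping the level to n-1 = 2 reproduces the `{"b":{"c":null}}` result seen from parq_tbl.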