[jira] [Commented] (PARQUET-113) Clarify parquet-format specification for LIST and MAP structures.

JIRA Wed, 28 Jan 2015 14:12:38 -0800

    [ 
https://issues.apache.org/jira/browse/PARQUET-113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295943#comment-14295943
 ]


Sergio Peña commented on PARQUET-113:
-------------------------------------

Old versions of Hive (<= 0.12) write Map types differently from what is 
proposed in this document.

{noformat}
optional group m1 (MAP_KEY_VALUE) {
        repeated group map {
                required binary key;
                optional binary value;
        }
}
{noformat}

The above is an example of the Map schema written by Hive 0.12. After an 
upgrade to Hive 0.14, Hive fails to read files written with this old schema. 
We should be backward compatible with this schema as well.
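
To make the compatibility requirement concrete, here is a rough sketch of a 
reader-side check that accepts both layouts. It is only an illustration, not 
parquet-mr or Hive code: the Field class and read_as_map function are 
hypothetical, the second field of the Hive 0.12 example is taken to be the 
value, and the "proposed" layout is assumed to be an outer group annotated 
MAP containing a repeated key_value group.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical, minimal stand-in for a Parquet schema node."""
    name: str
    repetition: str                      # "required", "optional" or "repeated"
    annotation: Optional[str] = None     # e.g. "MAP" or "MAP_KEY_VALUE"
    children: List["Field"] = field(default_factory=list)


def read_as_map(node: Field) -> Optional[dict]:
    """Return the key and value fields if `node` looks like a map, else None.

    Accepts the MAP annotation and the legacy outer MAP_KEY_VALUE annotation,
    and does not depend on the name of the repeated inner group.
    """
    if node.annotation not in ("MAP", "MAP_KEY_VALUE"):
        return None
    if len(node.children) != 1:
        return None
    entries = node.children[0]
    if entries.repetition != "repeated" or len(entries.children) != 2:
        return None
    key, value = entries.children
    if key.repetition != "required":
        return None
    return {"key": key, "value": value}


# Layout written by Hive 0.12 (second field assumed to be the value).
hive_012_map = Field("m1", "optional", "MAP_KEY_VALUE", [
    Field("map", "repeated", None, [
        Field("key", "required"),
        Field("value", "optional"),
    ]),
])

# Assumed shape of the proposed layout.
proposed_map = Field("m1", "optional", "MAP", [
    Field("key_value", "repeated", None, [
        Field("key", "required"),
        Field("value", "optional"),
    ]),
])
```

Matching on structure (one repeated group with a required key and one value 
field) rather than on the inner group name is the design choice that lets 
files written by Hive 0.12 and files in the newer layout resolve to the same 
key/value pair.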

> Clarify parquet-format specification for LIST and MAP structures.
> -----------------------------------------------------------------
>
>                 Key: PARQUET-113
>                 URL: https://issues.apache.org/jira/browse/PARQUET-113
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-format, parquet-mr
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>
> There are incompatibilities in the way that some parquet object models 
> translate nested structures annotated by LIST and MAP / MAP_KEY_VALUE. We 
> need to define clearly what the structures should look like and how to 
> interpret existing structures, including what must be supported to read 
> current parquet-avro, parquet-thrift, etc. files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
