[ https://issues.apache.org/jira/browse/PARQUET-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Blue resolved PARQUET-110.
-------------------------------
Resolution: Fixed
Fixed by PIG-4219. Thanks, Harsh!
> Some schemas without column projection cause Pig failures
> ---------------------------------------------------------
>
> Key: PARQUET-110
> URL: https://issues.apache.org/jira/browse/PARQUET-110
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Reporter: Ryan Blue
>
> Parquet stores and loads the Pig schema in the Configuration. Along the way,
> Pig changes that Schema:
> {code:java}
> import org.apache.pig.impl.logicalLayer.schema.Schema;
> import org.apache.pig.impl.util.Utils;
>
> // This schema is converted from Parquet and written to the Configuration
> String schemaStr = "my_list: {array: (array_element: (num1: int,num2: int))}";
> // It is reparsed using org.apache.pig.impl.util.Utils
> Schema schema = Utils.getSchemaFromString(schemaStr);
> // But the result no longer matches the original structure
> schema.toString();
> // => {my_list: {array_element: (num1: int,num2: int)}}
> {code}
> Note that the intermediate bag, named either "bag" or "array", is removed
> when Pig reparses the Schema. I can work around this to an extent in the
> Parquet code, but the Pig behavior gets stranger. If there are two of
> these bags, the second is preserved but renamed to "bag_0". Something odd
> is going on there.
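
One way to catch this kind of structural drift in a test is to compare the schema string before and after the reparse. The following is only a minimal sketch: the class and method names are hypothetical (not part of Parquet or Pig), the reparse step is passed in as a function so the snippet has no Pig dependency, and it assumes the reparsed string is wrapped in one extra pair of braces, as in the toString() output quoted in the issue.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not part of Parquet or Pig.
final class SchemaRoundTrip {

    // Returns true if the schema string has the same structure after the
    // reparse step, ignoring whitespace differences.
    static boolean survives(String original, UnaryOperator<String> reparse) {
        String before = original.replaceAll("\\s+", "");
        String after = reparse.apply(original).replaceAll("\\s+", "");
        // Assumption: the reparsed form is wrapped in one outer pair of
        // braces, as in the output quoted in the issue description.
        if (after.length() >= 2 && after.startsWith("{") && after.endsWith("}")) {
            after = after.substring(1, after.length() - 1);
        }
        return before.equals(after);
    }

    public static void main(String[] args) {
        String original = "my_list: {array: (array_element: (num1: int,num2: int))}";
        // Reparsed string as reported in the issue; the "array" bag is gone.
        String reported = "{my_list: {array_element: (num1: int,num2: int)}}";
        System.out.println(survives(original, s -> reported));
    }
}
```

With Pig on the classpath, the function argument would presumably be a small wrapper around Utils.getSchemaFromString(s).toString() that handles its checked exception.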