[
https://issues.apache.org/jira/browse/SPARK-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716125#comment-14716125
]
Yin Huai commented on SPARK-10301:
----------------------------------
This one seems hard to fix because, on the executor side, we actually read
data using each Parquet file's own schema, and that schema can contain struct
fields that do not appear in the global schema.
For now, the workaround is to enable schema merging (set {{mergeSchema}} to
true when loading a Parquet dataset), so that the global schema is always a
superset of each file's local schema.
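
To illustrate why a merged global schema avoids the failure, here is a minimal
sketch in plain Python. It is a toy model, not Spark's actual schema-merging
code: schemas are represented as nested dicts (a dict value stands for a
struct, a string for a leaf type), and the names merge_schemas, covers, file_a
and file_b are hypothetical, introduced only for this example.

```python
# Toy model: a schema is a nested dict; a dict value is a struct,
# a string value is a leaf type. Illustration only, not Spark internals.

def merge_schemas(left, right):
    """Return the union of two schemas, recursing into struct fields."""
    merged = dict(left)
    for name, dtype in right.items():
        if name not in merged:
            merged[name] = dtype
        elif isinstance(merged[name], dict) and isinstance(dtype, dict):
            merged[name] = merge_schemas(merged[name], dtype)
        elif merged[name] != dtype:
            raise ValueError("conflicting types for field %r" % name)
    return merged


def covers(global_schema, file_schema):
    """True if every field of file_schema (including nested struct
    fields) also appears in global_schema."""
    for name, dtype in file_schema.items():
        if name not in global_schema:
            return False
        if isinstance(dtype, dict):
            if not isinstance(global_schema[name], dict):
                return False
            if not covers(global_schema[name], dtype):
                return False
    return True


# Two files whose struct column "s" differs: file_b has an extra field.
file_a = {"id": "long", "s": {"a": "int"}}
file_b = {"id": "long", "s": {"a": "int", "b": "string"}}

# Without merging, the global schema may come from a single file.
unmerged_global = file_a
# With merging, the global schema is the union of all file schemas.
merged_global = merge_schemas(file_a, file_b)

print(covers(unmerged_global, file_b))  # False: s.b is missing globally
print(covers(merged_global, file_b))    # True: the global schema is a superset
```

In Spark itself, the option can be passed when loading, for example
sqlContext.read.option("mergeSchema", "true").parquet(path), where path is a
placeholder for the dataset location.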
> For struct type, if parquet's global schema has less fields than a file's
> schema, data reading will fail
> --------------------------------------------------------------------------------------------------------
>
> Key: SPARK-10301
> URL: https://issues.apache.org/jira/browse/SPARK-10301
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.0
> Reporter: Yin Huai
> Assignee: Yin Huai
> Priority: Critical
>
> When Parquet's global schema has fewer fields than the local schema
> of a file, the data reading path will fail.