[ https://issues.apache.org/jira/browse/SPARK-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-17071.
----------------------------------
    Resolution: Won't Fix

I am resolving this. Please refer to the discussion in the PR.

> Fetch Parquet schema within driver-side when there is single file to touch
> without another Spark job
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17071
>                 URL: https://issues.apache.org/jira/browse/SPARK-17071
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Hyukjin Kwon
>
> Currently, it seems Spark always launches a distributed job to fetch and
> merge Parquet schemas.
> It seems we don't actually have to run another Spark job when there is
> only a single file to touch (meaning without {{mergeSchema}}); we can just
> fetch the schema on the driver side, just like the ORC data source does.