[ https://issues.apache.org/jira/browse/SPARK-23896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-23896:
------------------------------------

    Assignee: Apache Spark

> Improve PartitioningAwareFileIndex
> ----------------------------------
>
>                 Key: SPARK-23896
>                 URL: https://issues.apache.org/jira/browse/SPARK-23896
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Gengliang Wang
>            Assignee: Apache Spark
>            Priority: Major
>
> Currently `PartitioningAwareFileIndex` accepts an optional parameter
> `userPartitionSchema`. If provided, it will combine the inferred partition
> schema with the parameter.
> However,
> 1. to get the inferred partition schema, we have to create a temporary file
> index.
> 2. to get `userPartitionSchema`, we need to combine inferred partition
> schema with `userSpecifiedSchema`
> Only after that, a final version of `PartitioningAwareFileIndex` is created.
>
> This can be improved by passing `userSpecifiedSchema` to
> `PartitioningAwareFileIndex`.
> With the improvement, we can reduce redundant code and avoid parsing the file
> partition twice.
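
The proposal in the issue description can be illustrated with a small model. The sketch below is not Spark code: `FileIndexSketch`, `parse_partition_values`, the string type names, and the rule that a user-specified type overrides the inferred one are all assumptions made for illustration. It only shows the idea of handing the full user-specified schema to the index so that partition paths are parsed once and no temporary index is needed.

```python
from typing import Dict, List, Optional, Tuple

Schema = List[Tuple[str, str]]  # ordered (column name, type name) pairs


def parse_partition_values(path: str) -> List[Tuple[str, str]]:
    """Extract (column, raw value) pairs from `col=value` directory segments."""
    pairs = []
    for segment in path.strip("/").split("/")[:-1]:  # last segment is the file name
        if "=" in segment:
            name, value = segment.split("=", 1)
            pairs.append((name, value))
    return pairs


class FileIndexSketch:
    """Hypothetical stand-in for a partition-aware file index (not Spark's class)."""

    def __init__(
        self,
        paths: List[str],
        user_specified_schema: Optional[Dict[str, str]] = None,
    ) -> None:
        self.parse_count = 0  # number of times the partition paths were scanned
        inferred = self._infer_partition_schema(paths)
        user = user_specified_schema or {}
        # Assumption: for a partition column, a user-specified type wins over
        # the inferred one; user columns that are not partitions are ignored.
        self.partition_schema: Schema = [
            (name, user.get(name, inferred_type)) for name, inferred_type in inferred
        ]

    def _infer_partition_schema(self, paths: List[str]) -> Schema:
        self.parse_count += 1
        values: Dict[str, List[str]] = {}
        for path in paths:
            for name, value in parse_partition_values(path):
                values.setdefault(name, []).append(value)
        # Toy inference: all-digit values become "int", anything else "string".
        return [
            (name, "int" if all(v.isdigit() for v in vs) else "string")
            for name, vs in values.items()
        ]


paths = [
    "/data/year=2018/month=04/part-0.parquet",
    "/data/year=2018/month=05/part-1.parquet",
]
# The full user schema (partition and data columns) goes straight to the index.
index = FileIndexSketch(paths, {"month": "string", "value": "double"})
print(index.partition_schema, index.parse_count)
```

In this model the caller never builds a throwaway index just to learn the inferred partition columns; the merge happens inside the constructor after a single scan of the paths.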