Gengliang Wang created SPARK-23896:
--------------------------------------

             Summary: Improve PartitioningAwareFileIndex
                 Key: SPARK-23896
                 URL: https://issues.apache.org/jira/browse/SPARK-23896
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.1
            Reporter: Gengliang Wang

Currently `PartitioningAwareFileIndex` accepts an optional parameter `userPartitionSchema`. If provided, it combines the inferred partition schema with that parameter.

However,
1. to get the inferred partition schema, we have to create a temporary file index.
2. to get `userPartitionSchema`, we need to combine the inferred partition schema with `userSpecifiedSchema`.

Only after that is the final version of `PartitioningAwareFileIndex` created.

This can be improved by passing `userSpecifiedSchema` directly to `PartitioningAwareFileIndex`.

With this improvement, we can reduce redundant code and avoid parsing the file partitions twice.
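
To make the difference between the two flows concrete, here is a toy model in plain Python. It is only an illustrative sketch and is not Spark code: `ToyFileIndex`, `Field`, `old_flow`, and `new_flow` are hypothetical names, the type inference is deliberately simplistic (digits become `int`, everything else `string`), and it assumes that a user-specified field overrides an inferred partition column of the same name.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Field:
    """A (name, type) pair standing in for a schema field. Hypothetical."""
    name: str
    dtype: str


class ToyFileIndex:
    """Toy stand-in for a partition-aware file index. Not Spark's class."""

    def __init__(self, paths: List[str], user_schema: Optional[List[Field]] = None):
        self.paths = paths
        self.user_schema = user_schema or []
        self.parse_count = 0  # how often this index parsed the partition paths
        self._partition_schema: Optional[List[Field]] = None

    def _infer(self) -> List[Field]:
        """Infer partition columns from key=value directory segments."""
        self.parse_count += 1
        seen = {}
        for path in self.paths:
            for segment in path.split("/"):
                if "=" not in segment:
                    continue
                key, value = segment.split("=", 1)
                inferred = "int" if value.isdigit() else "string"
                # Widen to string if files disagree on a column's type.
                if seen.get(key, inferred) != inferred:
                    inferred = "string"
                seen[key] = inferred
        return [Field(name, dtype) for name, dtype in seen.items()]

    @property
    def partition_schema(self) -> List[Field]:
        """Inferred partition columns, with user-specified fields taking priority."""
        if self._partition_schema is None:
            user = {f.name: f for f in self.user_schema}
            self._partition_schema = [user.get(f.name, f) for f in self._infer()]
        return self._partition_schema


def old_flow(paths: List[str], user_specified: List[Field]) -> Tuple[List[Field], int]:
    """Temporary index, merge schemas by hand, then build the final index."""
    temp = ToyFileIndex(paths)              # step 1: temporary file index
    inferred = temp.partition_schema        # first parse of the paths
    user = {f.name: f for f in user_specified}
    user_partition_schema = [user.get(f.name, f) for f in inferred]  # step 2
    final = ToyFileIndex(paths, user_partition_schema)
    schema = final.partition_schema         # second parse of the paths
    return schema, temp.parse_count + final.parse_count


def new_flow(paths: List[str], user_specified: List[Field]) -> Tuple[List[Field], int]:
    """Pass the user-specified schema straight to the index."""
    index = ToyFileIndex(paths, user_specified)
    return index.partition_schema, index.parse_count


if __name__ == "__main__":
    paths = ["/data/year=2018/month=04/part-0.parquet",
             "/data/year=2017/month=12/part-1.parquet"]
    user_specified = [Field("id", "long"), Field("year", "string")]
    print(old_flow(paths, user_specified))
    print(new_flow(paths, user_specified))
```

In this sketch both flows produce the same partition schema, but the old flow parses the partition paths twice (once in the temporary index and once in the final one), while the new flow parses them once.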