[ https://issues.apache.org/jira/browse/SPARK-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust updated SPARK-9272:
------------------------------------
    Target Version/s:   (was: 1.6.0)

> Persist information of individual partitions when persisting partitioned data source tables to metastore
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-9272
>                 URL: https://issues.apache.org/jira/browse/SPARK-9272
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Cheng Lian
>
> Currently, when a partitioned data source table is persisted to the Hive
> metastore, we only persist its partition columns. Information about
> individual partitions is not persisted. This forces us to run partition
> discovery before reading a persisted partitioned table, which hurts
> performance.
> To fix this issue, we could persist partition information to the metastore.
> Specifically, the format should be compatible with Hive to ensure
> interoperability.
> One approach to collecting partition values and partition directory paths
> for dynamically partitioned tables is to use accumulators to gather this
> information during the write job.
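
The accumulator idea in the issue description can be sketched without Spark as follows. This is only an illustration under stated assumptions: `PartitionAccumulator`, `write_task`, `run_write_job`, and the sample table path are hypothetical names, not Spark APIs, and the tasks run sequentially in one process. The only convention borrowed from Hive is the `column=value` directory layout; escaping of special characters in partition values is ignored.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]
# A partition spec is an ordered tuple of (column, value) pairs,
# e.g. (("year", "2015"), ("month", "7")).
PartitionSpec = Tuple[Tuple[str, str], ...]


class PartitionAccumulator:
    """Hypothetical stand-in for an accumulator: tasks add, the driver merges."""

    def __init__(self) -> None:
        self.partitions: Dict[PartitionSpec, str] = {}

    def add(self, spec: PartitionSpec, path: str) -> None:
        self.partitions[spec] = path

    def merge(self, other: "PartitionAccumulator") -> None:
        self.partitions.update(other.partitions)


def partition_path(base: str, spec: PartitionSpec) -> str:
    # Hive-style layout: <base>/<col1>=<val1>/<col2>=<val2>
    return "/".join([base.rstrip("/")] + [f"{col}={val}" for col, val in spec])


def write_task(rows: Iterable[Row], partition_cols: List[str],
               base: str) -> PartitionAccumulator:
    """Simulates one write task; records each partition it touches."""
    acc = PartitionAccumulator()
    for row in rows:
        spec = tuple((col, str(row[col])) for col in partition_cols)
        acc.add(spec, partition_path(base, spec))
        # A real task would also write the row into a file under that directory.
    return acc


def run_write_job(task_inputs: Iterable[Iterable[Row]], partition_cols: List[str],
                  base: str) -> Dict[PartitionSpec, str]:
    """Driver side: merge per-task results into one partition -> path map."""
    driver_acc = PartitionAccumulator()
    for rows in task_inputs:
        driver_acc.merge(write_task(rows, partition_cols, base))
    return driver_acc.partitions
```

After the job, the merged map holds one entry per distinct partition, which is the information that would then be registered in the metastore so that later reads can skip partition discovery.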