[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564058#comment-15564058 ]
Reynold Xin commented on SPARK-17861:
-------------------------------------

cc [~michael] this is the main work I want to get in for 2.1.

> Push data source partitions into metastore for catalog tables
> -------------------------------------------------------------
>
>                 Key: SPARK-17861
>                 URL: https://issues.apache.org/jira/browse/SPARK-17861
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Reynold Xin
>            Priority: Critical
>
> Initially, Spark SQL does not store any partition information in the catalog
> for data source tables, because initially it was designed to work with
> arbitrary files. This, however, has a few issues for catalog tables:
> 1. Listing partitions for a large table (with millions of partitions) can be
> very slow during cold start.
> 2. Does not support heterogeneous partition naming schemes.
> 3. Cannot leverage pushing partition pruning into the metastore.
> This ticket tracks the work required to push the tracking of partitions into
> the metastore.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
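
For readers unfamiliar with point 3 above, here is a rough sketch of the difference between listing every partition and filtering on the client versus asking the catalog for matching partitions only. This is a toy in-memory model, not Spark or Hive metastore code: `ToyMetastore`, its methods, and the `events` table are all hypothetical names made up for illustration.

```python
from typing import Dict, List

# A partition spec maps partition column names to values, e.g. {"day": "2016-10-02"}.
Partition = Dict[str, str]


class ToyMetastore:
    """Hypothetical in-memory catalog; stands in for a real metastore."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Partition]] = {}

    def add_partition(self, table: str, spec: Partition) -> None:
        self._partitions.setdefault(table, []).append(spec)

    def list_partitions(self, table: str) -> List[Partition]:
        # No pruning: the caller receives every partition and must filter itself.
        return list(self._partitions.get(table, []))

    def list_partitions_by_filter(self, table: str, wanted: Partition) -> List[Partition]:
        # Pruning pushed into the catalog: only matching partitions are returned.
        return [
            p for p in self._partitions.get(table, [])
            if all(p.get(k) == v for k, v in wanted.items())
        ]


store = ToyMetastore()
for day in ("2016-10-01", "2016-10-02", "2016-10-03"):
    for country in ("us", "de"):
        store.add_partition("events", {"day": day, "country": country})

everything = store.list_partitions("events")
pruned = store.list_partitions_by_filter("events", {"day": "2016-10-02"})
print(len(everything), len(pruned))
```

With millions of partitions, the first call is the expensive cold-start listing described in point 1, while the second only ever materializes the partitions a query needs. How the actual work in this ticket exposes that is not stated in the description above.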