[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-15691.
----------------------------------
    Resolution: Incomplete

> Refactor and improve Hive support
> ---------------------------------
>
>                 Key: SPARK-15691
>                 URL: https://issues.apache.org/jira/browse/SPARK-15691
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Reynold Xin
>            Priority: Major
>              Labels: bulk-closed
>
> Hive support is important to Spark SQL, as many Spark users use it to read from Hive. The current architecture is very difficult to maintain, and this ticket tracks progress towards getting us to a sane state.
> A number of things we want to accomplish are:
> - Move the Hive-specific catalog logic into HiveExternalCatalog.
> -- Remove HiveSessionCatalog. All Hive-related logic should go into HiveExternalCatalog. This would require moving caching either into HiveExternalCatalog or into SessionCatalog.
> -- Move the use of table properties to store data source options into HiveExternalCatalog (so that, for a CatalogTable returned by HiveExternalCatalog, we do not need to distinguish tables stored in Hive formats from data source tables).
> -- Potentially more.
> - Remove Hive's specific ScriptTransform implementation and make it more general so we can put it in sql/core.
> - Implement HiveTableScan (and the write path) as a data source, so we don't need a special planner rule for HiveTableScan.
> - Remove HiveSharedState and HiveSessionState.
> One thing that is still unclear to me is how to handle Hive UDF support. We might still need a special planner rule there.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
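The idea of storing data source options in table properties, so the catalog can return one uniform CatalogTable for both Hive-format and data source tables, can be sketched as below. This is a minimal illustration only: the simplified `CatalogTable` case class and the `spark.sql.sources.*` property keys are assumptions for the sketch, not Spark's actual internal API.

```scala
// Sketch: flatten a data source provider and its options into string table
// properties, and recover them later. A table with no provider property is
// treated as a Hive-format table. Names here are illustrative, not Spark's.
case class CatalogTable(name: String, properties: Map[String, String])

object DataSourceProps {
  private val ProviderKey   = "spark.sql.sources.provider"
  private val OptionPrefix  = "spark.sql.sources.options."

  // Encode a provider name and its options into flat string properties.
  def encode(provider: String, options: Map[String, String]): Map[String, String] =
    Map(ProviderKey -> provider) ++
      options.map { case (k, v) => (OptionPrefix + k) -> v }

  // Decode them back; None means the table is stored in a Hive format.
  def decode(table: CatalogTable): Option[(String, Map[String, String])] =
    table.properties.get(ProviderKey).map { provider =>
      val opts = table.properties.collect {
        case (k, v) if k.startsWith(OptionPrefix) =>
          k.stripPrefix(OptionPrefix) -> v
      }
      (provider, opts)
    }
}

object Demo extends App {
  val t = CatalogTable(
    "events",
    DataSourceProps.encode("parquet", Map("path" -> "/data/events")))
  println(DataSourceProps.decode(t))
}
```

With an encoding like this, callers of the external catalog read one properties map regardless of how the table is stored, which is the uniformity the ticket asks for.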