[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315964#comment-15315964 ]
Xiao Li commented on SPARK-15691:
---------------------------------

In the PDF document, all underlined text is a hyperlink that points to the related content.

> Refactor and improve Hive support
> ---------------------------------
>
>                 Key: SPARK-15691
>                 URL: https://issues.apache.org/jira/browse/SPARK-15691
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Reynold Xin
>
> Hive support is important to Spark SQL, as many Spark users use it to read
> from Hive. The current architecture is very difficult to maintain, and this
> ticket tracks progress towards getting us to a sane state.
> A number of things we want to accomplish are:
> - Move the Hive-specific catalog logic into HiveExternalCatalog.
> -- Remove HiveSessionCatalog. All Hive-related stuff should go into
> HiveExternalCatalog. This would require moving caching either into
> HiveExternalCatalog, or just into SessionCatalog.
> -- Move the use of properties to store data source options into
> HiveExternalCatalog.
> -- Potentially more.
> - Remove the Hive-specific ScriptTransform implementation and make it more
> general so we can put it in sql/core.
> - Implement HiveTableScan (and write path) as a data source, so we don't need
> a special planner rule for HiveTableScan.
> - Remove HiveSharedState and HiveSessionState.
> One thing that is still unclear to me is how to work with Hive UDF support.
> We might still need a special planner rule there.