[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315960#comment-15315960 ]
Xiao Li commented on SPARK-15691:
---------------------------------

For refactoring {{HiveMetastoreCatalog.scala}}, I just finished the design doc. The PDF version is available at the following link:
https://www.dropbox.com/s/tsaoq2joegkdh1h/2016.06.05.HiveMetastoreCatalog.scala%20Refactoring.pdf?dl=0

The original Markdown file can be downloaded via
https://www.dropbox.com/s/uita63wkdrmuqr2/2016.06.05.HiveMetastoreCatalog.scala%20Refactoring.md?dl=0

Please let me know if this is the right direction, and correct me if anything is not appropriate. [~rxin] Thank you very much!

> Refactor and improve Hive support
> ---------------------------------
>
> Key: SPARK-15691
> URL: https://issues.apache.org/jira/browse/SPARK-15691
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Reporter: Reynold Xin
>
> Hive support is important to Spark SQL, as many Spark users use it to read
> from Hive. The current architecture is very difficult to maintain, and this
> ticket tracks progress towards getting us to a sane state.
> A number of things we want to accomplish are:
> - Move the Hive-specific catalog logic into HiveExternalCatalog.
> -- Remove HiveSessionCatalog. All Hive-related stuff should go into
> HiveExternalCatalog. This would require moving caching either into
> HiveExternalCatalog, or just into SessionCatalog.
> -- Move the use of properties to store data source options into
> HiveExternalCatalog.
> -- Potentially more.
> - Remove the Hive-specific ScriptTransform implementation and make it more
> general so we can put it in sql/core.
> - Implement HiveTableScan (and write path) as a data source, so we don't need
> a special planner rule for HiveTableScan.
> - Remove HiveSharedState and HiveSessionState.
> One thing that is still unclear to me is how to work with Hive UDF support.
> We might still need a special planner rule there.