[ https://issues.apache.org/jira/browse/SPARK-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543329#comment-15543329 ]

Yan commented on SPARK-15777:
-----------------------------

1) Currently the rules are applied on a per-session basis. You are right that 
ideally they should be applied on a per-query basis, and we can move the 
design/implementation in that direction. Regarding evaluation ordering, item 5) 
of the "Scopes, Limitations and Open Questions" section covers this topic. In 
short, there is an ordering between the built-in rules and the custom rules, 
but not among the custom rules themselves. The plugin mechanism assumes 
cooperative behavior, so each plugged-in rule is expected to operate only on 
the parts of the plan that belong to its own data source, probably after some 
plan rewriting. Once the overall ideas are accepted by the community, we will 
flesh out the design doc and post the implementation in a WIP fashion.
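For illustration, here is a minimal sketch of how session-scoped rules are 
registered today through the existing experimental optimizer hook (the rule 
body is a placeholder, not part of the proposal):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule

    // Placeholder rule: a real federation rule would rewrite only the
    // sub-plans that refer to its own external catalog and leave the rest
    // of the plan untouched (the "cooperative" expectation above).
    object MyCatalogRule extends Rule[LogicalPlan] {
      override def apply(plan: LogicalPlan): LogicalPlan = plan
    }

    val spark = SparkSession.builder().appName("demo").getOrCreate()

    // Registered once on the session, so it applies to every subsequent
    // query in that session; custom rules have no defined ordering among
    // themselves, only relative to the built-in rule batches.
    spark.experimental.extraOptimizations ++= Seq(MyCatalogRule)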
2) As mentioned in the doc, this is not a complete design. Hopefully it lays 
down some basic concepts and principles that future work can be built on top 
of. For instance, a persistent catalog could itself be another major feature, 
but it is left out of the scope of this design for now without affecting the 
primary functionality.
3) The 3-level table identifier currently serves the namespace purpose. Yes, 
join queries against two tables with the same database and table names but 
different catalog names work well; see the sketch below. Arbitrary levels of 
namespaces are not supported yet.

Thanks for your comments.   

> Catalog federation
> ------------------
>
>                 Key: SPARK-15777
>                 URL: https://issues.apache.org/jira/browse/SPARK-15777
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Reynold Xin
>         Attachments: SparkFederationDesign.pdf
>
>
> This is a ticket to track progress to support federating multiple external 
> catalogs. This would require establishing an API (similar to the current 
> ExternalCatalog API) for getting information about external catalogs, and 
> the ability to convert a table into a data source table.
> As part of this, we would also need to be able to support more than a 
> two-level table identifier (database.table). At the very least we would need 
> a three-level identifier for tables (catalog.database.table). A possible 
> direction is to support arbitrary-level hierarchical namespaces similar to 
> file systems.
> Once we have this implemented, we can convert the current Hive catalog 
> implementation into an external catalog that is "mounted" into an internal 
> catalog.
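To make the quoted description concrete, here is a rough sketch of what a 
mountable catalog interface might look like (all names below are hypothetical 
illustrations, not the API proposed in the attached design doc):

    import org.apache.spark.sql.catalyst.catalog.CatalogTable

    // Hypothetical interface: each federated catalog exposes enough
    // metadata for Spark to resolve catalog.database.table identifiers
    // and to turn a resolved table into a data source table.
    trait FederatedCatalog {
      def name: String
      def listDatabases(): Seq[String]
      def listTables(database: String): Seq[String]
      def getTable(database: String, table: String): CatalogTable
    }

    // Hypothetical registry: the current Hive catalog implementation would
    // simply become one more entry "mounted" under a catalog name.
    class CatalogRegistry {
      private val catalogs = scala.collection.mutable.Map[String, FederatedCatalog]()
      def mount(catalog: FederatedCatalog): Unit = catalogs(catalog.name) = catalog
      def lookup(name: String): Option[FederatedCatalog] = catalogs.get(name)
    }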


