[ 
https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Blue updated SPARK-23204:
------------------------------
    Description: 
DataSourceV2 is currently only configured with a path, passed in options as 
{{path}}. For many data sources, like JDBC, a table name is more appropriate. I 
propose testing the "location" passed to load(String) and save(String) to see 
whether it is a path and, if not, parsing it as a table name and passing 
"database" and "table" options to readers and writers.
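
A minimal sketch of the check described above, in plain Java with no Spark dependency. The class name LocationOptions, the method names, and the rule for deciding what counts as a path (any string containing a path separator) are assumptions made for illustration. They are not part of Spark or of this proposal, which does not specify how the test would be done.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: maps the string given to
// load(String) / save(String) to the options a reader or writer would receive.
final class LocationOptions {
    private LocationOptions() {}

    // Assumed heuristic: a location containing a path separator is a path.
    // URIs such as "s3://bucket/key" match because they contain '/'.
    static boolean looksLikePath(String location) {
        return location.contains("/") || location.contains("\\");
    }

    // Returns {"path": ...} for paths, otherwise {"database": ..., "table": ...}
    // (or only "table" when no database qualifier is present).
    static Map<String, String> resolve(String location) {
        if (location == null || location.isEmpty()) {
            throw new IllegalArgumentException("location must be non-empty");
        }
        Map<String, String> options = new LinkedHashMap<>();
        if (looksLikePath(location)) {
            options.put("path", location);
            return options;
        }
        // Limit -1 keeps trailing empty parts so that "db." is rejected below.
        String[] parts = location.split("\\.", -1);
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("invalid table name: " + location);
            }
        }
        if (parts.length == 1) {
            options.put("table", parts[0]);
        } else if (parts.length == 2) {
            options.put("database", parts[0]);
            options.put("table", parts[1]);
        } else {
            throw new IllegalArgumentException("too many name parts: " + location);
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(resolve("hdfs://nn/warehouse/events"));
        System.out.println(resolve("db.events"));
        System.out.println(resolve("events"));
    }
}
```

This heuristic is ambiguous for bare relative file names: "events.parquet" would be read as database "events" and table "parquet". A real implementation would need a stricter rule for that case.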

This also creates a way to pass the table identifier when using DataSourceV2 
tables from SQL. For example, {{SELECT * FROM db.table}} creates an 
{{UnresolvedRelation(db,table)}} that could be resolved using the default 
source, passing the db and table name using the same options. Similarly, we can 
add a table property for the datasource implementation to metastore tables and 
add a rule to convert them to DataSourceV2 relations.

  was:
DataSourceV2 is currently only configured with a path, passed in options as 
{{path}}. For many data sources, like JDBC, a table name is more appropriate. I 
propose testing the "location" passed to load(String) and save(String) to see 
if it is a path and if not, parsing it as a table name and passing "database" 
and "table" options to readers and writers.

This also creates a way to pass the table identifier when using DataSourceV2 
tables from SQL. For example, {{SELECT * FROM db.table}} creates an 
{{UnresolvedRelation(db,table)}} that could be resolved using the default 
source, passing the db and table name using the same options.


> DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-23204
>                 URL: https://issues.apache.org/jira/browse/SPARK-23204
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Ryan Blue
>            Priority: Major
>
> DataSourceV2 is currently only configured with a path, passed in options as 
> {{path}}. For many data sources, like JDBC, a table name is more appropriate. 
> I propose testing the "location" passed to load(String) and save(String) to 
> see whether it is a path and, if not, parsing it as a table name and passing 
> "database" and "table" options to readers and writers.
>
> This also creates a way to pass the table identifier when using DataSourceV2 
> tables from SQL. For example, {{SELECT * FROM db.table}} creates an 
> {{UnresolvedRelation(db,table)}} that could be resolved using the default 
> source, passing the db and table name using the same options. Similarly, we 
> can add a table property for the datasource implementation to metastore 
> tables and add a rule to convert them to DataSourceV2 relations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

