[ 
https://issues.apache.org/jira/browse/SPARK-36163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-36163:
------------------------------------

    Assignee: Ivan

> Propagate correct JDBC properties in JDBC connector provider and add 
> "connectionProvider" option
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36163
>                 URL: https://issues.apache.org/jira/browse/SPARK-36163
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0, 3.1.1, 3.1.2
>            Reporter: Ivan
>            Assignee: Ivan
>            Priority: Major
>
> There are a couple of issues with JDBC connection providers. The first is a 
> bug caused by 
> [https://github.com/apache/spark/commit/c3ce9701b458511255072c72b9b245036fa98653]
>  where we pass all properties, including JDBC data source keys, to the 
> JDBC driver, which results in errors like {{java.sql.SQLException: 
> Unrecognized connection property 'url'}}.
> Connection properties are supposed to include only vendor properties; the url 
> config is a JDBC data source option and should be excluded.
> The fix is to replace {{jdbcOptions.asProperties.asScala.foreach}} with 
> {{jdbcOptions.asConnectionProperties.asScala.foreach}}, which is 
> {{java.sql.Driver}} friendly.
>  
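
The difference between the two property sets can be illustrated with a minimal, self-contained sketch. It is not Spark's implementation: the object name, the method names, and the short list of data source keys are assumptions made for illustration only. It shows how dropping data source keys such as url keeps them away from the driver.

```scala
import java.util.{Locale, Properties}

object ConnectionPropertiesSketch {
  // Hypothetical, incomplete list of data source option names (lower-cased).
  val dataSourceKeys: Set[String] = Set("url", "dbtable", "driver", "fetchsize")

  // Copies every option, data source keys included. Handing this to a JDBC
  // driver is what can trigger "Unrecognized connection property 'url'".
  def allProperties(options: Map[String, String]): Properties = {
    val props = new Properties()
    options.foreach { case (k, v) => props.setProperty(k, v) }
    props
  }

  // Copies only the options that are not data source keys, so that just the
  // vendor-specific properties reach the driver.
  def connectionProperties(options: Map[String, String]): Properties = {
    val props = new Properties()
    options
      .filter { case (k, _) => !dataSourceKeys.contains(k.toLowerCase(Locale.ROOT)) }
      .foreach { case (k, v) => props.setProperty(k, v) }
    props
  }
}
```

With options containing url, dbtable, user and password, the first method keeps all four keys while the second keeps only user and password.
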
> I also investigated the problem with multiple providers and I think there are 
> a couple of oversights in the {{ConnectionProvider}} implementation. It 
> is missing two things:
>  * Any {{JdbcConnectionProvider}} should take precedence over 
> {{BasicConnectionProvider}}. {{BasicConnectionProvider}} should be 
> selected only if no other provider is found that can handle the 
> JDBC url.
>  * There is currently no way to select a specific provider, 
> similar to how you can select a JDBC driver. One use case is 
> having connection providers for two databases that handle the same URL but 
> have slightly different semantics, where you want to select one provider in 
> some cases and the other provider in the rest.
>  ** I think the first point could be discarded when the second one is 
> addressed.
> You can technically use {{spark.sql.sources.disabledJdbcConnProviderList}} to 
> exclude providers that are not needed, but I am not sure why it 
> was done that way; it is much simpler to let users choose the provider 
> they want.
> This ticket addresses this by adding a {{connectionProvider}} option to the JDBC 
> data source that allows users to select a particular provider when such 
> ambiguity arises.
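
The selection behaviour described in the ticket can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the actual change: the Provider case class, the select function, and the error messages are hypothetical, and the special handling of BasicConnectionProvider as a fallback is left out. The sketch picks a provider by name when one is requested and otherwise insists on exactly one candidate.

```scala
object ProviderSelectionSketch {
  // Hypothetical, simplified stand-in for a JDBC connection provider.
  final case class Provider(name: String, canHandle: String => Boolean)

  // Returns the chosen provider, or an error message when none can be chosen.
  def select(
      providers: Seq[Provider],
      url: String,
      requested: Option[String]): Either[String, Provider] = {
    // Keep only the providers that claim they can handle this URL.
    val candidates = providers.filter(_.canHandle(url))
    requested match {
      // An explicit choice wins, as long as that provider can handle the URL.
      case Some(name) =>
        candidates.find(_.name == name)
          .toRight(s"No provider named '$name' can handle $url")
      // Without an explicit choice, exactly one candidate must remain.
      case None =>
        candidates match {
          case Seq(only) => Right(only)
          case Seq() => Left(s"No provider can handle $url")
          case _ => Left("More than one provider matches; set 'connectionProvider'")
        }
    }
  }
}
```

With two providers that both accept the same URL, select fails when no name is given and succeeds once a name is supplied. On the user side, the ticket's option would presumably be passed like any other JDBC data source option, for example .option("connectionProvider", "db-b") on a reader, where the value "db-b" is a made-up provider name.
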



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
