[ https://issues.apache.org/jira/browse/SPARK-36163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-36163:
------------------------------------

    Assignee: Apache Spark

> Propagate correct JDBC properties in JDBC connector provider and add
> "connectionProvider" option
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36163
>                 URL: https://issues.apache.org/jira/browse/SPARK-36163
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0, 3.1.1, 3.1.2
>            Reporter: Ivan
>            Assignee: Apache Spark
>            Priority: Major
>
> There are a couple of issues with JDBC connection providers. The first is a
> bug caused by
> [https://github.com/apache/spark/commit/c3ce9701b458511255072c72b9b245036fa98653]
> where we would pass all properties, including JDBC data source keys, to the
> JDBC driver, which results in errors like
> {{java.sql.SQLException: Unrecognized connection property 'url'}}.
> Connection properties are supposed to include only vendor properties; the
> {{url}} config is a JDBC data source option and should be excluded.
> The fix would be replacing {{jdbcOptions.asProperties.asScala.foreach}} with
> {{jdbcOptions.asConnectionProperties.asScala.foreach}}, which is
> {{java.sql.Driver}} friendly.
>
> I also investigated the problem with multiple providers and I think there are
> a couple of oversights in the {{ConnectionProvider}} implementation. I think
> it is missing two things:
> * Any {{JdbcConnectionProvider}} should take precedence over
> {{BasicConnectionProvider}}. {{BasicConnectionProvider}} should only be
> selected if no match was found when inferring the providers that can handle
> the JDBC url.
> * There is currently no way to select a specific provider that you want,
> similar to how you can select a JDBC driver. The use case is, for example,
> having connection providers for two databases that handle the same URL but
> have slightly different semantics, where you want to select one in some
> cases and the other in others.
> ** I think the first point could be discarded when the second one is
> addressed.
>
> You can technically use {{spark.sql.sources.disabledJdbcConnProviderList}} to
> exclude providers that should not be used, but I am not quite sure why it
> was done that way; it is much simpler to allow users to specify the provider
> they want.
> This ticket fixes it by adding a {{connectionProvider}} option to the JDBC
> data source that allows users to select a particular provider when
> ambiguity arises.
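
The two changes described in the ticket can be modeled in a few lines. The
following is a minimal, self-contained Python sketch of the proposed behavior,
not Spark's Scala implementation. The names DATA_SOURCE_KEYS, Provider,
as_connection_properties and select_provider are hypothetical, the key set is
only an illustrative subset of the JDBC data source options, and both raising
an error on ambiguity and matching option keys case-insensitively are
assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical, partial set of JDBC data source option keys (lowercased).
# Assumption: the real list is longer and is defined by Spark's JDBC options.
DATA_SOURCE_KEYS = {"url", "dbtable", "driver", "connectionprovider"}


def as_connection_properties(options: Dict[str, str]) -> Dict[str, str]:
    """Keep only vendor properties; drop data source keys such as 'url'."""
    return {k: v for k, v in options.items() if k.lower() not in DATA_SOURCE_KEYS}


@dataclass(frozen=True)
class Provider:
    name: str
    url_prefix: str  # "" accepts every URL, like a basic fallback provider

    def can_handle(self, url: str) -> bool:
        return url.startswith(self.url_prefix)


def select_provider(
    providers: List[Provider], url: str, requested: Optional[str] = None
) -> Provider:
    """Pick a provider for `url`, honoring an explicit `requested` name."""
    candidates = [p for p in providers if p.can_handle(url)]

    # Second point of the ticket: an explicit choice wins over inference.
    if requested is not None:
        for p in candidates:
            if p.name == requested:
                return p
        raise ValueError(f"provider {requested!r} cannot handle {url!r}")

    # First point of the ticket: any specific provider beats the basic one.
    specific = [p for p in candidates if p.name != "basic"]
    if len(specific) == 1:
        return specific[0]
    if len(specific) > 1:
        # Assumption: ambiguity is reported instead of picking arbitrarily.
        names = ", ".join(p.name for p in specific)
        raise ValueError(f"ambiguous providers for {url!r}: {names}")

    # No specific match: fall back to the basic provider if it is registered.
    for p in candidates:
        if p.name == "basic":
            return p
    raise ValueError(f"no provider can handle {url!r}")
```

In this model, two providers that accept the same URL prefix cause an error
unless the caller passes `requested`, which plays the role of the
{{connectionProvider}} option. The result of as_connection_properties contains
no {{url}} key, so a driver would never see the property named in the
exception above.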