[ 
https://issues.apache.org/jira/browse/SPARK-14204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400668#comment-15400668
 ] 

Apache Spark commented on SPARK-14204:
--------------------------------------

User 'mchalek' has created a pull request for this issue:
https://github.com/apache/spark/pull/14420

> [SQL] Failure to register URL-derived JDBC driver on executors in cluster mode
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-14204
>                 URL: https://issues.apache.org/jira/browse/SPARK-14204
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Kevin McHale
>            Assignee: Kevin McHale
>              Labels: JDBC, SQL
>             Fix For: 1.6.2
>
>
> DataFrameReader JDBC methods throw an IllegalStateException when:
>   1. the JDBC driver is contained in a user-provided jar, and
>   2. the user does not specify which driver to use, instead allowing Spark 
> to infer the driver class from the JDBC URL.
> This broke some of our database ETL jobs at @premisedata when we upgraded 
> from 1.6.0 to 1.6.1.
> I have tracked the problem down to a regression introduced in the fix for 
> SPARK-12579: 
> https://github.com/apache/spark/commit/7f37c1e45d52b7823d566349e2be21366d73651f#diff-391379a5ec51082e2ae1209db15c02b3R53
> The issue is that DriverRegistry.register is never called on the executors 
> for a JDBC driver class that is derived from the JDBC URL.
> The problem can be demonstrated within spark-shell, provided you're in 
> cluster mode and you've deployed a JDBC driver (e.g. postgresql.Driver) via 
> the --jars argument:
> {code}
> import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils.createConnectionFactory
>
> val factory = createConnectionFactory(
>   "jdbc:postgresql://whatever.you.want/database?user=user&password=password",
>   new java.util.Properties)
>
> sc.parallelize(1 to 100).foreach { _ => factory() } // throws IllegalStateException
> {code}
> A sufficient fix is to apply DriverRegistry.register to the `driverClass` 
> variable, rather than to `userSpecifiedDriverClass`, at the code link 
> provided above.  I will submit a PR for this shortly.
> In the meantime, a temporary workaround is to manually specify the JDBC 
> driver class in the Properties object passed to DataFrameReader.jdbc, or in 
> the options used in other entry points, which will force the executors to 
> register the class properly.
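The workaround described above amounts to setting the `driver` key in the JDBC connection properties before calling DataFrameReader.jdbc. A minimal sketch, assuming the PostgreSQL driver jar has been deployed via --jars; the URL, table name, and credentials are placeholders, not values from this issue:

```scala
import java.util.Properties

// Explicitly name the JDBC driver class so that the executors register it,
// rather than relying on Spark 1.6.1 to derive it from the JDBC URL
// (the code path this issue reports as broken).
val props = new Properties()
props.setProperty("user", "user")                     // placeholder credentials
props.setProperty("password", "password")
props.setProperty("driver", "org.postgresql.Driver")  // forces registration on executors

// With a live SQLContext this would then be used as (not runnable here):
// val df = sqlContext.read.jdbc("jdbc:postgresql://host/db", "my_table", props)
```

Because the driver class is now user-specified, the executors take the code path that already calls DriverRegistry.register, sidestepping the regression until the fix lands.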



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
