You should probably add --driver-class-path with the jar as well. In theory --jars should add it to the driver's classpath too, but in my experience it does not (I think there was a JIRA open on it). In any case you can find it on Stack Overflow: see http://stackoverflow.com/questions/40995943/connect-to-oracle-db-using-pyspark/41000181#41000181. Another thing you might want to try is adding the "driver" option to the read; see http://stackoverflow.com/questions/36326066/working-with-jdbc-jar-in-pyspark/36328672#36328672.

Assaf
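Concretely, a sketch of the two suggestions above, using the jar path from this thread (adjust to your environment):

```shell
# Put the Oracle JDBC jar on both the executor classpath (--jars)
# and the driver JVM's classpath (--driver-class-path):
spark-shell --jars /home/hduser/jars/ojdbc6.jar \
            --driver-class-path /home/hduser/jars/ojdbc6.jar
```

Inside the shell, the read can additionally name the driver class explicitly via the JDBC source's "driver" option, i.e. adding "driver" -> "oracle.jdbc.OracleDriver" to the options Map, so Spark does not rely on DriverManager discovery.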
From: Léo Biscassi [mailto:leo.bisca...@gmail.com]
Sent: Tuesday, December 27, 2016 2:59 PM
To: Mich Talebzadeh; Deepak Sharma
Cc: user @spark
Subject: Re: Location for the additional jar files in Spark

Hi all,
I have the same problem with Spark 2.0.2.
Best regards,

On Tue, Dec 27, 2016, 9:40 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks Deepak, but I get the same error unfortunately:

  ADD_JARS="/home/hduser/jars/ojdbc6.jar" spark-shell
  Spark context Web UI available at http://50.140.197.217:4041
  Spark context available as 'sc' (master = local[*], app id = local-1482842478988).
  Spark session available as 'spark'.
  Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
        /_/

  Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
  Type in expressions to have them evaluated.
  Type :help for more information.

  scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
  warning: there was one deprecation warning; re-run with -deprecation for details
  HiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@a323a5b

  scala> println ("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect.foreach(println)
  Started at
  [27/12/2016 12:41:43.43]

  scala> var _ORACLEserver = "jdbc:oracle:thin:@rhes564:1521:mydb12"
  _ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12

  scala> var _username = "scratchpad"
  _username: String = scratchpad

  scala> var _password = "oracle"
  _password: String = oracle

  scala> val s = HiveContext.read.format("jdbc").options(
       | Map("url" -> _ORACLEserver,
       | "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)",
       | "partitionColumn" -> "ID",
       | "lowerBound" -> "1",
       | "upperBound" -> "100000000",
       | "numPartitions" -> "10",
       | "user" -> _username,
       | "password" -> _password)).load
  java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:53)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:123)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:315)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
    ... 56 elided

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
On 27 December 2016 at 11:37, Deepak Sharma <deepakmc...@gmail.com> wrote:

How about this:

  ADD_JARS="/home/hduser/jars/ojdbc6.jar" spark-shell

Thanks
Deepak

On Tue, Dec 27, 2016 at 5:04 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Ok, I tried this but no luck:

  spark-shell --jars /home/hduser/jars/ojdbc6.jar
  Spark context Web UI available at http://50.140.197.217:4041
  Spark context available as 'sc' (master = local[*], app id = local-1482838526271).
  Spark session available as 'spark'.

Running the same session as before (HiveContext, then the JDBC read against scratchpad.dummy) fails the same way:

  scala> val s = HiveContext.read.format("jdbc").options(
       | Map("url" -> _ORACLEserver,
       | "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)",
       | "partitionColumn" -> "ID",
       | "lowerBound" -> "1",
       | "upperBound" -> "100000000",
       | "numPartitions" -> "10",
       | "user" -> _username,
       | "password" -> _password)).load
  java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    (same stack trace as shown earlier in the thread)
    ... 56 elided

Dr Mich Talebzadeh

On 27 December 2016 at 11:23, Deepak Sharma <deepakmc...@gmail.com> wrote:

I meant ADD_JARS, since you said --jars is not working for you with spark-shell.
Thanks
Deepak

On Tue, Dec 27, 2016 at 4:51 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Ok, just to be clear, do you mean

  ADD_JARS="~/jars/ojdbc6.jar" spark-shell

or

  spark-shell --jars $ADD_JARS

Thanks,
Dr Mich Talebzadeh

On 27 December 2016 at 10:30, Deepak Sharma <deepakmc...@gmail.com> wrote:

It works for me with Spark 1.6 (--jars). Please try this:

  ADD_JARS="<<PATH_TO_JAR>>" spark-shell

Thanks
Deepak

On Tue, Dec 27, 2016 at 3:49 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks. The problem is that with spark-shell, --jars does not work! This is Spark 2 accessing Oracle 12c:

  spark-shell --jars /home/hduser/jars/ojdbc6.jar

It comes back with java.sql.SQLException: No suitable driver, unfortunately. And spark-shell uses spark-submit under the bonnet, if you look at the shell script:

  "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"

Dr Mich Talebzadeh
On 27 December 2016 at 09:52, Deepak Sharma <deepakmc...@gmail.com> wrote:

Hi Mich,
You can copy the jar to a shared location and use the --jars command-line argument of spark-submit. Whoever needs access to this jar can refer to the shared path and access it using --jars.

Thanks
Deepak

On Tue, Dec 27, 2016 at 3:03 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

When one runs in local mode (one JVM) on an edge host (the host from which the user accesses the cluster), it is possible to put an additional jar file, say one for accessing Oracle RDBMS tables, in $SPARK_CLASSPATH. This works:

  export SPARK_CLASSPATH=~/user_jars/ojdbc6.jar

Normally a group of users has read access to a shared directory like the above, and once they log in, their shell invokes an environment file that sets up this classpath plus additional parameters such as $JAVA_HOME.

However, if the user chooses to run Spark through spark-submit with YARN, then the only way this will work, in my research, is to add the jar path on every node of the Spark cluster in $SPARK_HOME/conf/spark-defaults.conf:

  spark.executor.extraClassPath /user_jars/ojdbc6.jar

Note that setting both spark.executor.extraClassPath and SPARK_CLASSPATH will cause an initialisation error:

  ERROR SparkContext: Error initializing SparkContext.
  org.apache.spark.SparkException: Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the former.

I was wondering if there are other ways of making this work in YARN mode, where every node of the cluster will require this jar file?
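For reference, a minimal sketch of the spark-defaults.conf approach described above, using the jar path from the thread. The spark.driver.extraClassPath line is an assumption on my part: it is the documented replacement for SPARK_CLASSPATH on the driver side, and covers the driver JVM the same way spark.executor.extraClassPath covers the executors:

```
# $SPARK_HOME/conf/spark-defaults.conf, on every node of the cluster
spark.executor.extraClassPath  /user_jars/ojdbc6.jar
spark.driver.extraClassPath    /user_jars/ojdbc6.jar
```

Leave SPARK_CLASSPATH unset when using these, otherwise SparkContext fails to initialise with the "Found both" error quoted above.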
Thanks

--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net

--
Best regards,
Léo Biscassi