You should probably add --driver-class-path with the jar as well. In theory --jars should add it to the driver's classpath too, but in my experience it does not (I think there was a JIRA open on it). In any case you can find it on Stack Overflow: see http://stackoverflow.com/questions/40995943/connect-to-oracle-db-using-pyspark/41000181#41000181. Another thing you might want to try is adding the "driver" option to the read; see http://stackoverflow.com/questions/36326066/working-with-jdbc-jar-in-pyspark/36328672#36328672.

Assaf
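Concretely, a sketch of the two suggestions above, using the jar path from this thread (adjust to your environment):

```shell
# Put the Oracle JDBC jar on both the executor classpath (--jars)
# and the driver JVM's classpath (--driver-class-path):
spark-shell --jars /home/hduser/jars/ojdbc6.jar \
            --driver-class-path /home/hduser/jars/ojdbc6.jar
```

Inside the shell, the read can additionally name the driver class explicitly via the JDBC source's "driver" option, i.e. adding "driver" -> "oracle.jdbc.OracleDriver" to the options Map, so Spark does not rely on DriverManager discovery.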
From: Léo Biscassi [mailto:leo.bisca...@gmail.com]
Sent: Tuesday, December 27, 2016 2:59 PM
To: Mich Talebzadeh; Deepak Sharma
Cc: user @spark
Subject: Re: Location for the additional jar files in Spark

Hi all,
I have the same problem with Spark 2.0.2.
Best regards,

On Tue, Dec 27, 2016, 9:40 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks Deepak, but I get the same error unfortunately:

  ADD_JARS="/home/hduser/jars/ojdbc6.jar" spark-shell
  Spark context Web UI available at http://50.140.197.217:4041
  Spark context available as 'sc' (master = local[*], app id = local-1482842478988).
  Spark session available as 'spark'.
  Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
        /_/

  Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
  Type in expressions to have them evaluated.
  Type :help for more information.

  scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
  warning: there was one deprecation warning; re-run with -deprecation for details
  HiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@a323a5b

  scala> println ("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect.foreach(println)
  Started at
  [27/12/2016 12:41:43.43]

  scala> var _ORACLEserver = "jdbc:oracle:thin:@rhes564:1521:mydb12"
  _ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12

  scala> var _username = "scratchpad"
  _username: String = scratchpad

  scala> var _password = "oracle"
  _password: String = oracle

  scala> val s = HiveContext.read.format("jdbc").options(
       | Map("url" -> _ORACLEserver,
       | "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)",
       | "partitionColumn" -> "ID",
       | "lowerBound" -> "1",
       | "upperBound" -> "100000000",
       | "numPartitions" -> "10",
       | "user" -> _username,
       | "password" -> _password)).load
  java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:53)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:123)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:315)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
    ... 56 elided

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
On 27 December 2016 at 11:37, Deepak Sharma <deepakmc...@gmail.com> wrote:

How about this:

  ADD_JARS="/home/hduser/jars/ojdbc6.jar" spark-shell

Thanks
Deepak

On Tue, Dec 27, 2016 at 5:04 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Ok, I tried this but no luck:

  spark-shell --jars /home/hduser/jars/ojdbc6.jar
  Spark context Web UI available at http://50.140.197.217:4041
  Spark context available as 'sc' (master = local[*], app id = local-1482838526271).
  Spark session available as 'spark'.

Running the same session as before (HiveContext, then the JDBC read against scratchpad.dummy) fails the same way:

  scala> val s = HiveContext.read.format("jdbc").options(
       | Map("url" -> _ORACLEserver,
       | "dbtable" -> "(SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM scratchpad.dummy)",
       | "partitionColumn" -> "ID",
       | "lowerBound" -> "1",
       | "upperBound" -> "100000000",
       | "numPartitions" -> "10",
       | "user" -> _username,
       | "password" -> _password)).load
  java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    (same stack trace as shown earlier in the thread)
    ... 56 elided

Dr Mich Talebzadeh

On 27 December 2016 at 11:23, Deepak Sharma <deepakmc...@gmail.com> wrote:

I meant ADD_JARS, since you said --jars is not working for you with spark-shell.
Thanks
Deepak

On Tue, Dec 27, 2016 at 4:51 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Ok, just to be clear, do you mean

  ADD_JARS="~/jars/ojdbc6.jar" spark-shell

or

  spark-shell --jars $ADD_JARS

Thanks,
Dr Mich Talebzadeh

On 27 December 2016 at 10:30, Deepak Sharma <deepakmc...@gmail.com> wrote:

It works for me with Spark 1.6 (--jars). Please try this:

  ADD_JARS="<<PATH_TO_JAR>>" spark-shell

Thanks
Deepak

On Tue, Dec 27, 2016 at 3:49 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Thanks. The problem is that with spark-shell, --jars does not work! This is Spark 2 accessing Oracle 12c:

  spark-shell --jars /home/hduser/jars/ojdbc6.jar

It comes back with java.sql.SQLException: No suitable driver, unfortunately. And spark-shell uses spark-submit under the bonnet, if you look at the shell script:

  "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"

Dr Mich Talebzadeh
On 27 December 2016 at 09:52, Deepak Sharma <deepakmc...@gmail.com> wrote:

Hi Mich,
You can copy the jar to a shared location and use the --jars command-line argument of spark-submit. Whoever needs access to this jar can refer to the shared path and access it using --jars.

Thanks
Deepak

On Tue, Dec 27, 2016 at 3:03 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

When one runs in local mode (one JVM) on an edge host (the host from which the user accesses the cluster), it is possible to put an additional jar file, say one for accessing Oracle RDBMS tables, in $SPARK_CLASSPATH. This works:

  export SPARK_CLASSPATH=~/user_jars/ojdbc6.jar

Normally a group of users has read access to a shared directory like the above, and once they log in, their shell invokes an environment file that sets up this classpath plus additional parameters such as $JAVA_HOME.

However, if the user chooses to run Spark through spark-submit with YARN, then the only way this will work, in my research, is to add the jar path on every node of the Spark cluster in $SPARK_HOME/conf/spark-defaults.conf:

  spark.executor.extraClassPath /user_jars/ojdbc6.jar

Note that setting both spark.executor.extraClassPath and SPARK_CLASSPATH will cause an initialisation error:

  ERROR SparkContext: Error initializing SparkContext.
  org.apache.spark.SparkException: Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the former.

I was wondering if there are other ways of making this work in YARN mode, where every node of the cluster will require this jar file?
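For reference, a minimal sketch of the spark-defaults.conf approach described above, using the jar path from the thread. The spark.driver.extraClassPath line is an assumption on my part: it is the documented replacement for SPARK_CLASSPATH on the driver side, and covers the driver JVM the same way spark.executor.extraClassPath covers the executors:

```
# $SPARK_HOME/conf/spark-defaults.conf, on every node of the cluster
spark.executor.extraClassPath  /user_jars/ojdbc6.jar
spark.driver.extraClassPath    /user_jars/ojdbc6.jar
```

Leave SPARK_CLASSPATH unset when using these, otherwise SparkContext fails to initialise with the "Found both" error quoted above.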
Thanks

--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net

--
Best regards,
Léo Biscassi