Hi Nathan,

I'm also hitting this issue with Spark 1.3. Did you find a workaround? Please help.
Thanks,
Sathish

On Thu, Apr 16, 2015 at 6:03 AM Nathan McCarthy <nathan.mccar...@quantium.com.au> wrote:

> It's jTDS 1.3.1; http://sourceforge.net/projects/jtds/files/jtds/1.3.1/
>
> I put that jar in /tmp on the driver/machine I'm running spark-shell from.
>
> Then I ran with ./bin/spark-shell --jars /tmp/jtds-1.3.1.jar --master yarn-client
>
> So I'm guessing that --jars doesn't set the class path for the primordial class loader, and that the driver only ends up on the class path in 'user land'.
>
> A workaround might be to merge the jTDS driver into my Spark assembly jar, but that seems like a hack. The other thing I notice is --files, which lets me distribute files via YARN, so I'm thinking I can somehow use that if --jars doesn't work.
>
> Really I need to understand how the Spark class path is set when running on YARN.
>
> From: "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepuj...@gmail.com>
> Date: Thursday, 16 April 2015 3:02 pm
> To: Nathan <nathan.mccar...@quantium.com.au>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: SparkSQL JDBC Datasources API when running on YARN - Spark 1.3.0
>
> Can you provide the JDBC connector jar version? Ideally the full JAR name and the full command you ran Spark with.
>
> On Wed, Apr 15, 2015 at 11:27 AM, Nathan McCarthy <nathan.mccar...@quantium.com.au> wrote:
>
>> Just an update: I tried with the old JdbcRDD and that worked fine.
>>
>> From: Nathan <nathan.mccar...@quantium.com.au>
>> Date: Wednesday, 15 April 2015 1:57 pm
>> To: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: SparkSQL JDBC Datasources API when running on YARN - Spark 1.3.0
>>
>> Hi guys,
>>
>> I'm trying to use a Spark SQL context's .load("jdbc", ...) method to create a DataFrame from a JDBC data source. All seems to work well locally (master = local[*]), however as soon as we try to run on YARN we have problems.
>>
>> We seem to be running into problems with the class path and loading the JDBC driver. I'm using the jTDS 1.3.1 driver, net.sourceforge.jtds.jdbc.Driver.
>>
>> ./bin/spark-shell --jars /tmp/jtds-1.3.1.jar --master yarn-client
>>
>> When trying to run I get an exception:
>>
>> scala> sqlContext.load("jdbc", Map("url" -> "jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd", "dbtable" -> "CUBE.DIM_SUPER_STORE_TBL"))
>>
>> java.sql.SQLException: No suitable driver found for jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd
>>
>> Thinking maybe we need to force-load the driver, if I supply "driver" -> "net.sourceforge.jtds.jdbc.Driver" to .load we get:
>>
>> scala> sqlContext.load("jdbc", Map("url" -> "jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd", "driver" -> "net.sourceforge.jtds.jdbc.Driver", "dbtable" -> "CUBE.DIM_SUPER_STORE_TBL"))
>>
>> java.lang.ClassNotFoundException: net.sourceforge.jtds.jdbc.Driver
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>   at java.lang.Class.forName0(Native Method)
>>   at java.lang.Class.forName(Class.java:191)
>>   at org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:97)
>>   at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:290)
>>   at org.apache.spark.sql.SQLContext.load(SQLContext.scala:679)
>>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
>>
>> Yet if I run a Class.forName() just from the shell:
>>
>> scala> Class.forName("net.sourceforge.jtds.jdbc.Driver")
>> res1: Class[_] = class net.sourceforge.jtds.jdbc.Driver
>>
>> No problem finding the JAR. I've tried in both the shell and with spark-submit (packaging the driver into my application as a fat JAR). Nothing seems to work.
>>
>> I can also get a connection in the driver/shell no problem:
>>
>> scala> import java.sql.DriverManager
>> import java.sql.DriverManager
>>
>> scala> DriverManager.getConnection("jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd")
>> res3: java.sql.Connection = net.sourceforge.jtds.jdbc.JtdsConnection@2a67ecd0
>>
>> I'm probably missing some class path setting here. In jdbc.DefaultSource.createRelation it looks like the call to Class.forName doesn't specify a class loader, so it falls back to the default Java behaviour of using the caller's class loader. It almost feels like it's using a different class loader.
>>
>> I also tried checking whether the driver was on the class path of all my executors by running:
>>
>> import scala.collection.JavaConverters._
>>
>> sc.parallelize(Seq(1, 2, 3, 4)).flatMap(_ =>
>>   java.sql.DriverManager.getDrivers().asScala.map(d =>
>>     s"$d | ${d.acceptsURL("jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd")}"
>>   )
>> ).collect().foreach(println)
>>
>> This successfully returns:
>>
>> 15/04/15 01:07:37 INFO scheduler.DAGScheduler: Job 0 finished: collect at Main.scala:46, took 1.495597 s
>> org.apache.derby.jdbc.AutoloadedDriver40 | false
>> com.mysql.jdbc.Driver | false
>> net.sourceforge.jtds.jdbc.Driver | true
>> org.apache.derby.jdbc.AutoloadedDriver40 | false
>> com.mysql.jdbc.Driver | false
>> net.sourceforge.jtds.jdbc.Driver | true
>> org.apache.derby.jdbc.AutoloadedDriver40 | false
>> com.mysql.jdbc.Driver | false
>> net.sourceforge.jtds.jdbc.Driver | true
>> org.apache.derby.jdbc.AutoloadedDriver40 | false
>> com.mysql.jdbc.Driver | false
>> net.sourceforge.jtds.jdbc.Driver | true
>>
>> As a final test we tried the Postgres driver and had the same problem. Any ideas?
>>
>> Cheers,
>> Nathan

> --
> Deepak
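The failure pattern above is consistent with java.sql.DriverManager's class-loader check: getConnection and getDriver only hand out drivers whose class is visible to the class loader of the calling code. Spark's jdbc.DefaultSource lives in the assembly, loaded by the system class loader, while jars passed via --jars land in a child class loader; that is why Class.forName succeeds in the shell but the lookup inside createRelation fails. One commonly suggested workaround is to register a thin delegating driver whose class DriverManager can see (a similar shim later appeared in Spark itself as a DriverWrapper). Below is a self-contained Java sketch; the DriverShim name and the stub driver are illustrative, not from the thread:

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical shim: a java.sql.Driver that delegates every call to a driver
// instance obtained reflectively (e.g. from the --jars class loader). Because
// the shim class itself is visible to DriverManager's caller, lookups succeed.
public class DriverShim implements Driver {
    private final Driver delegate;

    public DriverShim(Driver delegate) { this.delegate = delegate; }

    @Override public Connection connect(String url, Properties info) throws SQLException {
        return delegate.connect(url, info);
    }
    @Override public boolean acceptsURL(String url) throws SQLException {
        return delegate.acceptsURL(url);
    }
    @Override public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) throws SQLException {
        return delegate.getPropertyInfo(url, info);
    }
    @Override public int getMajorVersion() { return delegate.getMajorVersion(); }
    @Override public int getMinorVersion() { return delegate.getMinorVersion(); }
    @Override public boolean jdbcCompliant() { return delegate.jdbcCompliant(); }
    @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        return delegate.getParentLogger();
    }

    public static void main(String[] args) throws Exception {
        // In the spark-shell you would load the real driver instead, e.g.:
        //   Driver real = (Driver) Class.forName("net.sourceforge.jtds.jdbc.Driver").newInstance();
        // A stub stands in here so the sketch runs without jTDS on the class path.
        Driver stub = new Driver() {
            public Connection connect(String url, Properties info) { return null; }
            public boolean acceptsURL(String url) { return url.startsWith("jdbc:jtds:"); }
            public DriverPropertyInfo[] getPropertyInfo(String u, Properties p) {
                return new DriverPropertyInfo[0];
            }
            public int getMajorVersion() { return 1; }
            public int getMinorVersion() { return 3; }
            public boolean jdbcCompliant() { return false; }
            public Logger getParentLogger() throws SQLFeatureNotSupportedException {
                throw new SQLFeatureNotSupportedException();
            }
        };
        DriverManager.registerDriver(new DriverShim(stub));

        // The shim now answers for jTDS-style URLs even though the "real"
        // driver class is invisible to DriverManager's caller check.
        Driver found = DriverManager.getDriver("jdbc:jtds:sqlserver://blah:1433/MyDB");
        System.out.println(found.acceptsURL("jdbc:jtds:sqlserver://blah:1433/MyDB"));
    }
}
```

The other workaround reported for Spark of this vintage is to put the jar on the JVM class path directly, e.g. `--driver-class-path /tmp/jtds-1.3.1.jar --conf spark.executor.extraClassPath=/tmp/jtds-1.3.1.jar` (paths are assumptions), so the system class loader that loads jdbc.DefaultSource also sees the driver.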