Or you can call sc.addJar("/path/to/the/jar"). I haven't tested it with an HDFS path, though it works fine with a local path.
Thanks
Best Regards

On Wed, Jun 10, 2015 at 10:17 AM, Jörn Franke <jornfra...@gmail.com> wrote:

> I am not sure they work with HDFS paths. You may want to look at the
> source code. Alternatively, you can create a "fat" jar containing all jars
> (let your build tool set up META-INF correctly). This always works.
>
> On Wed, Jun 10, 2015 at 6:22, Dong Lei <dong...@microsoft.com> wrote:
>
>> Thanks so much!
>>
>> I did put a sleep in my code to keep the UI available.
>>
>> Now from the UI, I can see:
>>
>> · In the “SparkProperty” section, spark.jars and spark.files are
>> set to what I want.
>>
>> · In the “Classpath Entries” section, my jar and file paths are
>> there (with an HDFS path).
>>
>> And I checked the HTTP file server directory; the structure is like:
>>
>> D:\data\temp
>> \--spark-UUID
>> \--httpd-UUID
>> \jars [*empty*]
>> \files [*empty*]
>>
>> So I guess the files and jars are not properly downloaded from HDFS to
>> these folders?
>>
>> I’m using standalone mode.
>>
>> Any ideas?
>>
>> Thanks
>> Dong Lei
>>
>> *From:* Akhil Das [mailto:ak...@sigmoidanalytics.com]
>> *Sent:* Tuesday, June 9, 2015 4:46 PM
>> *To:* Dong Lei
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: ClassNotDefException when using spark-submit with
>> multiple jars and files located on HDFS
>>
>> You can put a Thread.sleep(100000) in the code to keep the UI available
>> for quite some time. (Put it just before starting any of your
>> transformations.) Or you can enable the Spark history server
>> <https://spark.apache.org/docs/latest/monitoring.html> too. I believe
>> --jars
>> <https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management>
>> would download the dependency jars on all your worker machines (they can
>> be found in the Spark work dir of your application, along with the stderr
>> and stdout files).
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Jun 9, 2015 at 1:29 PM, Dong Lei <dong...@microsoft.com> wrote:
>>
>> Thanks Akhil:
>>
>> The driver fails too fast for me to get a look at port 4040. Is there any
>> other way to see the download and ship process of the files?
>>
>> Is the driver supposed to download these jars from HDFS to some location,
>> then ship them to the executors?
>>
>> I can see from the log that the driver downloaded the application jar but
>> not the other jars specified by “--jars”.
>>
>> Or do I misunderstand the usage of “--jars”: should the jars already be
>> on every worker, so that the driver will not download them?
>>
>> Are there any useful docs?
>>
>> Thanks
>> Dong Lei
>>
>> *From:* Akhil Das [mailto:ak...@sigmoidanalytics.com]
>> *Sent:* Tuesday, June 9, 2015 3:24 PM
>> *To:* Dong Lei
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: ClassNotDefException when using spark-submit with
>> multiple jars and files located on HDFS
>>
>> Once you submit the application, you can check the Environment tab of the
>> driver UI (running on port 4040) to see whether the jars you added got
>> shipped or not. If they are shipped and you are still getting NoClassDef
>> exceptions, then you have a jar conflict, which you can resolve by
>> putting the jar with the class in it at the top of your classpath.
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Jun 9, 2015 at 9:05 AM, Dong Lei <dong...@microsoft.com> wrote:
>>
>> Hi, spark-users:
>>
>> I’m using spark-submit to submit multiple jars and files (all in HDFS) to
>> run a job, with the following command:
>>
>> Spark-submit
>> --class myClass
>> --master spark://localhost:7077/
>> --deploy-mode cluster
>> --jars hdfs://localhost/1.jar, hdfs://localhost/2.jar
>> --files hdfs://localhost/1.txt, hdfs://localhost/2.txt
>> hdfs://localhost/main.jar
>>
>> The stderr in the driver showed java.lang.ClassNotDefException for a
>> class in 1.jar.
>>
>> I checked the log and saw that Spark has added these jars:
>>
>> INFO SparkContext: Added JAR hdfs:// …1.jar
>> INFO SparkContext: Added JAR hdfs:// …2.jar
>>
>> In the folder of the driver, I only saw that main.jar was copied to that
>> place, *but the other jars and files were not there*.
>>
>> Could someone explain *how I should pass the jars and files* needed by
>> the main jar to Spark?
>>
>> If my class in main.jar refers to these files with a relative path, *will
>> Spark copy these files into one folder*?
>>
>> BTW, my class works in client mode with all jars and files local.
>>
>> Thanks
>> Dong Lei
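
One more thing that may be worth checking in the command quoted above: --jars and --files each take a single comma-separated list, so if the spaces after the commas were typed literally, the shell would split each list into separate arguments. Below is a rough sketch of the same command with each list kept as one word. The paths, class name, and master URL are just the placeholders from the original mail, and the JARS, FILES, and CMD variable names are only for illustration. I am not claiming this fixes the HDFS download problem; it only rules out an argument-parsing issue.

```shell
#!/bin/sh
# Placeholder paths copied from the original question (assumptions, not real files).
JARS="hdfs://localhost/1.jar,hdfs://localhost/2.jar"     # comma-separated, no spaces
FILES="hdfs://localhost/1.txt,hdfs://localhost/2.txt"   # comma-separated, no spaces

# Build the argument list; quoting keeps each list as exactly one shell word.
set -- spark-submit \
  --class myClass \
  --master spark://localhost:7077/ \
  --deploy-mode cluster \
  --jars "$JARS" \
  --files "$FILES" \
  hdfs://localhost/main.jar

# Print the command for inspection only; replace the echo with "$@" to actually run it.
CMD="$*"
echo "$CMD"
```

Printing the command first makes it easy to confirm that --jars and --files are each followed by exactly one argument before submitting anything.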