If you are using sbt: I personally use sbt-pack to collect all dependencies under a single folder, and then I set those jars in the Spark config.
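If it helps, enabling sbt-pack is roughly this one line in project/plugins.sbt
(the version number is only an example; check the sbt-pack README for the right
one for your sbt version):

// project/plugins.sbt -- version number is illustrative only
addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.7.9")

After sbt pack, the dependency jars end up under target/pack/lib/, which is
what the paths in the snippet below point at.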
// just for demo: I load these through a config file, overridden by environment variables
val sparkJars = Seq(
  "/ROOT_OF_YOUR_PROJECT/target/pack/lib/YOUR_JAR_DEPENDENCY.jar",
  "/ROOT_OF_YOUR_PROJECT/target/pack/lib/YOUR_JAR_DEPENDENCY.jar")

val conf = new SparkConf()
  .setMaster(sparkMaster)
  .setAppName(sparkApp)
  // .....
  .setJars(sparkJars)

Then run it with: sbt pack run

On Mon, Mar 14, 2016 at 11:58 PM, Tristan Nixon <st...@memeticlabs.org> wrote:

> I see - so you want the dependencies pre-installed on the cluster nodes so
> they do not need to be submitted along with the job jar?
>
> Where are you planning on deploying/running Spark? Do you have your own
> cluster, or are you using AWS or another IaaS/PaaS provider?
>
> Somehow you'll need to get the dependencies onto each node and add them to
> Spark's classpaths. You could modify an existing VM image or use Chef to
> distribute the jars and update the classpaths.
>
> On Mar 14, 2016, at 5:26 PM, prateek arora <prateek.arora...@gmail.com>
> wrote:
>
> Hi
>
> I do not want to create a single jar that contains all the other
> dependencies, because it will increase the size of my Spark job jar.
> So I want to copy all the libraries onto the cluster using some automation
> process, just as I am currently doing with Chef.
> But I am not sure whether that is the right method or not.
>
> Regards
> Prateek
>
> On Mon, Mar 14, 2016 at 2:31 PM, Jakob Odersky <ja...@odersky.com> wrote:
>
>> Have you tried setting the configuration
>> `spark.executor.extraLibraryPath` to point to a location where your
>> .so's are available? (Not sure if non-local files, such as HDFS, are
>> supported.)
>>
>> On Mon, Mar 14, 2016 at 2:12 PM, Tristan Nixon <st...@memeticlabs.org>
>> wrote:
>> > What build system are you using to compile your code?
>> > If you use a dependency-management system like Maven or sbt, then you
>> > should be able to instruct it to build a single jar that contains all
>> > the other dependencies, including third-party jars and .so's. I am a
>> > Maven user myself, and I use the shade plugin for this:
>> > https://maven.apache.org/plugins/maven-shade-plugin/
>> >
>> > However, if you are using sbt or another dependency manager, someone
>> > else on this list may be able to give you help on that.
>> >
>> > If you're not using a dependency manager - well, you should be. Trying
>> > to manage this manually is a pain that you do not want to get in the
>> > way of your project. There are perfectly good tools to do this for
>> > you; use them.
>> >
>> >> On Mar 14, 2016, at 3:56 PM, prateek arora
>> >> <prateek.arora...@gmail.com> wrote:
>> >>
>> >> Hi
>> >>
>> >> Thanks for the information.
>> >>
>> >> My problem is that if I want to write a Spark application that depends
>> >> on a third-party library like OpenCV, what is the best approach to
>> >> distribute all of OpenCV's .so and jar files across the cluster?
>> >>
>> >> Regards
>> >> Prateek
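A footnote on Jakob's spark.executor.extraLibraryPath suggestion quoted above:
if the native .so's have already been pushed to every node (e.g. by Chef), you
can point Spark at them from the same SparkConf. A rough sketch, with a
placeholder path:

// placeholder: wherever your automation drops the OpenCV .so files on each node
val nativeLibDir = "/opt/opencv/lib"

val conf = new SparkConf()
  .setMaster(sparkMaster)
  .setAppName(sparkApp)
  .set("spark.executor.extraLibraryPath", nativeLibDir)
  .set("spark.driver.extraLibraryPath", nativeLibDir)
  .setJars(sparkJars)

Note that setJars (or spark-submit --jars) still handles the jar side; the
extraLibraryPath settings only extend the native-library search path, so the
.so files must already exist at that path on every node.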