Thanks for the help. I am making progress, but I found I need to do a bit of fiddling with excluding dependencies from Spark in order to have mine take effect. As soon as I have a working pom I will post it here as an example.
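Roughly, the exclusion I mean looks like this (just a sketch of what I am trying; the Scala version suffix and the exact Hadoop artifact that spark-core pulls in may differ in your setup):

```xml
<!-- Sketch: exclude the Hadoop 1.x artifact that spark-core brings in
     transitively, so that my own hadoop-client 2.2 dependency wins. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>0.9.0-incubating</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```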
Alex Cozzi
[email protected]

On Jan 16, 2014, at 1:54 PM, Patrick Wendell <[email protected]> wrote:

> Hey Alex,
>
> Maven profiles only affect the Spark build itself. They do not
> transitively affect your own build.
>
> Check out the docs for how to deploy applications on YARN:
> http://spark.incubator.apache.org/docs/latest/running-on-yarn.html
>
> When compiling your application, you should explicitly add the Hadoop
> version you depend on to your own build (e.g. a hadoop-client
> dependency). Take a look at the example here where we show adding
> hadoop-client:
>
> http://spark.incubator.apache.org/docs/latest/quick-start.html
>
> When deploying Spark applications on YARN, you actually want to mark
> Spark as a provided dependency in your application's Maven build, bundle
> your application as an assembly jar, then submit it with a Spark YARN
> bundle to a YARN cluster. The instructions are the same as they were
> in 0.8.1.
>
> For the Spark jar you want to submit to YARN, you can download the
> precompiled Spark one.
>
> It might make sense to try this pipeline with 0.8.1 and get it working
> there. It sounds more like you are dealing with getting the build
> set up rather than a particular issue with the 0.9.0 RC.
>
> - Patrick
>
> On Thu, Jan 16, 2014 at 1:13 PM, Alex Cozzi <[email protected]> wrote:
>> Hi Patrick,
>> thank you for testing. I think I found out what is wrong: I am trying to
>> build my own examples that also depend on another library which in turn
>> depends on Hadoop 2.2.
>> What was happening is that my library brings in Hadoop 2.2, while Spark
>> depends on Hadoop 1.0.4, and then I think I get conflicting versions of
>> the classes.
>>
>> A couple of things are not clear to me:
>>
>> 1: do the published artifacts support YARN and Hadoop 2.2, or will I need
>> to make my own build?
>> 2: if they do, how do I activate the profiles in my Maven config? I tried
>> mvn -Pyarn compile, but it does not work (Maven says “[WARNING] The
>> requested profile "yarn" could not be activated because it does not
>> exist.”)
>>
>> Essentially I would like to specify the Spark dependencies as:
>>
>> <dependencies>
>>   <dependency>
>>     <groupId>org.scala-lang</groupId>
>>     <artifactId>scala-library</artifactId>
>>     <version>${scala.version}</version>
>>   </dependency>
>>
>>   <dependency>
>>     <groupId>org.apache.spark</groupId>
>>     <artifactId>spark-core_${scala.tools.version}</artifactId>
>>     <version>0.9.0-incubating</version>
>>   </dependency>
>> </dependencies>
>>
>> and tell Maven to use the “yarn” profile for this dependency, but I do not
>> seem to be able to make it work.
>> Does anybody have any suggestions?
>>
>> Alex
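PS: putting Patrick's advice together, the shape of the dependency section I am now trying is roughly this (a sketch only; the Scala version suffix and the Hadoop version are my guesses for my own cluster):

```xml
<!-- Sketch: Spark marked as provided (the Spark YARN assembly jar supplies
     it at runtime), plus an explicit hadoop-client matching the cluster. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>0.9.0-incubating</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.2.0</version>
</dependency>
```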
