Re: Compiling Spark with a local hadoop profile
In the root pom.xml the default is <hadoop.version>2.2.0</hadoop.version>. You can override the version of hadoop with a command similar to:

  -Phadoop-2.4 -Dhadoop.version=2.7.0

Cheers

On Thu, Oct 8, 2015 at 11:22 AM, sbiookag wrote:
> I'm modifying the hdfs module inside hadoop, and would like to see the
> changes reflected while I'm running Spark on top of it, but I still see
> the native hadoop behaviour. I've checked and saw that Spark builds a
> really fat jar file, which contains all the hadoop classes (using the
> hadoop profile defined in maven), and deploys it over all workers. I also
> tried bigtop-dist to exclude the hadoop classes, but saw no effect.
>
> Is it possible to do such a thing easily, for example by small
> modifications inside the maven file?
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Compiling-Spark-with-a-local-hadoop-profile-tp14517.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
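[For reference, a full build invocation along these lines might look as follows, run from the Spark source root. The profile name and version are the ones suggested above; adjust -Phadoop-2.x to whatever profiles your Spark branch actually defines.]

```shell
# Sketch: build Spark against Hadoop 2.7.0 via the hadoop-2.4 profile,
# overriding the 2.2.0 default from the root pom.xml.
# ./build/mvn is the wrapper shipped in the Spark source tree.
./build/mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -DskipTests clean package
```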
Re: Compiling Spark with a local hadoop profile
Thanks Ted for the reply.

But this is not what I want. That would tell Spark to read the hadoop dependency from the maven repository, which is the original version of hadoop. I myself am modifying the hadoop code, and want to include my classes inside the Spark fat jar. "spark-class" runs the slaves with the fat jar created in the assembly folder, and that jar does not contain my modified classes.

Something that confuses me is: why does Spark include the hadoop classes in its built jar output? Isn't it supposed to read from the hadoop folder on each worker node?

--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Compiling-Spark-with-a-local-hadoop-profile-tp14517p14519.html
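[One way to check what actually ended up in the assembly is to list the jar's contents. The jar path below is only an example; substitute the assembly jar your own build produced under assembly/target.]

```shell
# Sketch: list the HDFS classes bundled into the Spark assembly jar,
# to confirm whether the fat jar carries its own copy of hadoop.
# The jar name/version here is hypothetical.
jar tf assembly/target/scala-2.10/spark-assembly-1.5.1-hadoop2.7.0.jar \
  | grep '^org/apache/hadoop/hdfs/' | head
```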
Re: Compiling Spark with a local hadoop profile
> On 8 Oct 2015, at 19:31, sbiookag wrote:
>
> Thanks Ted for the reply.
>
> But this is not what I want. That would tell Spark to read the hadoop
> dependency from the maven repository, which is the original version of
> hadoop. I myself am modifying the hadoop code, and want to include my
> classes inside the Spark fat jar. "spark-class" runs the slaves with the
> fat jar created in the assembly folder, and that jar does not contain my
> modified classes.

It should, if you have built a local hadoop version and done the

  -Phadoop-2.6 -Dhadoop.version=2.8.0-SNAPSHOT

If you are rebuilding hadoop with an existing version number (e.g. 2.6.0, 2.7.1), then maven may not actually be picking up your new code.

> Something that confuses me is: why does Spark include the hadoop classes
> in its built jar output? Isn't it supposed to read from the hadoop folder
> on each worker node?

There's a hadoop-provided profile which you can build with; this should leave the hadoop artifacts (and other stuff expected to be on the far end's classpath) out of the assembly.
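[Putting the two suggestions together, the workflow might look like the sketch below: publish the modified hadoop to the local ~/.m2 repository under a version nobody else ships, then build Spark against it. This assumes the hadoop pom's version has already been set to 2.8.0-SNAPSHOT; the profile names follow the message above.]

```shell
# 1) In the modified hadoop source tree: install the build into the
#    local maven repository so Spark's build can resolve it.
mvn install -DskipTests -Dmaven.javadoc.skip=true

# 2) In the Spark source tree: build against the local snapshot, so the
#    assembly fat jar contains the modified classes.
./build/mvn -Phadoop-2.6 -Dhadoop.version=2.8.0-SNAPSHOT \
  -DskipTests clean package

# Alternatively, keep hadoop out of the assembly entirely and put the
# modified hadoop build on each worker's classpath instead.
./build/mvn -Phadoop-provided -Phadoop-2.6 -Dhadoop.version=2.8.0-SNAPSHOT \
  -DskipTests clean package
```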