Re: Compiling Spark with a local hadoop profile

2015-10-08 Thread Ted Yu
In the root pom.xml, the hadoop.version property defaults to:
2.2.0

You can override the version of Hadoop with a command similar to:
-Phadoop-2.4 -Dhadoop.version=2.7.0
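Putting that together, a full build invocation might look like the following sketch (the profile name and version number are illustrative; adjust them to the Hadoop release you are targeting):

```shell
# Build Spark against a specific Hadoop version (values are examples).
# The hadoop-2.4 profile selects the dependency set for the 2.4+ line;
# hadoop.version picks the exact artifact version Maven resolves.
./build/mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -DskipTests clean package
```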

Cheers

On Thu, Oct 8, 2015 at 11:22 AM, sbiookag  wrote:

> I'm modifying the hdfs module inside Hadoop, and would like to see the
> changes reflected while I'm running Spark on top of it, but I still see the
> stock Hadoop behaviour. I've checked and saw that Spark builds a really fat
> jar file, which contains all the Hadoop classes (using the hadoop profile
> defined in Maven), and deploys it over all workers. I also tried
> bigtop-dist, to exclude the Hadoop classes, but saw no effect.
>
> Is it possible to do such a thing easily, for example with small
> modifications to the Maven build files?
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Compiling-Spark-with-a-local-hadoop-profile-tp14517.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: Compiling Spark with a local hadoop profile

2015-10-08 Thread sbiookag
Thanks Ted for the reply.

But this is not what I want. That would tell Spark to read the Hadoop
dependency from the Maven repository, which is the original version of
Hadoop. I am myself modifying the Hadoop code, and want to include my
classes inside the Spark fat jar. "Spark-Class" runs the slaves with the
fat jar created in the assembly folder, and that jar does not contain my
modified classes.

Something that confuses me is: why does Spark include the Hadoop classes
in its built jar output? Isn't it supposed to read them from the Hadoop
folder on each worker node?
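One quick way to confirm what is bundled in the assembly is to list the jar's contents; this is a hypothetical check, and the assembly path varies by Spark and Scala version:

```shell
# List the Hadoop classes bundled into the Spark assembly jar
# (path is illustrative; adjust for your Spark/Scala/Hadoop versions).
jar tf assembly/target/scala-2.10/spark-assembly-1.5.1-hadoop2.7.0.jar \
  | grep '^org/apache/hadoop/' | head
```

If this prints class entries, the jar on the workers is shadowing whatever Hadoop installation the nodes have locally.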



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Compiling-Spark-with-a-local-hadoop-profile-tp14517p14519.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Compiling Spark with a local hadoop profile

2015-10-09 Thread Steve Loughran

> On 8 Oct 2015, at 19:31, sbiookag  wrote:
> 
> Thanks Ted for the reply.
> 
> But this is not what I want. That would tell Spark to read the Hadoop
> dependency from the Maven repository, which is the original version of
> Hadoop. I am myself modifying the Hadoop code, and want to include my
> classes inside the Spark fat jar. "Spark-Class" runs the slaves with the
> fat jar created in the assembly folder, and that jar does not contain my
> modified classes.

It should, if you have built a local Hadoop version and built Spark with
-Phadoop-2.6 -Dhadoop.version=2.8.0-SNAPSHOT.

If you are rebuilding Hadoop with an existing release version number (e.g.
2.6.0, 2.7.1), then Maven may not actually be picking up your new code: it
will resolve that version from its local cache or a remote repository
rather than from your rebuilt artifacts.
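A sketch of the workflow Steve describes, assuming your local Hadoop checkout is set to a SNAPSHOT version so Maven cannot satisfy the dependency from a remote repository (paths are illustrative):

```shell
# 1. In the modified Hadoop checkout: install the artifacts into the
#    local Maven repository under the SNAPSHOT version.
cd /path/to/hadoop
mvn install -DskipTests -Dmaven.javadoc.skip=true

# 2. In the Spark checkout: build against that SNAPSHOT so the locally
#    installed artifacts (with your changes) go into the assembly jar.
cd /path/to/spark
./build/mvn -Phadoop-2.6 -Dhadoop.version=2.8.0-SNAPSHOT \
  -DskipTests clean package
```

Because SNAPSHOT versions are treated as mutable, Maven will prefer the freshly installed local artifacts instead of a cached release.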


> 
> Something that confuses me is: why does Spark include the Hadoop classes
> in its built jar output? Isn't it supposed to read them from the Hadoop
> folder on each worker node?


There's a hadoop-provided profile which you can build with; this should
leave the Hadoop artifacts (and other stuff expected to be on the far
end's classpath) out of the assembly.
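A sketch of that approach, assuming a Spark release that supports the hadoop-provided profile and the SPARK_DIST_CLASSPATH mechanism used by the "Hadoop free" builds (exact flags may differ by release):

```shell
# Build an assembly that leaves the Hadoop classes out.
./build/mvn -Phadoop-provided -DskipTests clean package

# At runtime, point Spark at each node's own Hadoop jars, so the
# modified classes installed on the workers are the ones loaded
# (typically set in conf/spark-env.sh on every node).
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
```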

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org