Hi, Azuryy,

I'm working on a MacBook Pro,
so it is indeed "Users".

Best,

--
Nan Zhu

On Saturday, December 21, 2013 at 9:31 AM, Azuryy wrote:

> Hi Nan,
>
> I think there is a typo here:
>
> "file:///Users/nanzhu/.m2/repository"),
>
> It should be lowercase.
>
> Sent from my iPhone
>
> On December 21, 2013, at 6:17, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
> > Hi, Gary,
> >
> > Thank you very much.
> >
> > This afternoon I tried to compile Spark with my customized Hadoop, and it finally works.
> >
> > For those who share the same problem with me:
> >
> > 1. Add the following line to SparkBuild.scala:
> >
> >    resolvers ++= Seq("Local Hadoop Repo" at "file:///Users/nanzhu/.m2/repository"),
> >
> > 2. Install your customized jars:
> >
> >    mvn install:install-file -Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-client-1.2.2-SNAPSHOT.jar -DgroupId=org.apache.hadoop -DartifactId=hadoop-client -Dversion=1.2.2-SNAPSHOT -Dpackaging=jar -DgeneratePom=true
> >
> >    mvn install:install-file -Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-core-1.2.2-SNAPSHOT.jar -DgroupId=org.apache.hadoop -DartifactId=hadoop-core -Dversion=1.2.2-SNAPSHOT -Dpackaging=jar -DgeneratePom=true
> >
> > 3. Set SPARK_HADOOP_VERSION to 1.2.2-SNAPSHOT.
> >
> > 4. Add the dependency on hadoop-core: search for
> >
> >    "org.apache.hadoop" % "hadoop-client" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib)
> >
> >    in project/SparkBuild.scala and add
> >
> >    "org.apache.hadoop" % "hadoop-core" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib),
> >
> >    below it (both SparkBuild.scala changes are sketched together at the end of this thread).
> >
> > 5. Compile Spark.
> >
> > Notes:
> >
> > a. In step 1, I don't know why resolvers ++= Seq(Resolver.file("Local Maven Repo", file(Path.userHome + "/.m2/repository"))), cannot resolve my directory, so I had to manually add resolvers ++= Seq("Local Hadoop Repo" at "file:///Users/nanzhu/.m2/repository"). It is still weird that Seq("Local Hadoop Repo", file("Users/nanzhu/.m2/repository")) doesn't work....
> >
> > b. In step 4, the hadoop-client dependency does not pull in hadoop-core automatically (why?), so I had to add an explicit dependency on hadoop-core.
> >
> > Best,
> >
> > --
> > Nan Zhu
> >
> > On Monday, December 16, 2013 at 2:41 PM, Gary Malouf wrote:
> >
> > > Check out the dependencies for the version of hadoop-client you are using - I think you will find that hadoop-core is present there.
> > >
> > > On Mon, Dec 16, 2013 at 1:28 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> > >
> > > > Hi, Gary,
> > > >
> > > > The page says Spark uses hadoop-client.jar to interact with HDFS, but why does it also download hadoop-core?
> > > >
> > > > Do I just need to point the dependency on hadoop-client at my local repo?
> > > >
> > > > Best,
> > > >
> > > > --
> > > > Nan Zhu
> > > > School of Computer Science,
> > > > McGill University
> > > >
> > > > On Monday, December 16, 2013 at 9:05 AM, Gary Malouf wrote:
> > > >
> > > > > Hi Nan, check out the 'Note about Hadoop Versions' on http://spark.incubator.apache.org/docs/latest/
> > > > >
> > > > > Let us know if this does not solve your problem.
> > > > >
> > > > > Gary
> > > > >
> > > > > On Mon, Dec 16, 2013 at 8:19 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> > > > >
> > > > > > Hi, Azuryy,
> > > > > >
> > > > > > Thank you for the reply.
> > > > > >
> > > > > > So you compiled Spark with mvn?
> > > > > >
> > > > > > I'm looking at the pom.xml; I think it does the same work as SparkBuild.scala.
> > > > > >
> > > > > > I'm still confused, though: some Spark classes use Hadoop classes such as InputFormat, which I assume are shipped in hadoop-core.jar,
> > > > > > but I didn't find any line specifying hadoop-core-1.0.4.jar in either pom.xml or SparkBuild.scala.
> > > > > >
> > > > > > Can you explain a bit to me?
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > --
> > > > > > Nan Zhu
> > > > > > School of Computer Science,
> > > > > > McGill University
> > > > > >
> > > > > > On Monday, December 16, 2013 at 3:58 AM, Azuryy Yu wrote:
> > > > > >
> > > > > > > Hi Nan,
> > > > > > >
> > > > > > > I am also using a customized Hadoop, so you need to modify the pom.xml, but before this change you should install your customized hadoop-* jars into the local Maven repo.
> > > > > > >
> > > > > > > On Sun, Dec 15, 2013 at 2:45 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi, all,
> > > > > > > >
> > > > > > > > I'm trying to compile Spark with a customized version of Hadoop, in which I modified the implementation of DFSInputStream.
> > > > > > > >
> > > > > > > > How can I change SparkBuild.scala so that Spark compiles against my hadoop-core-xxx.jar instead of downloading the original one?
> > > > > > > >
> > > > > > > > I only found hadoop-client-xxx.jar and some lines about YARN jars in SparkBuild.scala.
> > > > > > > >
> > > > > > > > Can you tell me which line I should modify to achieve this?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > >
> > > > > > > > --
> > > > > > > > Nan Zhu
> > > > > > > > School of Computer Science,
> > > > > > > > McGill University
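
Putting steps 1 and 4 above together, the relevant fragment of project/SparkBuild.scala would look roughly like the sketch below. This is a sketch, not Spark's exact build file: the resolver path is Nan's local repository (substitute your own ~/.m2/repository), hadoopVersion is the value driven by SPARK_HADOOP_VERSION in step 3, and excludeJackson, excludeNetty, excludeAsm, and excludeCglib are assumed to be the exclusion vals Spark's build already defines.

    // Step 1: let sbt resolve jars from the local Maven repository
    // (the path is Nan's; adjust it to your own home directory).
    resolvers ++= Seq("Local Hadoop Repo" at "file:///Users/nanzhu/.m2/repository")

    libraryDependencies ++= Seq(
      // The hadoop-client dependency Spark already declares.
      "org.apache.hadoop" % "hadoop-client" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib),
      // Step 4: explicit hadoop-core dependency, added because the locally
      // installed client jar did not bring core in transitively.
      "org.apache.hadoop" % "hadoop-core" % hadoopVersion excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib)
    )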
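As for the "(why?)" in note b, a plausible explanation: mvn install:install-file with -DgeneratePom=true writes only a minimal POM that declares no dependencies, so sbt sees nothing transitive behind the locally installed hadoop-client and never fetches hadoop-core. Declaring hadoop-core explicitly, as step 4 does, is the straightforward workaround.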