Hi Nan, I think there is a typo here:
"file:///Users/nanzhu/.m2/repository”), It should be lowercase. Sent from my iPhone > On 2013年12月21日, at 6:17, Nan Zhu <zhunanmcg...@gmail.com> wrote: > > Hi, Gary, > > Thank you very much > > This afternoon, I tried to compile spark with my customized hadoop, it > finally works > > For those who shared the same problem with me: > > 1. add the following line to SparkBuild.scala > > resolvers ++= Seq("Local Hadoop Repo" at > "file:///Users/nanzhu/.m2/repository”), > > 2. install your customized jars > > mvn install:install-file > -Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-client-1.2.2-SNAPSHOT.jar > -DgroupId=org.apache.hadoop -DartifactId=hadoop-core -Dversion=1.2.2-SNAPSHOT > -Dpackaging=jar -DgeneratePom=true > > mvn install:install-file > -Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-core-1.2.2-SNAPSHOT.jar > -DgroupId=org.apache.hadoop -DartifactId=hadoop-core -Dversion=1.2.2-SNAPSHOT > -Dpackaging=jar -DgeneratePom=true > > 3. set SPARK_HADOOP_VERSION to 1.22-SNAPSHOT > > 4. add the dependency of hadoop-core > > search org.apache.hadoop" % "hadoop-client" % hadoopVersion > excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib) in > project/SparkBuild.scala > > add "org.apache.hadoop" % "hadoop-core" % hadoopVersion > excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib), below it > > 5. compile spark > > note: > > a. in 1, I don’t know why resolvers ++= Seq(Resolver.file("Local Maven > Repo", file(Path.userHome + "/.m2/repository"))), cannot resolve my > directory, so I have to manually add resolvers ++= Seq("Local Hadoop Repo" at > "file:///Users/nanzhu/.m2/repository”). It is still weird that Seq("Local > Hadoop Repo”, file("Users/nanzhu/.m2/repository”)) doesn’t work…. > > b. in 4, the cllient.jar dependency cannot download core.jar in automatic > (why?) 
> I had to add an explicit dependency on hadoop-core.
>
> Best,
>
> --
> Nan Zhu
>
>> On Monday, December 16, 2013 at 2:41 PM, Gary Malouf wrote:
>>
>> Check out the dependencies for the version of hadoop-client you are using - I think you will find that hadoop-core is present there.
>>
>>> On Mon, Dec 16, 2013 at 1:28 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>>
>>> Hi, Gary,
>>>
>>> The page says Spark uses hadoop-client.jar to interact with HDFS, but why does it also download hadoop-core?
>>>
>>> Do I just need to change the dependency on hadoop-client to point to my local repo?
>>>
>>> Best,
>>>
>>> --
>>> Nan Zhu
>>> School of Computer Science,
>>> McGill University
>>>
>>>> On Monday, December 16, 2013 at 9:05 AM, Gary Malouf wrote:
>>>>
>>>> Hi Nan, check out the 'Note about Hadoop Versions' on http://spark.incubator.apache.org/docs/latest/
>>>>
>>>> Let us know if this does not solve your problem.
>>>>
>>>> Gary
>>>>
>>>>> On Mon, Dec 16, 2013 at 8:19 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>>>>
>>>>> Hi, Azuryy,
>>>>>
>>>>> Thank you for the reply.
>>>>>
>>>>> So you compiled Spark with mvn?
>>>>>
>>>>> I'm looking at the pom.xml; I think it does the same work as SparkBuild.scala.
>>>>>
>>>>> I'm still confused by one thing: some Spark classes use classes like InputFormat, which I assume are included in hadoop-core.jar,
>>>>>
>>>>> but I didn't find any line specifying hadoop-core-1.0.4.jar in pom.xml or SparkBuild.scala.
>>>>>
>>>>> Can you explain a bit to me?
>>>>>
>>>>> Best,
>>>>>
>>>>> --
>>>>> Nan Zhu
>>>>> School of Computer Science,
>>>>> McGill University
>>>>>
>>>>>> On Monday, December 16, 2013 at 3:58 AM, Azuryy Yu wrote:
>>>>>>
>>>>>> Hi Nan,
>>>>>>
>>>>>> I am also using a customized Hadoop, so you need to modify the pom.xml; but before this change, you should install your customized hadoop-* jars in the local Maven repo.
>>>>>>
>>>>>>> On Sun, Dec 15, 2013 at 2:45 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi, all,
>>>>>>>
>>>>>>> I'm trying to compile Spark with a customized version of Hadoop in which I modified the implementation of DFSInputStream.
>>>>>>>
>>>>>>> I would like to change SparkBuild.scala so that Spark compiles against my hadoop-core-xxx.jar instead of downloading the original one.
>>>>>>>
>>>>>>> I only found hadoop-client-xxx.jar and some lines about the YARN jars in SparkBuild.scala.
>>>>>>>
>>>>>>> Can you tell me which line I should modify to achieve this goal?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> --
>>>>>>> Nan Zhu
>>>>>>> School of Computer Science,
>>>>>>> McGill University
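Putting steps 1 and 4 of the thread together, the SparkBuild.scala changes look roughly like the sketch below. This assumes the 2013-era sbt build definition, where hadoopVersion and the excludeJackson/excludeNetty/excludeAsm/excludeCglib vals are already defined; the local-repo path is Nan's example and would be your own.

```scala
// Sketch of the edits to project/SparkBuild.scala described in the thread.
// hadoopVersion, excludeJackson, excludeNetty, excludeAsm and excludeCglib
// are assumed to exist elsewhere in the build definition.

// Step 1: make sbt resolve artifacts from the local Maven repository,
// where the custom Hadoop jars were installed via mvn install:install-file.
resolvers ++= Seq(
  "Local Hadoop Repo" at "file:///Users/nanzhu/.m2/repository"
)

// Step 4: next to the existing hadoop-client dependency, add an explicit
// hadoop-core dependency, since the client jar did not pull it in
// transitively here.
libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-client" % hadoopVersion
    excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib),
  "org.apache.hadoop" % "hadoop-core" % hadoopVersion
    excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib)
)
```

With SPARK_HADOOP_VERSION set to 1.2.2-SNAPSHOT (step 3), hadoopVersion picks up the snapshot version, so both dependencies resolve from the local repository added above.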