[ https://issues.apache.org/jira/browse/HADOOP-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759460#action_12759460 ]

Giridharan Kesavan commented on HADOOP-5107:
--------------------------------------------

bq. The patch doesn't work when we go off-line for subsequent runs. The 
off-line feature is missing in all the projects. Without this feature, it tries 
to download maven-ant-tasks.jar itself again and gets stuck. 

Ivy doesn't work offline. Every time we do a build, it contacts the repository 
to verify the dependencies, whether or not they are present in the cache. If 
the dependencies are already present locally, it doesn't download them again. 
The same is true of maven-ant-tasks.jar: it isn't downloaded every time, 
because usetimestamp is set to true.
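For reference, a download guarded by usetimestamp looks roughly like this in an Ant build file (the target name, property names, and URL here are illustrative, not taken from the patch):

```xml
<!-- Illustrative sketch: fetch maven-ant-tasks.jar only when the remote
     copy is newer than the local one; usetimestamp="true" makes Ant's
     <get> task skip the download if the local file is up to date. -->
<target name="ivy-download" description="Fetch maven-ant-tasks.jar if needed">
  <get src="${mvn.ant.task.url}"
       dest="${lib.dir}/maven-ant-tasks.jar"
       usetimestamp="true"/>
</target>
```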

bq. In many files, in particular the ivy.xml files of contrib projects, most of 
the changes are not required and are redundant as the patch removes them and 
simply adds them again changing the format into a single line. Undoing these 
changes will greatly reduce the patch size 

When each dependency is put on a single line, the ivy.xml file looks cleaner, 
and the re-formatting greatly helps readability.
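To illustrate the single-line format, a dependency section in ivy.xml might look like this (the module names, version properties, and conf mappings are illustrative):

```xml
<!-- Illustrative: one dependency per line keeps ivy.xml compact and scannable -->
<dependencies>
  <dependency org="commons-logging" name="commons-logging" rev="${commons-logging.version}" conf="common->default"/>
  <dependency org="log4j" name="log4j" rev="${log4j.version}" conf="common->master"/>
</dependencies>
```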
 
bq. In mapreduce and hdfs ivy.xml files, some cleanup is done. The earlier 
client and server specific dependencies looked good and natural too. Did you 
remove that because the classification was premature or it didn't gel well with 
your changes? 

This patch uses Maven for publishing and Ivy for resolving. Ivy works on 
configurations while Maven works on scopes; I've tried my best to get the best 
of both worlds.
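As a sketch of how the two models bridge (the conf names and modules here are illustrative), an Ivy configuration can be mapped onto a Maven scope of a published pom via the conf mapping:

```xml
<!-- Illustrative: Ivy configurations mapped onto Maven scopes.
     "common->default" resolves the dependency's Maven compile scope;
     the "test" conf extends "common" with test-only dependencies. -->
<configurations>
  <conf name="common" description="compile-time dependencies"/>
  <conf name="test" extends="common" description="test dependencies"/>
</configurations>
<dependencies>
  <dependency org="org.apache.hadoop" name="hadoop-core" rev="${hadoop-core.version}" conf="common->default"/>
  <dependency org="junit" name="junit" rev="${junit.version}" conf="test->default"/>
</dependencies>
```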

bq. mapreduce build.xml: Do we need separate mvn-install and 
mvn-install-mapred? Even if it is needed, mvn-install should depend on 
mvn-install-mapred. A case of reuse.  
Until a couple of days ago, hdfs depended on both mapred and common, and 
mapred depended on hdfs and common. Hence we had a situation where we needed 
to publish only the mapred and hdfs jars and not the corresponding test jars. 
I didn't want to reuse the mvn-install-mapred target, as I expected to clean 
up that target once the circular dependency issue is resolved.
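A rough sketch of what the narrower target does (the target body below is illustrative; only the target name comes from the discussion). maven-ant-tasks provides the artifact:install task used here:

```xml
<!-- Illustrative: mvn-install-mapred publishes only the mapred jar,
     without the test jar, as a temporary workaround until the circular
     dependency between mapred and hdfs is resolved. -->
<target name="mvn-install-mapred" depends="jar">
  <artifact:install file="${build.dir}/${name}-${version}.jar">
    <artifact:pom refid="hadoop.mapred"/>
  </artifact:install>
</target>
```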

bq. common project: Should we take this as an opportunity and rename the core 
jar to common jar before publishing? It looks odd the project name is common 
while the jar's name refers to core. 
That would be quite a bit of work, and I would definitely want that to be in a 
different JIRA.

bq. I think that in both mapred and hdfs, clean-cache should not delete the 
whole ${user.home}/.ivy2/cache/org.apache.hadoop/hadoop-core directory for 
example. It works for now, but different projects may work with different 
versions of the jar, so mapred's clean-cache should only delete the 
corresponding version of the jar. Same with the other directories in the cache. 
Thoughts? 
It's not just the jar files that the cache stores; it also converts the poms 
and stores them as ivy.xml files for the different Ivy configurations. The 
best way to clean them up is to delete the corresponding artifact folder in 
the cache.
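A per-artifact cache cleanup along those lines can be sketched as (the target body is illustrative; the cache path is the one quoted in the review comment above):

```xml
<!-- Illustrative: delete the whole artifact folder so the cached jars
     and the converted ivy.xml/pom metadata are removed together -->
<target name="clean-cache" description="Purge cached hadoop-core artifacts">
  <delete dir="${user.home}/.ivy2/cache/org.apache.hadoop/hadoop-core"/>
</target>
```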

bq. Should `ant clean` delete maven-ant-tasks.jar every time? I guess not. 
When I call ant clean I would definitely expect a clean workspace. 
There is also another reason: I've seen people hit ctrl-c halfway through 
while the ivy/maven-ant-tasks jar is downloading, leaving a partially 
downloaded jar. The next time a user runs the build, it fails because the jar 
file is corrupt, and they have to go delete it manually.
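The corresponding cleanup step can be sketched as follows (the target body and property name are illustrative):

```xml
<!-- Illustrative: remove maven-ant-tasks.jar on "ant clean" so a
     partially downloaded (corrupt) jar is re-fetched on the next build -->
<target name="clean">
  <delete file="${lib.dir}/maven-ant-tasks.jar"/>
</target>
```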

Thanks for the comments.

> split the core, hdfs, and mapred jars from each other and publish them 
> independently to the Maven repository
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5107
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 0.20.0
>            Reporter: Owen O'Malley
>            Assignee: Giridharan Kesavan
>         Attachments: common-trunk-v1.patch, common-trunk-v4.patch, 
> common-trunk.patch, hadoop-hdfsd-v4.patch, hdfs-trunk-v1.patch, 
> hdfs-trunk-v2.patch, hdfs-trunk.patch, mapred-trunk-v1.patch, 
> mapred-trunk-v2.patch, mapred-trunk-v3.patch, mapred-trunk-v4.patch, 
> mapred-trunk-v5.patch, mapreduce-trunk.patch
>
>
> I think to support splitting the projects, we should publish the jars for 
> 0.20.0 as independent jars to the Maven repository 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.