Hi Joep,

Regarding the question in your first paragraph, this is a byproduct of using
Maven.

Regarding using a particular build: if you are using SNAPSHOT versions it
means you want to be on the 'head'; you are in development mode. Of course
you can, as you indicate, pin to a particular SNAPSHOT artifact by using
its exact timestamp. But the latter is something you normally would not
do. If I had to do something like that I would use some kind of nano
version (the problem with this is the ordering) or some other version
qualifier.
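For example, pinning to an exact snapshot build would look like this in a
pom.xml dependency (just a sketch; the timestamp/build number shown is made
up, you would copy the exact one published in the snapshot repo):

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <!-- made-up timestamped SNAPSHOT version; use the exact
         timestamp/build number from the snapshot repo -->
    <version>0.23.0-20110728.231858-1</version>
  </dependency>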

Both Ant/Ivy & Maven download and cache JARs of dependencies (and in the
case of Maven, the plugin JARs used for the build). After you do a full
build with all profiles active (to make sure all plugin JARs have been
used), you can tar the ~/.m2 and use when in disconnected mode. Another
thing you can do (recommended) is o setup a Maven repo proxy in your
disconnected network and configure all your devel boxes to use the proxy.
You could seed the proxy with a full .m2 or by doing a clean (no ~/.m2 in
your devel box) build trough the proxy.

Finally, if you are building from trunk/ you are using ALL artifacts produced
by the whole project; if you are building from trunk/hdfs/ or trunk/mapred/
you are using previously published artifacts (installed in ~/.m2 or deployed
to a Maven repo) for all the modules that are not part of the current build.
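Concretely, and just as a sketch of how this plays out in the Mavenized
layout:

  # full-tree build: every module is built and used in-place
  cd trunk && mvn install
  # partial build: hadoop-common is resolved from ~/.m2 (or a remote
  # repo) instead of being rebuilt
  cd trunk/hdfs && mvn install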

Hope this addresses your concerns.

Thanks.

Alejandro

On Thu, Jul 28, 2011 at 7:10 PM, Rottinghuis, Joep <jrottingh...@ebay.com> wrote:

> Alejandro,
>
> Are you trying the use case where people want to locally build a
> consistent set of common, hdfs, and mapreduce without the downstream
> projects depending on published Maven SNAPSHOTS?
> I'm working to get this going on 0.22 right now (see HDFS-843, HDFS-2214,
> and I'll have to file two equivalent bugs on mapreduce).
>
> Part of the problem is that the assumption was that people always compile
> hdfs against hadoop-common-0.xyz-SNAPSHOT.
> When applying one patch at a time from Jira attachments that may be fine.
>
> If I set up a Jenkins build I will want to make sure that first
> hadoop-common builds with a new build number (not snapshot), then hdfs
> against that same build number, then mapreduce against hadoop-common and
> hdfs.
> Otherwise you can get a situation where the mapreduce build is still
> running and the hadoop-common build has already produced a new snapshot
> build.
>
> Local caching in ~/.m2 and ~/.ivy2 repos makes this situation even more
> complex.
>
> Having the ability to build without Internet connectivity is not just for
> laptops on the go. In corporate environments one may not want a build
> server to have Internet connectivity.
> In that case should one do a build on a machine with connectivity first and
> then fork-lift the ~/.m2/repository directory over?
> Should any hadoop-common, hadoop-hdfs and hadoop-mapreduce artifacts be
> purged in that case (since they should be rebuilt locally)?
>
> Thanks,
>
> Joep
>
> -----Original Message-----
> From: Alejandro Abdelnur [mailto:t...@cloudera.com]
> Sent: Thursday, July 28, 2011 4:41 PM
> To: general@hadoop.apache.org
> Subject: follow up Hadoop mavenization work
>
> Following up on the Hadoop Common mavenization (HADOOP-6671), I've just
> posted a patch for HDFS mavenization (HDFS-2096).
>
> The HADOOP-6671 patch integrates all feedback received in the JIRA and,
> IMO, it is ready for prime time.
>
> In order not to break HDFS and MAPRED, which are still Ant based, there are
> 2 patches, HDFS-2196 & MAPREDUCE-2741, that make some corrections in the
> Ivy configuration so they work correctly with the Hadoop common JAR
> (built/published by the Mavenized build).
>
> HDFS-2096 is not 100% ready: some testcases are failing and native code
> testing is not wired, but everything else (compile, test, package, tar,
> binary, jdiff, etc.) is wired.
>
> * https://issues.apache.org/jira/browse/HADOOP-6671
> * https://issues.apache.org/jira/browse/HDFS-2196
> * https://issues.apache.org/jira/browse/MAPREDUCE-2741
> * https://issues.apache.org/jira/browse/HDFS-2096
>
> I know these are big changes and we'll have some hiccups, but the benefits
> are big (running testcases is faster, it works easily from IDEs, and the
> Maven build system can be understood by anybody who knows Maven).
>
> Keeping the patches current is time-consuming; because of this, it would be
> great if we could get in the ones that are ready (HADOOP-6671, HDFS-2196,
> MAPREDUCE-2741) so we can focus on the rest of the Mavenization work.
>
> Thanks.
>
> Alejandro
>
