Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Tommaso Teofili
Hi Marshall,
I mostly agree with putting the binaries into the lib dir in SVN for the
release, because we would like the checkout/compile phase to be as easy and
quick as possible.
Though it could be somewhat annoying, I think it would be nice in the near
future (after the release) to leave those binaries out and use long-term
stable artifact versions from the central Maven repository where possible.
Tommaso

2009/9/20 Marshall Schor m...@schor.com

 I'm going through our build process for the Sandbox Distribution, and
 consolidating/unifying many aspects of it (not yet checked in - still a
 work in progress).

 One area I've examined:  Many sandbox projects rely on and redistribute
 3rd party Jars (that are Apache licensed, or are otherwise OK to
 distribute).

 All (I think) of these Jars are dependencies (in the Maven system) that
 maven can automatically download from its repositories, to the .m2 local
 repo.

 Many sandbox projects put these Jars into the project's lib/ directory.

 Some sandbox projects check in these Jars into SVN, and build their
 distributions by copying their lib/ dir to the distribution; others use
 the maven dependency plugin to get these jars into the local repository
 (if not already there) and then copy that into the distribution.

 After thinking about this for a while, and considering both methods, I
 think the most reliable way to handle 3rd party Jars is to manually put
 them into the lib/ directory, once, and then check the lib/ directory
 into SVN.  This avoids build issues in the future which could occur if
 the Jar obtained from the maven dependency plugin is somehow corrupted,
 or changes level, etc.  Also, having the Jars in SVN insures that
 whatever work we do to update the LICENSE/NOTICE files for those Jars
 remains valid (because the Jar doesn't (potentially) change).

 Examples of projects copying Jars into their lib/ dir manually:
  BSFAnnotator
  DictionaryAnnotator
  RegularExpressionAnnotator
  SimpleServer
 Examples of projects using the maven dependency plugin:
  ConfigurableFeatureExtractor (work in progress, pom doing this not yet
 checked in)
  Lucas

 So, unless there are strong objections, I'm going to be changing the
 sandbox build to consistently do the following:
  - for 3rd party dependencies - expect the Jars to be manually put in
 the lib/ dir and checked into SVN
  - put the lib/ into the bin distribution
  - if a pear is being built, put the lib/ into the pear
  - not automatically populate the lib/ using the maven dependency
 plugin mechanism
  - change the maven descriptors for these dependencies, where needed,
 to <scope>system</scope> indicating these jars are in the lib/, and add
 the systemPath element.

 This places a small burden on the developers when creating a new project
 to obtain the needed 3rd party Jars once and put them in the lib/ dir.
 One way to do this is to initially code the maven 3rd party Jar
 dependencies with no scope (defaulting thereby to compile) and let
 maven get these Jars from searching its repositories.  Then, copy them
 from the .m2 local repository to the lib/ dir, and change the scope to
 system, and set <systemPath>${basedir}/lib/.jar</systemPath>.

 -Marshall



Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Jukka Zitting
Hi,

On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
 After thinking about this for a while, and considering both methods, I
 think the most reliable way to handle 3rd party Jars is to manually put
 them into the lib/ directory, once, and then check the lib/ directory
 into SVN.  This avoids build issues in the future which could occur if
 the Jar obtained from the maven dependency plugin is somehow corrupted,
 or changes level, etc.  Also, having the Jars in SVN insures that
 whatever work we do to update the LICENSE/NOTICE files for those Jars
 remains valid (because the Jar doesn't (potentially) change).

By policy, non-SNAPSHOT artifacts in the Maven repository never change,
and each artifact is accompanied by checksums that guard against
corruption. It's possible for a user to mess up the files in their
local Maven repository, but it's probably just as likely that they'd
mess up any files in ./lib.
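
To make the checksum point concrete: a Maven repository declaration can state what should happen when a downloaded artifact does not match its published checksum. The fragment below is only a sketch, assuming a project chooses to redeclare the `central` repository in its own pom; that override and the URL shown are assumptions for illustration, not something the sandbox build is known to do.

```xml
<!-- Sketch only: redeclaring "central" here is an assumption for illustration. -->
<repositories>
  <repository>
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2</url>
    <releases>
      <enabled>true</enabled>
      <!-- Fail the build, rather than only warn, when a checksum does not match. -->
      <checksumPolicy>fail</checksumPolicy>
    </releases>
  </repository>
</repositories>
```

The other values accepted for `checksumPolicy` are `warn` and `ignore`.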

To me the proposed solution sounds like extra effort with little or no benefit.

BR,

Jukka Zitting


Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Thilo Goetz
Jukka Zitting wrote:
 Hi,
 
 On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
 After thinking about this for a while, and considering both methods, I
 think the most reliable way to handle 3rd party Jars is to manually put
 them into the lib/ directory, once, and then check the lib/ directory
 into SVN.  This avoids build issues in the future which could occur if
 the Jar obtained from the maven dependency plugin is somehow corrupted,
 or changes level, etc.  Also, having the Jars in SVN insures that
 whatever work we do to update the LICENSE/NOTICE files for those Jars
 remains valid (because the Jar doesn't (potentially) change).
 
 By policy, non-SNAPSHOT artifacts in the Maven repository never change,
 and each artifact is accompanied by checksums that guard against
 corruption. It's possible for a user to mess up the files in their
 local Maven repository, but it's probably just as likely that they'd
 mess up any files in ./lib.
 
 To me the proposed solution sounds like extra effort with little or no 
 benefit.

One benefit I see is that you have only one NOTICE/LICENSE file
for the source and binary distribution.  What's more, if your
source distribution does not include the dependencies and you
therefore don't mention them in your NOTICE/LICENSE files, it
might come as a surprise to users that the build pulls in all
those files they didn't know about (or they don't even notice,
which would be even worse).

--Thilo

 
 BR,
 
 Jukka Zitting


Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Marshall Schor
Jukka Zitting wrote:
 Hi,

 On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
   
 After thinking about this for a while, and considering both methods, I
 think the most reliable way to handle 3rd party Jars is to manually put
 them into the lib/ directory, once, and then check the lib/ directory
 into SVN.  This avoids build issues in the future which could occur if
 the Jar obtained from the maven dependency plugin is somehow corrupted,
 or changes level, etc.  Also, having the Jars in SVN insures that
 whatever work we do to update the LICENSE/NOTICE files for those Jars
 remains valid (because the Jar doesn't (potentially) change).
 

 By policy, non-SNAPSHOT artifacts in the Maven repository never change,
 and each artifact is accompanied by checksums that guard against
 corruption. It's possible for a user to mess up the files in their
 local Maven repository, but it's probably just as likely that they'd
 mess up any files in ./lib.

 To me the proposed solution sounds like extra effort with little or no 
 benefit.
   
Interesting.  Is there any *enforcement* of the policy of not updating
non-SNAPSHOT artifacts (with their checksums)?  For instance, if someone
(say, a non-apache project, by accident) runs a maven script to deploy a
changed artifact (and its checksum) to maven central, would that be
blocked somehow?

It also seems that, if there is this policy, the default for the
maven repository control for *releases* would have an
<updatePolicy>never</updatePolicy> in the maven SuperPom (relying on the
policy that the repository artifact is never updated), but it isn't
there.  There is an <updatePolicy>never</updatePolicy> in the SuperPom
for its own plugin-repository, though.
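
For reference, a project that wanted this behaviour for release artifacts could presumably state it explicitly rather than rely on the SuperPom. The following is a rough sketch, not the SuperPom's actual contents; the repository id and URL are assumptions chosen for illustration.

```xml
<!-- Sketch only: not copied from the SuperPom; id and URL are illustrative. -->
<repositories>
  <repository>
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2</url>
    <releases>
      <enabled>true</enabled>
      <!-- Never re-check the remote repository for a release already in .m2. -->
      <updatePolicy>never</updatePolicy>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```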

-Marshall
 BR,

 Jukka Zitting


   


Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Thilo Goetz
Jukka Zitting wrote:
...
 The LICENSE/NOTICE point that Thilo brought up is a valid one, though
 especially when the default build embeds external dependencies to the
 build target, it's quite OK to include also their licenses in the
 licensing metadata even if those dependencies strictly speaking aren't
 being shipped as a part of the source distribution.

You think so?  Even with all the "nothing goes into a NOTICE
file unless it has to" hubbub?  We don't want to create a new
maven flame war when we take this release to the IPMC :-)

However in that case I guess I would also prefer to have
maven fetch the dependencies.  Seems wasteful to have them
in svn.  Not sure how much work it is to change the build,
though...

--Thilo


Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Thilo Goetz
See my other mail in this thread, but also inline
below.

--Thilo

Marshall Schor wrote:
 
 Thilo Goetz wrote:
 Jukka Zitting wrote:
   
 Hi,

 On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
 
 After thinking about this for a while, and considering both methods, I
 think the most reliable way to handle 3rd party Jars is to manually put
 them into the lib/ directory, once, and then check the lib/ directory
 into SVN.  This avoids build issues in the future which could occur if
 the Jar obtained from the maven dependency plugin is somehow corrupted,
 or changes level, etc.  Also, having the Jars in SVN insures that
 whatever work we do to update the LICENSE/NOTICE files for those Jars
 remains valid (because the Jar doesn't (potentially) change).
   
 By policy, non-SNAPSHOT artifacts in the Maven repository never change,
 and each artifact is accompanied by checksums that guard against
 corruption. It's possible for a user to mess up the files in their
 local Maven repository, but it's probably just as likely that they'd
 mess up any files in ./lib.

 To me the proposed solution sounds like extra effort with little or no 
 benefit.
 
 One benefit I see is that you have only one NOTICE/LICENSE file
 for the source and binary distribution.  
 I had trouble understanding this, until I guessed that you're assuming
 here that we will include these Jars in the src distribution, is this
 right?
 What's more, if your
 source distribution does not include the dependencies and you
 therefore don't mention them in your NOTICE/LICENSE files, it
 might come as a surprise to users that the build pulls in all
 those files they didn't know about (or they don't even notice,
 which would be even worse).
   
 
 I see.  The choices here:
 
 * case 1: having 3rd party Jars checked into SVN
   1.a) shipping these in the src distrib: have to match LIC/NOT pair
 with bin, no surprise
   1.b) not shipping these in the src distrib: different LIC/NOT pair
 for src, potential surprise when building with maven

I don't think 1b is an option.  If they're in svn, they
need to be in the source distribution.

 
 * case 2: not having 3rd party Jars checked into SVN
different LIC/NOT pair for src, potential surprise when building with
 maven
 
 Do you prefer approach 1.a?  That's what I'm now thinking is best.  I
 had been thinking 1.b) because I didn't think through the reasons to
 ship the Jars with the source distribution.
 
 -Marshall
 
 --Thilo

   
 BR,

 Jukka Zitting
 

   


Re: A common approach to sandbox projects distribution of 3rd party jars

2009-09-21 Thread Jukka Zitting
Hi,

On Mon, Sep 21, 2009 at 5:38 PM, Thilo Goetz twgo...@gmx.de wrote:
 Jukka Zitting wrote:
 The LICENSE/NOTICE point that Thilo brought up is a valid one, though
 especially when the default build embeds external dependencies to the
 build target, it's quite OK to include also their licenses in the
 licensing metadata even if those dependencies strictly speaking aren't
 being shipped as a part of the source distribution.

 You think so?  Even with all the "nothing goes into a NOTICE
 file unless it has to" hubbub?  We don't want to create a new
 maven flame war when we take this release to the IPMC :-)

Yes. The recent PDFBox release used this same approach, and I commented [1]:

The LICENSE and NOTICE files included in META-INF of the
pre-compiled jar cover also external dependencies, which is not
necessary in that context. There's also been some recent Apache
legal discussion (see LEGAL-62) about NOTICE file contents, and
it seems that my earlier understanding resulted in us including too
many details in NOTICE. Neither of these issues is too serious
and certainly not a blocker for the release.

In UIMA the case of using the same LICENSE and NOTICE files for source
and binary distributions is even stronger especially for builds that
by default produce a PEAR package that includes the external
dependencies. Your point about potential confusion caused by different
licensing metadata for the source and binary distributions is also a
valid argument for not having two sets of those files.

[1] http://markmail.org/message/2w742diumw7hooan

BR,

Jukka Zitting


A common approach to sandbox projects distribution of 3rd party jars

2009-09-20 Thread Marshall Schor
I'm going through our build process for the Sandbox Distribution, and
consolidating/unifying many aspects of it (not yet checked in - still a
work in progress).

One area I've examined:  Many sandbox projects rely on and redistribute
3rd party Jars (that are Apache licensed, or are otherwise OK to
distribute).

All (I think) of these Jars are dependencies (in the Maven system) that
maven can automatically download from its repositories, to the .m2 local
repo.

Many sandbox projects put these Jars into the project's lib/ directory.

Some sandbox projects check in these Jars into SVN, and build their
distributions by copying their lib/ dir to the distribution; others use
the maven dependency plugin to get these jars into the local repository
(if not already there) and then copy that into the distribution.
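
For readers unfamiliar with the second method, it typically looks something like the fragment below. This is a minimal sketch and not the pom of Lucas or any other sandbox project; the execution id, the phase, the output directory and the scope filter are all assumptions chosen for illustration.

```xml
<!-- Sketch only: execution id, phase and paths are illustrative assumptions. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <id>copy-third-party-jars</id>
          <phase>package</phase>
          <goals>
            <!-- Resolves the dependencies (downloading them into .m2 if needed)
                 and copies the resolved Jars to outputDirectory. -->
            <goal>copy-dependencies</goal>
          </goals>
          <configuration>
            <outputDirectory>${project.build.directory}/lib</outputDirectory>
            <includeScope>runtime</includeScope>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```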

After thinking about this for a while, and considering both methods, I
think the most reliable way to handle 3rd party Jars is to manually put
them into the lib/ directory, once, and then check the lib/ directory
into SVN.  This avoids build issues in the future which could occur if
the Jar obtained from the maven dependency plugin is somehow corrupted,
or changes level, etc.  Also, having the Jars in SVN insures that
whatever work we do to update the LICENSE/NOTICE files for those Jars
remains valid (because the Jar doesn't (potentially) change).

Examples of projects copying Jars into their lib/ dir manually:
  BSFAnnotator
  DictionaryAnnotator
  RegularExpressionAnnotator
  SimpleServer
Examples of projects using the maven dependency plugin:
  ConfigurableFeatureExtractor (work in progress, pom doing this not yet
checked in)
  Lucas

So, unless there are strong objections, I'm going to be changing the
sandbox build to consistently do the following:
  - for 3rd party dependencies - expect the Jars to be manually put in
the lib/ dir and checked into SVN
  - put the lib/ into the bin distribution
  - if a pear is being built, put the lib/ into the pear
  - not automatically populate the lib/ using the maven dependency
plugin mechanism
  - change the maven descriptors for these dependencies, where needed,
to <scope>system</scope> indicating these jars are in the lib/, and add
the systemPath element.

This places a small burden on the developers when creating a new project
to obtain the needed 3rd party Jars once and put them in the lib/ dir. 
One way to do this is to initially code the maven 3rd party Jar
dependencies with no scope (defaulting thereby to compile) and let
maven get these Jars from searching its repositories.  Then, copy them
from the .m2 local repository to the lib/ dir, and change the scope to
system, and set <systemPath>${basedir}/lib/.jar</systemPath>.
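
As an illustration of where that last step ends up, a system-scoped dependency entry might look roughly like this. The groupId, artifactId, version and Jar file name are hypothetical placeholders, not the coordinates of any Jar actually used in the sandbox.

```xml
<!-- Sketch only: coordinates and file name are hypothetical placeholders. -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>example-lib</artifactId>
  <version>1.0</version>
  <!-- system scope: Maven does not look in any repository for this artifact;
       it uses the file named by systemPath instead. -->
  <scope>system</scope>
  <systemPath>${basedir}/lib/example-lib-1.0.jar</systemPath>
</dependency>
```

With an entry like this, the Jar has to be present in lib/ (that is, checked into SVN) before the project will build.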

-Marshall