Re: A common approach to sandbox projects distribution of 3rd party jars
Hi Marshall,

I mostly agree with you about putting the binaries into the lib dir in SVN for the release, because we would like the checkout/compile phase to be as easy and quick as possible. Though it could be somewhat annoying, I think it would be nice in the near future (after the release) to leave those binaries out and use long-term stable artifact versions from the central Maven repository where possible.

Tommaso

2009/9/20 Marshall Schor m...@schor.com

> I'm going through our build process for the Sandbox Distribution, and consolidating/unifying many aspects of it (not yet checked in - still a work in progress).
>
> One area I've examined: Many sandbox projects rely on and redistribute 3rd party Jars (that are Apache licensed, or are otherwise OK to distribute). All (I think) of these Jars are dependencies (in the Maven system) that maven can automatically download from its repositories, to the .m2 local repo.
>
> Many sandbox projects put these Jars into the project's lib/ directory. Some sandbox projects check in these Jars into SVN, and build their distributions by copying their lib/ dir to the distribution; others use the maven dependency plugin to get these jars into the local repository (if not already there) and then copy that into the distribution.
>
> After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).
>
> Examples of projects copying Jars into their lib/ dir manually:
>   BSFAnnotator
>   DictionaryAnnotator
>   RegularExpressionAnnotator
>   SimpleServer
>
> Examples of projects using the maven dependency plugin:
>   ConfigurableFeatureExtractor (work in progress, pom doing this not yet checked in)
>   Lucas
>
> So, unless there are strong objections, I'm going to be changing the sandbox build to consistently do the following:
>  - for 3rd party dependencies - expect the Jars to be manually put in the lib/ dir and checked into SVN
>  - put the lib/ into the bin distribution
>  - if a pear is being built, put the lib/ into the pear
>  - not automatically populate the lib/ using the maven dependency plugin mechanism
>  - change the maven descriptors for these dependencies, where needed, to <scope>system</scope> indicating these jars are in the lib/, and add the <systemPath> element.
>
> This places a small burden on the developers when creating a new project to obtain the needed 3rd party Jars once and put them in the lib/ dir. One way to do this is to initially code the maven 3rd party Jar dependencies with no scope (defaulting thereby to compile) and let maven get these Jars from searching its repositories. Then, copy them from the .m2 local repository to the lib/ dir, and change the scope to system, and set <systemPath>${basedir}/lib/.jar</systemPath>.
>
> -Marshall
Re: A common approach to sandbox projects distribution of 3rd party jars
Hi,

On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
> After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).

By policy, non-SNAPSHOT artifacts in the Maven repository never change, and each artifact is accompanied by checksums that guard against corruption. It's possible for a user to mess up the files in their local Maven repository, but it's probably just as likely that they'd mess up any files in ./lib.

To me the proposed solution sounds like extra effort with little or no benefit.

BR,

Jukka Zitting
Re: A common approach to sandbox projects distribution of 3rd party jars
Jukka Zitting wrote:
> Hi,
>
> On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
>> After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).
>
> By policy, non-SNAPSHOT artifacts in the Maven repository never change, and each artifact is accompanied by checksums that guard against corruption. It's possible for a user to mess up the files in their local Maven repository, but it's probably just as likely that they'd mess up any files in ./lib.
>
> To me the proposed solution sounds like extra effort with little or no benefit.

One benefit I see is that you have only one NOTICE/LICENSE file for the source and binary distribution. What's more, if your source distribution does not include the dependencies and you therefore don't mention them in your NOTICE/LICENSE files, it might come as a surprise to users that the build pulls in all those files they didn't know about (or they don't even notice, which would be even worse).

--Thilo

> BR,
>
> Jukka Zitting
Re: A common approach to sandbox projects distribution of 3rd party jars
Jukka Zitting wrote:
> Hi,
>
> On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
>> After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).
>
> By policy, non-SNAPSHOT artifacts in the Maven repository never change, and each artifact is accompanied by checksums that guard against corruption. It's possible for a user to mess up the files in their local Maven repository, but it's probably just as likely that they'd mess up any files in ./lib.
>
> To me the proposed solution sounds like extra effort with little or no benefit.

Interesting. Is there any *enforcement* of the policy of not updating non-SNAPSHOT artifacts (with their checksums)? For instance, if someone (say, a non-apache project, by accident) runs a maven script to deploy a changed artifact (and its checksum) to maven central, would that be blocked somehow?

It also seems that if there is this policy, the default for the maven repository control for *releases* would have an <updatePolicy>never</updatePolicy> in the maven SuperPom (relying on the policy that the repository artifact is never updated), but it isn't there. There is an <updatePolicy>never</updatePolicy> in the SuperPom for its own plugin-repository, though.

-Marshall

> BR,
>
> Jukka Zitting
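For reference, the kind of release-repository setting discussed in this message can be declared explicitly in a project's own pom. The fragment below is only a sketch and is not copied from the SuperPom: the repository id and URL are assumptions chosen for illustration, and the <checksumPolicy> element is added to show how the checksum point raised earlier in the thread could be made strict.

```xml
<!-- Sketch only: a POM fragment for illustration, not the actual SuperPom contents. -->
<repositories>
  <repository>
    <!-- id and url are assumptions; substitute the repository the project really uses -->
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2</url>
    <releases>
      <enabled>true</enabled>
      <!-- never re-check the remote repository for a release that is already in .m2 -->
      <updatePolicy>never</updatePolicy>
      <!-- fail the build if a downloaded artifact does not match its checksum -->
      <checksumPolicy>fail</checksumPolicy>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
```

Note that this only controls what the local build does with artifacts it downloads; it says nothing about whether the remote repository itself blocks a redeployed release, which is the enforcement question asked above.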
Re: A common approach to sandbox projects distribution of 3rd party jars
Jukka Zitting wrote:
> ...
> The LICENSE/NOTICE point that Thilo brought up is a valid one, though especially when the default build embeds external dependencies to the build target, it's quite OK to include also their licenses in the licensing metadata even if those dependencies strictly speaking aren't being shipped as a part of the source distribution.

You think so? Even with all the "nothing goes into a NOTICE file unless it has to" hubbub? We don't want to create a new maven flame war when we take this release to the IPMC :-)

However, in that case I guess I would also prefer to have maven fetch the dependencies. Seems wasteful to have them in svn. Not sure how much work it is to change the build, though...

--Thilo
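As a rough idea of what "have maven fetch the dependencies" could involve on the build side, the fragment below uses the copy-dependencies goal of the maven dependency plugin to populate lib/ during the build. It is a sketch under assumptions, not the configuration any sandbox project actually uses: the execution id, the package phase, and the runtime scope filter are all illustrative choices.

```xml
<!-- Sketch only: populate lib/ from the Maven repositories instead of from SVN. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <!-- execution id and phase are assumptions for illustration -->
          <id>copy-3rd-party-jars</id>
          <phase>package</phase>
          <goals>
            <goal>copy-dependencies</goal>
          </goals>
          <configuration>
            <!-- copy the resolved Jars next to the project, where the distribution expects them -->
            <outputDirectory>${basedir}/lib</outputDirectory>
            <!-- runtime scope selects compile and runtime dependencies, leaving out test-only ones -->
            <includeScope>runtime</includeScope>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```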
Re: A common approach to sandbox projects distribution of 3rd party jars
See my other mail in this thread, but also inline below.

--Thilo

Marshall Schor wrote:
> Thilo Goetz wrote:
>> Jukka Zitting wrote:
>>> Hi,
>>>
>>> On Sun, Sep 20, 2009 at 10:19 PM, Marshall Schor m...@schor.com wrote:
>>>> After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).
>>>
>>> By policy, non-SNAPSHOT artifacts in the Maven repository never change, and each artifact is accompanied by checksums that guard against corruption. It's possible for a user to mess up the files in their local Maven repository, but it's probably just as likely that they'd mess up any files in ./lib.
>>>
>>> To me the proposed solution sounds like extra effort with little or no benefit.
>>
>> One benefit I see is that you have only one NOTICE/LICENSE file for the source and binary distribution.
>
> I had trouble understanding this, until I guessed that you're assuming here that we will include these Jars in the src distribution, is this right?
>
>> What's more, if your source distribution does not include the dependencies and you therefore don't mention them in your NOTICE/LICENSE files, it might come as a surprise to users that the build pulls in all those files they didn't know about (or they don't even notice, which would be even worse).
>
> I see. The choices here:
>
> * case 1: having 3rd party Jars checked into SVN
>     1.a) shipping these in the src distrib: have to match LIC/NOT pair with bin, no surprise
>     1.b) not shipping these in the src distrib: different LIC/NOT pair for src, potential surprise when building with maven

I don't think 1b is an option. If they're in svn, they need to be in the source distribution.

> * case 2: not having 3rd party Jars checked into SVN
>     different LIC/NOT pair for src, potential surprise when building with maven
>
> Do you prefer approach 1.a? That's what I'm now thinking is best. I had been thinking 1.b) because I didn't think through the reasons to ship the Jars with the source distribution.
>
> -Marshall
>
>> --Thilo
>>
>>> BR,
>>>
>>> Jukka Zitting
Re: A common approach to sandbox projects distribution of 3rd party jars
Hi,

On Mon, Sep 21, 2009 at 5:38 PM, Thilo Goetz twgo...@gmx.de wrote:
> Jukka Zitting wrote:
>> The LICENSE/NOTICE point that Thilo brought up is a valid one, though especially when the default build embeds external dependencies to the build target, it's quite OK to include also their licenses in the licensing metadata even if those dependencies strictly speaking aren't being shipped as a part of the source distribution.
>
> You think so? Even with all the "nothing goes into a NOTICE file unless it has to" hubbub? We don't want to create a new maven flame war when we take this release to the IPMC :-)

Yes. The recent PDFBox release used this same approach, and I commented [1]:

    The LICENSE and NOTICE files included in META-INF of the pre-compiled jar cover also external dependencies, which is not necessary in that context. There's also been some recent Apache legal discussion (see LEGAL-62) about NOTICE file contents, and it seems that my earlier understanding resulted in us including too many details in NOTICE. Neither of these issues is too serious and certainly not a blocker for the release.

In UIMA the case of using the same LICENSE and NOTICE files for source and binary distributions is even stronger especially for builds that by default produce a PEAR package that includes the external dependencies.

Your point about potential confusion caused by different licensing metadata for the source and binary distributions is also a valid argument for not having two sets of those files.

[1] http://markmail.org/message/2w742diumw7hooan

BR,

Jukka Zitting
A common approach to sandbox projects distribution of 3rd party jars
I'm going through our build process for the Sandbox Distribution, and consolidating/unifying many aspects of it (not yet checked in - still a work in progress).

One area I've examined: Many sandbox projects rely on and redistribute 3rd party Jars (that are Apache licensed, or are otherwise OK to distribute). All (I think) of these Jars are dependencies (in the Maven system) that maven can automatically download from its repositories, to the .m2 local repo.

Many sandbox projects put these Jars into the project's lib/ directory. Some sandbox projects check in these Jars into SVN, and build their distributions by copying their lib/ dir to the distribution; others use the maven dependency plugin to get these jars into the local repository (if not already there) and then copy that into the distribution.

After thinking about this for a while, and considering both methods, I think the most reliable way to handle 3rd party Jars is to manually put them into the lib/ directory, once, and then check the lib/ directory into SVN. This avoids build issues in the future which could occur if the Jar obtained from the maven dependency plugin is somehow corrupted, or changes level, etc. Also, having the Jars in SVN ensures that whatever work we do to update the LICENSE/NOTICE files for those Jars remains valid (because the Jar doesn't (potentially) change).

Examples of projects copying Jars into their lib/ dir manually:
  BSFAnnotator
  DictionaryAnnotator
  RegularExpressionAnnotator
  SimpleServer

Examples of projects using the maven dependency plugin:
  ConfigurableFeatureExtractor (work in progress, pom doing this not yet checked in)
  Lucas

So, unless there are strong objections, I'm going to be changing the sandbox build to consistently do the following:
 - for 3rd party dependencies - expect the Jars to be manually put in the lib/ dir and checked into SVN
 - put the lib/ into the bin distribution
 - if a pear is being built, put the lib/ into the pear
 - not automatically populate the lib/ using the maven dependency plugin mechanism
 - change the maven descriptors for these dependencies, where needed, to <scope>system</scope> indicating these jars are in the lib/, and add the <systemPath> element.

This places a small burden on the developers when creating a new project to obtain the needed 3rd party Jars once and put them in the lib/ dir. One way to do this is to initially code the maven 3rd party Jar dependencies with no scope (defaulting thereby to compile) and let maven get these Jars from searching its repositories. Then, copy them from the .m2 local repository to the lib/ dir, and change the scope to system, and set <systemPath>${basedir}/lib/.jar</systemPath>.

-Marshall
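The last step described above (switching a dependency to system scope once its Jar is in lib/) might look roughly like the following in a project's pom. This is a minimal sketch: the groupId, artifactId, version, and Jar file name are hypothetical placeholders, not coordinates of any real sandbox dependency.

```xml
<!-- Sketch only: coordinates and file name below are hypothetical placeholders. -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>some-library</artifactId>
  <version>1.0</version>
  <!-- system scope: maven does not look in any repository for this Jar -->
  <scope>system</scope>
  <!-- the Jar is expected at this path, i.e. checked into SVN under lib/ -->
  <systemPath>${basedir}/lib/some-library-1.0.jar</systemPath>
</dependency>
```

While first setting up a project, the same entry without the <scope> and <systemPath> elements lets maven resolve the Jar from its repositories into .m2, from where it can be copied into lib/.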