The first pass at separating the umls resources from ASF is ready...  
Basically, developers can now pick and choose the ctakes resources by 
artifactId.  

Details: The below steps had to be done:
1) UMLS resource project(s) are left behind on SourceForge under new projects:
http://svn.code.sf.net/p/ctakesresources/code/trunk/
[New account and space created for net.sourceforge.ctakesresources]

2) New modules deployed to oss sonatype and maven central:
https://oss.sonatype.org/index.html#nexus-search;quick~ctakesresources
[New account and space created for net.sourceforge.ctakesresources]

3) The appropriate ctakes modules (e.g. ctakes-dictionary-lookup/pom.xml) now 
just need to include:
                        <dependency>
                                <groupId>net.sourceforge.ctakesresources</groupId>
                                <artifactId>ctakes-resources-umls2011ab</artifactId>
                                <version>3.0.0</version>
                        </dependency>
4) Finally, to make it transparent for developers, I added the 
maven-dependency-plugin:unpack-dependencies goal to unzip them into target.  
This is because things like Lucene need them as unpacked files rather than 
inside a jar.
4a) End users could just download the resources zip file from 
https://sourceforge.net/projects/ctakesresources/files/ and add it to their 
resources folder and provide their umls username/pw during execution.
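
For reference, the unpack step in 4) could look roughly like the following in 
the module's pom.xml. This is only a sketch: the execution id, the phase and 
the outputDirectory are assumptions for illustration and are not copied from 
the actual ctakes POM. Only the plugin goal and the groupId come from the 
steps above.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <!-- Hypothetical id, chosen for this example -->
          <id>unpack-umls-resources</id>
          <!-- Assumed phase: early enough that the files exist before tests run -->
          <phase>process-resources</phase>
          <goals>
            <goal>unpack-dependencies</goal>
          </goals>
          <configuration>
            <!-- Only unpack the externally hosted resource artifacts -->
            <includeGroupIds>net.sourceforge.ctakesresources</includeGroupIds>
            <!-- Assumed location under target; adjust to wherever the module expects them -->
            <outputDirectory>${project.build.directory}/classes</outputDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Restricting the unpack with includeGroupIds keeps the other dependencies as 
ordinary jars, so only the resource artifacts end up as loose files.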

Note: Only the umls resources have been separated now due to the ASF licensing 
incompatibilities, but other projects should be able to do the same using this 
mechanism.

--Pei

> -----Original Message-----
> From: Jörn Kottmann [mailto:[email protected]]
> Sent: Monday, November 05, 2012 7:42 AM
> To: [email protected]
> Subject: Re: [DISCUSS] What should we do with cTAKES resources?
> 
> In my opinion we should release what we can from here at Apache and only
> the resources which have an incompatible license need to be handled
> differently, e.g. external site.
> 
> Models which are trained on private clinical data can be released as long as
> the original creator decides to license them under AL 2.0. If that is done by a
> committer it should be fine to just check them in or put them on the website.
> 
> The wikipedia license is compatible and an index of it as well, but we
> probably need to have attribution for it in a NOTICE file, and maybe include
> the license in the LICENSE file.
> 
> Jörn
> 
> On 11/02/2012 10:46 PM, Chen, Pei wrote:
> > I think we postponed this topic previously and since the ASF code seems to
> > be in decent shape now, I think it's time to revisit this discussion for the
> > longer term.
> > Currently, we have the below resources bundled with our source code
> > and distribution
> >
> > -          UMLS dictionaries (hsqldb format and in lucene indexes)
> >
> > -          Models (which were okay to be released as open source) that have
> > been trained from various clinical data
> >
> > -          Wikipedia index
> >
> > What are our options as ASF source code, binaries, models,
> > dependencies all need to be compliant with ASL 2.0
> > (http://www.apache.org/legal/3party.html)
> >
> > 1)      Leave things as they are, but we need to confirm with the sources and
> > also will probably need to seek approval from Apache Legal for each of the
> > resources
> >
> > 2)      Host the resources externally such as SourceForge similar to OpenNLP
> > models (http://opennlp.sourceforge.net/models-1.5/)
> >
> > a.       Single zip per release for users to download?
> >
> > Option 2 seems the least painful in terms of compliance.
> > Since 3.0.0-incubating, each resource has a fully qualified name/path and is
> > read from the classpath so it should be fairly easy if we decided to pull it in
> > from external sources.
> >
> > --Pei
> >
> >
