On 10.06.2012 at 00:44, James Kosin wrote:
>
> It is one of the things we are working on. The problem is most if not
> all the models are currently trained on copyrighted material that
> restricts the usage of the resulting trained data to research purposes ONLY.
> We currently host the models on another site; due to this limitation and
> the licensing conflict that would result if we tried to host on Apache.
>
> You are more than welcome to help, if you choose.

I'm working on the DKPro Core [1] project (UIMA-based NLP components). The project integrates a growing number of different NLP tools into a common interoperable framework. We have now started integrating OpenNLP as well. We figured that our preferred API and way of using UIMA are sufficiently different from OpenNLP's UIMA integration that we started writing our own. Well, so much for the background.

We have a public Artifactory (Maven repository) up and running, on which we host our open source artifacts that we cannot put on Maven Central for one reason or another. We wouldn't mind hosting additional models as long as redistribution is not explicitly prohibited. In fact, we already host several of the OpenNLP models [2] in that Maven repository. We do not simply host the bin files, but wrap them in JARs, which makes it easier to add them as Maven dependencies and load them from the classpath.

So if you are looking for a place to put redistributable OpenNLP models (research-only is OK for us), feel free to drop me a note. The only thing we ask for is some information about the license and redistributability, so we can make sure redistribution is not explicitly prohibited.

Feel free to use the wrapped models we already have as Maven dependencies in your own projects. The model JARs contain the bin file and a bit of metadata. If you like the wrapped models and need other models wrapped, just tell me.

-- Richard

[1] http://code.google.com/p/dkpro-core-asl/
[2] https://zoidberg.ukp.informatik.tu-darmstadt.de/artifactory/webapp/search/artifact?q=opennlp-model

-- 
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD)
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
eck...@ukp.informatik.tu-darmstadt.de
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC)
www.werc.tu-darmstadt.de
-------------------------------------------------------------------
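
For readers who want to try the classpath approach mentioned above, here is a minimal sketch of reading a model file that was packaged inside a JAR. It uses only the Java standard library. The class name `ClasspathModelLoader` and the resource path `/models/example-opennlp-model.bin` are hypothetical: the mail does not say how the wrapped JARs lay out their contents, so check the actual JAR for the real path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

public class ClasspathModelLoader {

    // Hypothetical resource path (an assumption for illustration only);
    // the real location inside a wrapped model JAR may differ.
    static final String MODEL_RESOURCE = "/models/example-opennlp-model.bin";

    /**
     * Reads a classpath resource fully into memory.
     * Returns an empty Optional if the resource is not on the classpath.
     */
    static Optional<byte[]> readResource(String path) throws IOException {
        // getResourceAsStream returns null when the resource cannot be found;
        // try-with-resources tolerates a null resource.
        try (InputStream in = ClasspathModelLoader.class.getResourceAsStream(path)) {
            if (in == null) {
                return Optional.empty();
            }
            return Optional.of(in.readAllBytes()); // requires Java 9 or later
        }
    }

    public static void main(String[] args) throws IOException {
        Optional<byte[]> model = readResource(MODEL_RESOURCE);
        if (model.isPresent()) {
            System.out.println("Loaded " + model.get().length + " bytes from " + MODEL_RESOURCE);
        } else {
            System.out.println("Model not found on classpath: " + MODEL_RESOURCE);
        }
    }
}
```

In a real project the `InputStream` would probably be handed straight to an OpenNLP model class (for example a constructor that accepts an `InputStream`, such as `TokenizerModel`) instead of being read into a byte array. That step is left out here so the sketch runs without any dependencies.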