Hi,

I will start working on that, in addition to completing STANBOL-141 and resolving STANBOL-287, because these three things are connected to each other.
best
Rupert

On Mon, Jul 18, 2011 at 1:39 PM, Fabian Christ <[email protected]> wrote:
> 2011/7/18 Rupert Westenthaler <[email protected]>:
>> Hi
>>
>> Currently the "org.apache.stanbol.defaultdata" bundle contains all
>> data needed by the Stanbol Launchers.
>> This basically includes the following things:
>>
>> 1. OpenNLP sentence detection for English (used by the opennlp.ner
>> engine, taxonomylinking engine)
>> 2. OpenNLP POS model for English (used by the taxonomylinking engine)
>> 3. OpenNLP name finder models for location, places and organizations
>> for the English language (used by the opennlp.ner engine)
>> 4. Default DBPedia configuration consisting of a 43k entities dbpedia
>> index as well as SolrYard, Cache, ReferencedSite and
>> EntityLinkingEngine configuration
>>
>> Having all this in a single bundle makes it hard to change/remove
>> parts of the default configurations without also affecting other
>> components.
>>
>> Because of this I suggest removing the defaultdata bundle in its
>> current form and instead creating several more focused bundles within
>> the {stanbol-trunk}/data folder.
>> The "default data" would then be determined by the "data" bundles
>> referenced in the bundle list.xml files of the different launchers.
>>
>> Currently I would suggest using three bundles:
>>
>> 1) OpenNLP models for en (Sentence, POS)
>> 2) OpenNLP name finder models for en (location, organization, places)
>> 3) DBPedia.org default configuration
>>
>> Users could then easily deactivate parts of the default
>> configuration (e.g. the DBPedia related stuff) by simply
>> stopping or uninstalling the corresponding bundles.
>>
>> WDYT
>
> Yes, +1
>
> --
> Fabian
>

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
