Hi

I will start to work on that in addition to completing STANBOL-141 and
resolving STANBOL-287 because these three things are connected to each
other.

best
Rupert

On Mon, Jul 18, 2011 at 1:39 PM, Fabian Christ
<[email protected]> wrote:
> 2011/7/18 Rupert Westenthaler <[email protected]>:
>> Hi
>>
>> Currently the "org.apache.stanbol.defaultdata" bundle contains all
>> data needed by the Stanbol Launchers.
>> This basically includes the following things:
>>
>> 1. OpenNLP sentence detection for English (used by the opennlp.ner
>> engine, taxonomylinking engine)
>> 2. OpenNLP POS model for English (used by the taxonomylinking engine)
>> 3. OpenNLP name finder models for location, places and organizations
>> for the English language (used by the opennlp.ner engine)
>> 4. Default DBPedia configuration consisting of a 43k entities dbpedia
>> index as well as SolrYard, Cache, ReferencedSite and
>> EntityLinkingEngine configuration
>>
>> Having all this in a single bundle makes it hard to change/remove
>> parts of the default configurations without also affecting other
>> components.
>>
>> Because of this I suggest removing the defaultdata bundle in its
>> current form and instead creating several more focused bundles within
>> the {stanbol-trunk}/data folder.
>> The "default data" would then be determined by the "data" bundles
>> referenced in the bundle list.xml files of the different launchers.
>>
>> Currently I would suggest using three bundles:
>>
>> 1) OpenNLP models for en (Sentence, POS)
>> 2) OpenNLP name finder models for en (location, organization, places)
>> 3) DBPedia.org default configuration
>>
>> For users it would then be easily possible to deactivate parts of the
>> default configuration (e.g. the DBPedia related stuff) by simply
>> stopping or uninstalling the corresponding bundles.
>>
>> WDYT
>
> Yes, +1
>
> --
> Fabian
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
