[jira] [Resolved] (STANBOL-1229) Convert all OpenNLP Enhancement Engines to Configuration Factories

Rupert Westenthaler (JIRA) Tue, 03 Dec 2013 01:17:09 -0800

     [ 
https://issues.apache.org/jira/browse/STANBOL-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rupert Westenthaler resolved STANBOL-1229.
------------------------------------------

    Resolution: Fixed

Fixed with http://svn.apache.org/r1547321 for both trunk and 0.12.

> Convert all OpenNLP Enhancement Engines to Configuration Factories
> ------------------------------------------------------------------
>
>                 Key: STANBOL-1229
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1229
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Enhancement Engines
>    Affects Versions: 0.12.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>             Fix For: 0.12.0
>
>
> Currently the OpenNLP Sentence Detection and Tokenizer Enhancement Engines do 
> not support OSGi Configuration Factories, so only a single instance of each 
> can be configured.
> This causes problems when one wants to configure multiple 
> Enhancement Chains with different NLP frameworks. 
> For example:
> Chain1:
>  * OpenNLP for English, German and Spanish
> Chain2:
>  * Stanford NLP for English
>  * OpenNLP for German
>  * Freeling NLP for Spanish
> As OpenNLP supports all three languages mentioned, a user would like to 
> set up the following engine configurations:
> 1. OpenNLP engines for sentence detection, tokenization, POS tagging and 
> Chunking that include all three languages.
> 2. OpenNLP engines that only process German language texts for sentence 
> detection, tokenization, POS tagging and Chunking
> 3. RESTful NLP Analysis Engine calling StanfordNLP for English language texts
> 4. RESTful NLP Analysis Engine calling Freeling for Spanish language texts
> Chain1 would use the OpenNLP engines configured to process all languages, 
> while Chain2 would use the engine configurations listed under points 2 to 4.
> However, as the OpenNLP Tokenizer and Sentence Detection engines do not support 
> OSGi Configuration Factories, this is currently not possible: only a single 
> instance of each of those two engines can be configured.
> As a result, English and Spanish text sent to Chain2 would be processed by 
> two Sentence Detectors and Tokenizers, which results in duplicate Sentence 
> and Token annotations.
> Adding support for OSGi Configuration Factories to all OpenNLP 
> EnhancementEngines will solve this issue. Existing configurations will not be 
> affected, as all engines already use "ConfigurationPolicy.OPTIONAL", 
> meaning that a default instance with the default configuration is created 
> automatically.
> This issue affects both trunk and the 0.12 release branch.
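
The duplicate-annotation problem described above can be illustrated with a
minimal, self-contained sketch. It does not use any Stanbol or OSGi API: the
Engine class, the engine names ("opennlp-all", "opennlp-de", "stanford-en",
"freeling-es") and the "tokenize" task label are hypothetical stand-ins chosen
for illustration only. The sketch simply lists which tokenizing engines in
Chain2 would accept a text of a given language, first when only one OpenNLP
instance can exist and then when a second, German-only instance is available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ChainSketch {

    /** Hypothetical stand-in for one configured engine instance (not a Stanbol class). */
    static final class Engine {
        final String name;
        final String task;
        final Set<String> languages;

        Engine(String name, String task, Set<String> languages) {
            this.name = name;
            this.task = task;
            this.languages = languages;
        }
    }

    /** Names of the engines in the chain that perform the task for the given language. */
    static List<String> active(List<Engine> chain, String task, String language) {
        List<String> names = new ArrayList<>();
        for (Engine e : chain) {
            if (e.task.equals(task) && e.languages.contains(language)) {
                names.add(e.name);
            }
        }
        return names;
    }

    /** Chain2 when only one OpenNLP tokenizer instance exists (it covers en, de, es). */
    static List<Engine> chain2SingleInstance() {
        return List.of(
            new Engine("stanford-en", "tokenize", Set.of("en")),
            new Engine("opennlp-all", "tokenize", Set.of("en", "de", "es")),
            new Engine("freeling-es", "tokenize", Set.of("es")));
    }

    /** Chain2 when a second, German-only OpenNLP tokenizer instance can be configured. */
    static List<Engine> chain2WithFactories() {
        return List.of(
            new Engine("stanford-en", "tokenize", Set.of("en")),
            new Engine("opennlp-de", "tokenize", Set.of("de")),
            new Engine("freeling-es", "tokenize", Set.of("es")));
    }

    public static void main(String[] args) {
        // Two tokenizers accept English text, so tokens would be annotated twice.
        System.out.println(active(chain2SingleInstance(), "tokenize", "en"));
        // With a German-only OpenNLP instance, exactly one tokenizer accepts English.
        System.out.println(active(chain2WithFactories(), "tokenize", "en"));
    }
}
```

In this model the first line printed contains two engine names and the second
only one, which mirrors the duplicate Sentence and Token annotations the issue
describes and why a per-chain OpenNLP configuration avoids them.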



--
This message was sent by Atlassian JIRA
(v6.1#6144)
