[
https://issues.apache.org/jira/browse/STANBOL-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180353#comment-13180353
]
Rupert Westenthaler commented on STANBOL-102:
---------------------------------------------
I will implement this in the same way as the KeywordLinkingEngine, with
two configuration options:
* Default Language: If configured, this is used as the default when no language
was detected for a text (e.g. if no language detection engine is active).
* Processed Languages: Configures the list of languages that are
processed by an engine instance. If empty or not present, all languages are
processed. This makes it possible to create multiple instances of the NER engine
(with different configurations) that each process only specific languages.
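
A minimal sketch of how the two options could interact. The class and method
names are hypothetical and not part of the Stanbol API. It also assumes that a
text is skipped when no language is detected and no default is configured, and
that the default language is itself checked against the processed languages;
neither point is settled above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper illustrating the two options; not Stanbol code. */
public class LanguageSelection {

    private final String defaultLanguage;         // null means "none configured"
    private final Set<String> processedLanguages; // empty means "any language"

    public LanguageSelection(String defaultLanguage, String... processedLanguages) {
        this.defaultLanguage = defaultLanguage;
        this.processedLanguages = new HashSet<>(Arrays.asList(processedLanguages));
    }

    /** Returns the language to process, or empty if the text should be skipped. */
    public Optional<String> resolve(String detectedLanguage) {
        // Fall back to the default language only if nothing was detected.
        String language = detectedLanguage != null ? detectedLanguage : defaultLanguage;
        if (language == null) {
            return Optional.empty(); // assumption: skip when no language is known
        }
        // An empty set means that all languages are processed.
        if (!processedLanguages.isEmpty() && !processedLanguages.contains(language)) {
            return Optional.empty();
        }
        return Optional.of(language);
    }

    public static void main(String[] args) {
        LanguageSelection germanOnly = new LanguageSelection("de", "de");
        System.out.println(germanOnly.resolve(null)); // prints Optional[de]
        System.out.println(germanOnly.resolve("en")); // prints Optional.empty
    }
}
```

With the default configuration (no default language, any processed language)
this would process every text that has a detected language and skip the rest.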
In addition, I will change this engine to use the ConfigurationFactory. This
will allow multiple instances to be configured and a default configuration,
with the default values for default language (none) and processed languages
(any), to be included in the Stanbol launchers.
The base framework for dynamically loading OpenNLP NER models for
different languages has already been implemented in the meantime by the OpenNLP
utility (part of the org.apache.stanbol.commons.opennlp module).
> Make the NER enhancement engine able to use different models for different
> languages
> ------------------------------------------------------------------------------------
>
> Key: STANBOL-102
> URL: https://issues.apache.org/jira/browse/STANBOL-102
> Project: Stanbol
> Issue Type: Improvement
> Reporter: Olivier Grisel
> Assignee: Olivier Grisel
>
> Currently, the list of models is hardcoded: the engine always uses
> en-{person,location,organization}-ner.bin. The engine
> should be adapted to look up other models (following the
> {language-code}-{entity-class}-ner.bin filename pattern) according to the
> language of the text. If no such model is found, the engine should refuse
> to compute enhancements instead of using the wrong model, which would output
> spurious annotations.
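
The filename pattern from the issue description can be sketched as follows.
The class name, the method names, and the set of available files are
hypothetical; this is an illustration of the naming scheme and the "refuse if
no model is found" rule, not the loading code in the OpenNLP utility.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of the {language-code}-{entity-class}-ner.bin pattern. */
public class NerModelNames {

    // Entity classes named in the issue description.
    static final String[] ENTITY_CLASSES = {"person", "location", "organization"};

    /** Builds a model file name such as "en-person-ner.bin". */
    static String modelFileName(String languageCode, String entityClass) {
        return languageCode + "-" + entityClass + "-ner.bin";
    }

    /**
     * Returns the model files for the language that exist in availableFiles.
     * An empty result means the engine should not compute enhancements.
     */
    static List<String> findModels(String languageCode, Set<String> availableFiles) {
        List<String> found = new ArrayList<>();
        for (String entityClass : ENTITY_CLASSES) {
            String name = modelFileName(languageCode, entityClass);
            if (availableFiles.contains(name)) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed set of model files; in practice these would come from a data directory.
        Set<String> available = new HashSet<>(Arrays.asList(
                "en-person-ner.bin", "en-location-ner.bin", "en-organization-ner.bin"));
        System.out.println(findModels("en", available));
        // prints [en-person-ner.bin, en-location-ner.bin, en-organization-ner.bin]
        System.out.println(findModels("de", available).isEmpty()); // prints true
    }
}
```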