Split up the EntityMentionEnhancementEngine into two separate ones
------------------------------------------------------------------
Key: STANBOL-47
URL: https://issues.apache.org/jira/browse/STANBOL-47
Project: Stanbol
Issue Type: Improvement
Components: FISE
Reporter: Fabian Christ
Assignee: Olivier Grisel
Reported by project member rupert.westenthaler, Jun 16, 2010
Currently the EntityMentionEnhancementEngine does two things:
- First, it extracts named entities (currently Persons, Organisations and
Places) from the content. The OpenNLP framework is used for this.
- Second, it recommends up to three entities as defined in Wikipedia for the
named entities found. The autotagger component is used for this.
Expected Result:
1) An engine that provides named entity extraction based on OpenNLP. This
engine creates TextAnnotation type enhancements.
2) An engine that provides recommendations for entries as defined in DBpedia.
This engine consumes TextAnnotation type enhancements and produces EntityType
enhancements.
This would also make it possible to
- use another natural language processing framework for the named entity
extraction
- use other engines to calculate entity recommendations for text annotations.
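The intended split can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the FISE API: the types
TextAnnotation, EntitySuggestion, NamedEntityExtractor and EntityRecommender
are hypothetical stand-ins, the extractor is a plain dictionary lookup in
place of OpenNLP, and the recommender is an in-memory map in place of the
autotagger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified types for illustration; not the actual FISE classes.
class SplitPipelineSketch {

    /** Stand-in for a TextAnnotation: a named entity mention found in the text. */
    static final class TextAnnotation {
        final String selectedText;
        final String type; // e.g. "Person", "Organisation", "Place"
        TextAnnotation(String selectedText, String type) {
            this.selectedText = selectedText;
            this.type = type;
        }
    }

    /** Stand-in for the entity enhancement produced for one mention. */
    static final class EntitySuggestion {
        final TextAnnotation mention;
        final String entityUri;
        EntitySuggestion(TextAnnotation mention, String entityUri) {
            this.mention = mention;
            this.entityUri = entityUri;
        }
    }

    /** Engine 1: produces text annotations from raw content. */
    interface NamedEntityExtractor {
        List<TextAnnotation> extract(String content);
    }

    /** Engine 2: consumes a text annotation and produces entity suggestions. */
    interface EntityRecommender {
        List<EntitySuggestion> recommend(TextAnnotation mention);
    }

    // The issue mentions "up to three" recommendations per named entity.
    static final int MAX_SUGGESTIONS = 3;

    /** Toy extractor (assumption): reports every known name contained in the content. */
    static NamedEntityExtractor toyExtractor(Map<String, String> knownNames) {
        return content -> {
            List<TextAnnotation> found = new ArrayList<>();
            for (Map.Entry<String, String> e : knownNames.entrySet()) {
                if (content.contains(e.getKey())) {
                    found.add(new TextAnnotation(e.getKey(), e.getValue()));
                }
            }
            return found;
        };
    }

    /** Toy recommender (assumption): looks up candidate URIs by the mention text. */
    static EntityRecommender toyRecommender(Map<String, List<String>> index) {
        return mention -> {
            List<EntitySuggestion> out = new ArrayList<>();
            List<String> candidates = index.get(mention.selectedText);
            if (candidates == null) {
                return out;
            }
            for (String uri : candidates) {
                if (out.size() == MAX_SUGGESTIONS) {
                    break;
                }
                out.add(new EntitySuggestion(mention, uri));
            }
            return out;
        };
    }

    /** Runs extraction first, then feeds each mention to the recommender. */
    static List<EntitySuggestion> enhance(String content,
                                          NamedEntityExtractor extractor,
                                          EntityRecommender recommender) {
        List<EntitySuggestion> all = new ArrayList<>();
        for (TextAnnotation mention : extractor.extract(content)) {
            all.addAll(recommender.recommend(mention));
        }
        return all;
    }
}
```

Because the two steps only share the TextAnnotation type, either side could be
swapped out independently, which is the flexibility the two points above ask
for.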
Comment 2 by project member rupert.westenthaler, Jun 17, 2010
Current state:
There are two new Engines:
- eu.iksproject.fise.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine:
This engine uses OpenNLP to perform named entity extraction. It still has a
dependency on the configured autotagging provider, because the OpenNLP models
are loaded via that bundle's context.
- eu.iksproject.fise.engines.autotagging.impl.EntityMentionEnhancementEngine:
This engine uses the autotagger to calculate entity recommendations for
TextAnnotations.
Both EnhancementEngines implement the ServiceProperties interface to provide
information about ordering.
The NamedEntityExtractionEnhancementEngine needs to run first, because it
produces the TextAnnotations consumed by the EntityMentionEnhancementEngine.
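As a rough illustration of this ordering requirement, the sketch below sorts
engines by an ordering value that each engine exposes through a property map.
Everything in it is an assumption made for the example: the Engine interface,
the "engine.order" key, the numeric values and the rule that higher values run
first are hypothetical and are not taken from the FISE ServiceProperties
interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; key name, values and sort direction are assumptions.
class EngineOrderingSketch {

    /** Hypothetical property key under which an engine reports its ordering. */
    static final String ORDERING_KEY = "engine.order";

    /** Minimal stand-in for an engine that exposes service properties. */
    interface Engine {
        String name();
        Map<String, Object> getServiceProperties();
    }

    /** Creates a toy engine that reports the given ordering value. */
    static Engine engine(final String name, final int order) {
        return new Engine() {
            public String name() {
                return name;
            }
            public Map<String, Object> getServiceProperties() {
                return Collections.<String, Object>singletonMap(ORDERING_KEY, order);
            }
        };
    }

    /** Reads the ordering value, falling back to 0 when none is provided. */
    static int orderOf(Engine e) {
        Object value = e.getServiceProperties().get(ORDERING_KEY);
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    /** Returns engine names in execution order (assumption: higher value runs first). */
    static List<String> executionOrder(List<Engine> engines) {
        List<Engine> sorted = new ArrayList<>(engines);
        Comparator<Engine> byOrder = Comparator.comparingInt(EngineOrderingSketch::orderOf);
        sorted.sort(byOrder.reversed());
        List<String> names = new ArrayList<>();
        for (Engine e : sorted) {
            names.add(e.name());
        }
        return names;
    }
}
```

With this scheme, giving the NamedEntityExtractionEnhancementEngine a higher
value than the EntityMentionEnhancementEngine is enough to make the extraction
step run first, regardless of the order in which the engines were registered.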
TODOs:
- Remove the dependency of the opennlp-ner bundle on the configured
autotagging provider bundle.
- Split up the unit tests.