Split up the EntityMentionEnhancementEngine into two separate ones
------------------------------------------------------------------

                 Key: STANBOL-47
                 URL: https://issues.apache.org/jira/browse/STANBOL-47
             Project: Stanbol
          Issue Type: Improvement
          Components: FISE
            Reporter: Fabian Christ
            Assignee: Olivier Grisel


Reported by project member rupert.westenthaler, Jun 16, 2010

Currently the EntityMentionEnhancementEngine does two things:
 - first, it extracts named entities (currently Persons, Organisations and
Places) from the content. The openNLP framework is used for this.
 - second, it recommends up to three Entities, as defined in Wikipedia, for
the named entities found. The autotagger component is used for this.

Expected Result:
1) An engine that provides named entity extraction based on openNLP. This 
engine creates TextAnnotation type Enhancements.
2) An engine that provides recommendations for entries as defined in dbpedia. 
This engine consumes TextAnnotation type enhancements and produces EntityType 
enhancements.

This would also make it possible to:
 - use another natural language processing framework for the named entity 
extraction
 - use other engines to calculate entity recommendations for text annotations.
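
The proposed split can be illustrated with a minimal, self-contained sketch.
None of the types below are the FISE API: all class names are hypothetical, a
toy dictionary stands in for the openNLP models, and a toy lookup table stands
in for the autotagger. The only point shown is that the first engine produces
text annotations and the second engine consumes them, so that either one could
be swapped out independently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a TextAnnotation: a mention found in the text.
final class TextAnnotationSketch {
    final String selectedText;
    final String type; // e.g. "Person", "Organisation", "Place"
    TextAnnotationSketch(String selectedText, String type) {
        this.selectedText = selectedText;
        this.type = type;
    }
}

// Hypothetical stand-in for an entity recommendation attached to a mention.
final class EntitySuggestionSketch {
    final TextAnnotationSketch mention;
    final String entityUri;
    EntitySuggestionSketch(TextAnnotationSketch mention, String entityUri) {
        this.mention = mention;
        this.entityUri = entityUri;
    }
}

// Shared state that both engines read from and write to.
final class ContentItemSketch {
    final String text;
    final List<TextAnnotationSketch> textAnnotations = new ArrayList<>();
    final List<EntitySuggestionSketch> suggestions = new ArrayList<>();
    ContentItemSketch(String text) { this.text = text; }
}

interface EngineSketch {
    void computeEnhancements(ContentItemSketch ci);
}

// Engine 1: only extracts mentions (assumption: dictionary lookup, not openNLP).
final class NerEngineSketch implements EngineSketch {
    private final Map<String, String> dictionary; // surface form -> type
    NerEngineSketch(Map<String, String> dictionary) { this.dictionary = dictionary; }

    @Override
    public void computeEnhancements(ContentItemSketch ci) {
        for (Map.Entry<String, String> e : dictionary.entrySet()) {
            if (ci.text.contains(e.getKey())) {
                ci.textAnnotations.add(new TextAnnotationSketch(e.getKey(), e.getValue()));
            }
        }
    }
}

// Engine 2: only recommends entities for existing text annotations
// (assumption: a plain lookup table, not the autotagger).
final class LinkingEngineSketch implements EngineSketch {
    static final int MAX_SUGGESTIONS = 3;
    private final Map<String, List<String>> index; // surface form -> candidate URIs
    LinkingEngineSketch(Map<String, List<String>> index) { this.index = index; }

    @Override
    public void computeEnhancements(ContentItemSketch ci) {
        for (TextAnnotationSketch ta : ci.textAnnotations) {
            List<String> candidates = index.get(ta.selectedText);
            if (candidates == null) {
                candidates = Collections.emptyList();
            }
            int limit = Math.min(MAX_SUGGESTIONS, candidates.size());
            for (int i = 0; i < limit; i++) {
                ci.suggestions.add(new EntitySuggestionSketch(ta, candidates.get(i)));
            }
        }
    }
}
```

Running the linking engine on a content item that has no text annotations yet
simply produces no suggestions, which is why the order of the two engines
matters.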

Comment 2 by project member rupert.westenthaler, Jun 17, 2010

Current state:
There are two new Engines:
 - eu.iksproject.fise.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine:
This engine uses openNLP to perform named entity extraction. It still has a 
dependency on the configured autotagging provider, because the openNLP models 
are loaded via that bundle's context.

 - eu.iksproject.fise.engines.autotagging.impl.EntityMentionEnhancementEngine:
This engine uses the autotagger to calculate entity recommendations for 
TextAnnotations.

Both EnhancementEngines implement the ServiceProperties interface to pass on 
information about ordering. 
The NamedEntityExtractionEnhancementEngine needs to run first, because it 
produces the TextAnnotations consumed by the EntityMentionEnhancementEngine.
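
As a rough illustration of ordering via service properties, here is a
self-contained sketch. It does not use the real ServiceProperties interface:
the interface name, the property key "engine.order", the numeric values, and
the convention that a higher value runs earlier are all assumptions made for
this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical mirror of a ServiceProperties-style interface.
interface OrderedServiceSketch {
    String ORDERING_KEY = "engine.order"; // assumed key, not the real constant
    Map<String, Object> getServiceProperties();
}

// A named engine that only exposes its ordering value.
final class NamedEngineSketch implements OrderedServiceSketch {
    final String name;
    private final int order;

    NamedEngineSketch(String name, int order) {
        this.name = name;
        this.order = order;
    }

    @Override
    public Map<String, Object> getServiceProperties() {
        return Collections.<String, Object>singletonMap(ORDERING_KEY, order);
    }
}

final class EngineOrderingSketch {
    // Reads the ordering value; engines without one default to 0.
    static int orderOf(OrderedServiceSketch engine) {
        Object value = engine.getServiceProperties().get(OrderedServiceSketch.ORDERING_KEY);
        return value instanceof Integer ? (Integer) value : 0;
    }

    // Returns a new list sorted so that higher ordering values come first
    // (assumed convention for this sketch).
    static List<NamedEngineSketch> sort(List<NamedEngineSketch> engines) {
        List<NamedEngineSketch> sorted = new ArrayList<>(engines);
        Comparator<OrderedServiceSketch> byOrder =
                Comparator.comparingInt(EngineOrderingSketch::orderOf);
        sorted.sort(byOrder.reversed());
        return sorted;
    }
}
```

With this convention, giving the extraction engine a higher value than the
recommendation engine is enough for a job manager to run them in the required
order, regardless of the order in which the services were registered.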

TODOs:
 - Remove the dependency of the opennlp-ner bundle on the configured 
autotagging provider bundle
 - Split up the unit tests as well


