[
https://issues.apache.org/jira/browse/STANBOL-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796727#comment-13796727
]
Rupert Westenthaler commented on STANBOL-1013:
----------------------------------------------
This will also remove the dependency of the FstLinkingEngine on the
EntityLinkingEngine.
As part of the work on the FstLinkingEngine (STANBOL-1128) the Entity Spotting
part has been made more modular. This should further support this issue.
> Separate (Entity)Spotting and (Entity)Linking
> ---------------------------------------------
>
> Key: STANBOL-1013
> URL: https://issues.apache.org/jira/browse/STANBOL-1013
> Project: Stanbol
> Issue Type: Improvement
> Components: Enhancement Engines, Enhancer
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
>
> Currently the EntityLinking engine performs two major tasks:
> (1) Spotting: detect the words in the analyzed Text that should be linked to
> the controlled Vocabulary. For this, words are categorized as "linkable",
> "matchable" and "others". Chunks are also considered for this task.
> (2) Linking: create searches for "linkable" words while considering
> "matchable" words. Labels of suggested Entities are tokenized and matched
> against "linkable" and "matchable" words in the text. The
> EntityLinkingConfiguration is used to configure this task.
> See the documentation of the EntityLinkingEngine [1] for details.
> (1) is configured by using the TextProcessingConfiguration and implemented by
> the ProcessingState class. (2) is configured by the
> EntityLinkingConfiguration and implemented by the EntityLinker class.
> Proposed Workplan:
> =====
> 1. clean-up and improve the internal APIs used by the EntityLinking engine
> 2. define a public API for describing Entity Spotting results: Possibilities
> include
> * using the metadata of the ContentItems (e.g. fise:TextAnnotations)
> * annotations in the AnalyzedText contentpart
> * some additional ContentPart
> 3. Split up (1) and (2) into two separate EnhancementEngines:
> * (1) NlpSpottingEngine: Spots potential Entities by using NLP processing
> results
> * (2) EntityLinkingEngine: Links Entities of a Controlled Vocabulary based
> on Spotting results
> 4. Integrate Named Entity Linking into the new Spotting & Linking workflow
> * Allowing Spotters to annotate spotted Entities with additional
> metadata (e.g. the type as suggested by NER)
> * Extending the EntityLinkingEngine to make use of that metadata when
> searching/matching Entities from linked Vocabularies.
> [1]
> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
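As a rough illustration of step 3 of the work plan quoted above, the split could be sketched along the following lines. This is not Stanbol code: the names Spot, Spotter, Linker, ToySpotter and ToyLinker, the capitalization heuristic and the example.org URI are all assumptions made up for this sketch, and the toy linker ignores "matchable" words entirely. The only point is that the linker consumes spotting results and never looks at the raw text itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SpottingLinkingSketch {

    // Word categories named in the issue description.
    enum Category { LINKABLE, MATCHABLE, OTHER }

    // Hypothetical spotting result: a token, its character offsets and its category.
    static final class Spot {
        final String text;
        final int start;
        final int end;
        final Category category;

        Spot(String text, int start, int end, Category category) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.category = category;
        }
    }

    // Task (1): finds candidate words; knows nothing about the vocabulary.
    interface Spotter {
        List<Spot> spot(String text);
    }

    // Task (2): works on spotting results only; maps spots to entity URIs.
    interface Linker {
        Map<Spot, String> link(List<Spot> spots);
    }

    // Toy heuristic (an assumption, not Stanbol's logic): capitalized tokens are
    // LINKABLE, other tokens longer than three characters are MATCHABLE.
    static final class ToySpotter implements Spotter {
        @Override
        public List<Spot> spot(String text) {
            List<Spot> spots = new ArrayList<>();
            int pos = 0;
            for (String token : text.split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // leading whitespace yields one empty token
                }
                int start = text.indexOf(token, pos);
                int end = start + token.length();
                pos = end;
                Category category = Character.isUpperCase(token.charAt(0))
                        ? Category.LINKABLE
                        : token.length() > 3 ? Category.MATCHABLE : Category.OTHER;
                spots.add(new Spot(token, start, end, category));
            }
            return spots;
        }
    }

    // Toy linker: exact, case-insensitive label lookup for LINKABLE spots only.
    static final class ToyLinker implements Linker {
        private final Map<String, String> vocabulary; // lower-case label -> entity URI

        ToyLinker(Map<String, String> vocabulary) {
            this.vocabulary = vocabulary;
        }

        @Override
        public Map<Spot, String> link(List<Spot> spots) {
            Map<Spot, String> links = new LinkedHashMap<>();
            for (Spot spot : spots) {
                if (spot.category != Category.LINKABLE) {
                    continue;
                }
                String uri = vocabulary.get(spot.text.toLowerCase(Locale.ROOT));
                if (uri != null) {
                    links.put(spot, uri);
                }
            }
            return links;
        }
    }

    public static void main(String[] args) {
        Spotter spotter = new ToySpotter();
        Linker linker = new ToyLinker(Map.of("paris", "http://example.org/entity/Paris"));
        List<Spot> spots = spotter.spot("We visited Paris in spring");
        linker.link(spots).forEach((spot, uri) ->
                System.out.println(spot.text + " [" + spot.start + "," + spot.end + ") -> " + uri));
    }
}
```

With such a split, a different Spotter (for example one driven by NER results, as in step 4) could be swapped in without touching the Linker, provided both agree on the shape of the spotting result.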