Hello, I'm a bit confused about (named) entity(hub) linking / keyword linking engines:
Here's my understanding:

Named Entity Tagging Engine: This creates entity references exclusively for substrings identified by the named entity recognizer as denoting a person, people or place.

Entityhub Linking Engine: This creates entity references using the results of NLP processing. Only some lexical categories are processed; these are determined by the "Processed Languages" parameter as well as by "Link ProperNouns only".

Keyword Linking Engine: "An engine that extracts keywords present within a Controlled Vocabulary mentioned within parsed ContentItem". I assumed this would just link any matching word sequences without requiring any NLP (except word tokenization). However, the config pane says that the parameter "Min Token length" is ignored if a POS (Part of Speech) tagger is available for the language of the parsed content. So is this using NLP as well?

So these are the 3 engines I find in the configuration. Then there's also the EntityLinkingEngine according to https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking

Confusingly, https://stanbol.apache.org/docs/trunk/customvocabulary.html distinguishes between Named Entity Linking, for which it refers to the Named Entity Tagging Engine, and Keyword Linking, for which it refers not to the "Keyword Linking Engine" but to the "Entityhub linking engine" (the document has some issues: STANBOL-1075).

Some clarification, such as a comparison of the 3 or 4 engines or a glossary enumerating the terms that are used interchangeably, would be greatly appreciated.

Cheers,
Reto
