Thanks, Rupert, for this clarification. One thing still isn't clear. You say that the EntityLinking engine operates on single tokens, while named entity tagging works on phrases. What does this mean? I see that EntityLinking detects multi-word entities. What are the cases EntityLinking cannot handle?
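
To make the question concrete, here is a toy Python sketch of how the two approaches could differ. It is not Stanbol code: `VOCAB`, `link_phrases` and `link_tokens` are made-up names, and the longest-match lookup starting at each token is only an assumption about what "operates on single tokens" might mean.

```python
# Toy vocabulary: lower-cased label -> entity id (hypothetical data).
VOCAB = {
    "new york": "ex:New_York",
    "paris": "ex:Paris",
    "apache stanbol": "ex:Apache_Stanbol",
}


def link_phrases(phrases):
    """Phrase-based: look up only the spans a NER step already marked."""
    return {p: VOCAB[p.lower()] for p in phrases if p.lower() in VOCAB}


def link_tokens(tokens, max_len=3):
    """Token-based: start at each token and try the longest label match."""
    links = {}
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink towards one token.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate.lower() in VOCAB:
                links[candidate] = VOCAB[candidate.lower()]
                i += n - 1  # skip the tokens consumed by this match
                break
        i += 1
    return links


tokens = "I used Apache Stanbol in New York".split()
ner_phrases = ["New York"]  # assume NER only found the place name

phrase_links = link_phrases(ner_phrases)
token_links = link_tokens(tokens)
```

In this sketch the token-based variant still finds multi-word labels such as "Apache Stanbol", while the phrase-based variant only links what the NER step has already marked.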

Cheers,
Reto

On Mon, May 20, 2013 at 2:05 PM, Rupert Westenthaler <[email protected]> wrote:

> On Mon, May 20, 2013 at 12:34 PM, Reto Bachmann-Gmür <[email protected]> wrote:
> > Named Entity Tagging Engine: This creates entity references exclusively for
> > substrings identified to denote a person, people or place by the named entity
> > recognizer.
>
> Correct. This Engine can use type restrictions based on the types
> detected by NER when linking against the Vocabularies. In addition it
> also searches for Entities matching the "phrase" detected as Named
> Entities. The EntityLinking engine operates on single Tokens.
>
> > Entityhub Linking Engine: This creates the entity references using the
> > results of NLP processing. Only some lexical categories are processed,
> > these are determined by the parameter in "Processed Languages" as well as
> > with the "Link ProperNouns only".
>
> The Entityhub Linking Engine is a configuration of the
> EntityLinkingEngine that uses the Entityhub to search for Entities in
> the controlled vocabulary. It does not implement any linking
> functionality itself.
>
> > Keyword Linking Engine: "An engine that extracts keywords present within a
> > Controlled Vocabulary mentioned within parsed ContentItem". I assumed this
> > would just link any matching word sequences without requiring any NLP
> > (except word tokenization). However the config pane says that the parameter
> > "Min Token length" is ignored in case a POS (Part of Speech) tagger is
> > available for the language of the parsed content. So is this using NLP as
> > well?
>
> This engine is deprecated. It's the predecessor of the Entity Linking
> Engine.
>
> > So these are the 3 Engines I find in the configuration. Then there's also
> > the EntityLinkingEngine according to
> > https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>
> This implements the Entity Linking process. To use it one needs to
> provide implementations of the extension points (EntitySearcher and
> LabelTokenizer).
>
> > Confusingly https://stanbol.apache.org/docs/trunk/customvocabulary.html
> > distinguishes between Named Entity Linking for which it refers to the
> > Named Entity Tagging Engine and Keyword Linking for which it doesn't refer
> > to the "Keyword Linking Engine" but to "Entityhub linking engine" (the
> > document has some issues: STANBOL-1075).
>
> "Keyword Linking" should no longer be used. "Named Entity Linking" and
> "Entity Linking" are the preferred terms.
>
> You are right. The "Working with Custom Vocabularies" does have some
> inconsistencies in the last part. "2. Keyword Linking" should be "2.
> Entity Linking" and also the 2nd heading "Configuring Named Entity
> Linking" should read "Configuring Entity Linking" instead.
>
> best
> Rupert
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11             ++43-699-11108907
> | A-5500 Bischofshofen
