Author: rwesten
Date: Tue Nov 19 14:21:38 2013
New Revision: 1543437
URL: http://svn.apache.org/r1543437
Log:
STANBOL-1211: added documentation for the Minimum Chunk Match Score option
added by this issue
Modified:
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.mdtext
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.mdtext
Modified:
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.mdtext
URL:
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.mdtext?rev=1543437&r1=1543436&r2=1543437&view=diff
==============================================================================
---
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.mdtext
(original)
+++
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.mdtext
Tue Nov 19 14:21:38 2013
@@ -2,7 +2,7 @@ Title: The Entityhub Linking Engine: Lin
The EntityhubLinkingEngine is the successor of the
[KeywordLinkingEngine](keywordlinkingengine). It is based on the
[EntityLinkingEngine](entitylinking) configured with an
[EntitySearcher](entitylinking#entitysearcher) that can link Entities managed
by the Entityhub, ReferencedSites, or ManagedSites. The
EntityhubLinkingEngine does not implement the [EnhancementEngine](index)
interface itself. It only configures an instance of the
[EntityLinkingEngine](entitylinking).
-For a detailed documentation of the linking process please see the
documentation of the [EntityLinkingEngine](entitylinkingengine). This document
only focuses on the configuration and the usage of this Engine.
+For a detailed documentation of the linking process please see the
documentation of the [EntityLinkingEngine](entitylinking). This document only
focuses on the configuration and the usage of this Engine.
## Configuration
@@ -16,7 +16,7 @@ Next it allows to configure the used Ent
* __Referenced Site__ _(enhancer.engines.linking.entityhub.siteId)_: The name
of the ReferencedSite of the Stanbol Entityhub that holds the controlled
vocabulary to be used for extracting Entities. "entityhub" or "local" can be
used to extract Entities managed directly by the Entityhub.
-Finally it supports all configuration options supported by the
[EntityLinkingEngine](entitylinkingengine).
+Finally it supports all configuration options supported by the
[EntityLinkingEngine](entitylinking).
* [Text Processing
Configuration](entitylinking#text-processing-configuration): This defines what
languages are enabled and is also used to configure how NLP processing results
are used by the Engine
* [Entity Linking Configuration](entitylinking#entity-linker-configuration):
This defines how entities are searched in the vocabulary and how search results
are matched with the text. It also allows configuring the 'dc:type's of created
'fise:TextAnnotation's and whether entity information is included in the
enhancement results.
Modified:
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.mdtext
URL:
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.mdtext?rev=1543437&r1=1543436&r2=1543437&view=diff
==============================================================================
---
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.mdtext
(original)
+++
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.mdtext
Tue Nov 19 14:21:38 2013
@@ -202,6 +202,7 @@ The following properties define how Link
The parameters below are used to configure the matching process.
+* __Minimum Chunk Match Score__
_(enhancer.engines.linking.minChunkMatchScore)_: If the mention of an Entity is
within a Chunk (e.g. a Noun Phrase), this specifies the minimum fraction of
Tokens the detected Entity must match to be accepted. Only matchable tokens of
phrases are counted (e.g. for `lovely Julia Roberts` only `Julia Roberts`
would count, because `lovely` is an adjective). By default this is set to
`0.51`, so an Entity with the label `Julia` would not be accepted: it matches
only one of the two matchable tokens (`0.5`). _NOTE:_ This only considers
'processable' chunks, so it also depends on the _pc_ parameter of
the Language Processing configuration. This feature was introduced with
[STANBOL-1211](https://issues.apache.org/jira/browse/STANBOL-1211).
* __Minimum Token Match Score__ _(enhancer.engines.linking.minTokenScore)_:
This defines how well single tokens of the text need to match single tokens in
the label to be considered as matching. This parameter configures
the lower limit. However, the actual token match score also influences the
overall matching score of a label with the text, so non-exact matches will
decrease the matching score for the whole label.
* __Min Label Score__ _(enhancer.engines.linking.minLabelScore)_
[0..1]::double: The "Label Score" [0..1] represents how much of the Label of an
Entity matches with the Text. It compares the number of Tokens of the Label
with the number of Tokens matched to the Text. Non-exact matches for Tokens, or
Tokens that appear in the label in a different order than in the text, also
reduce this score. Entities are only considered if at least one of their
labels scores higher than the minimum for all three of _Min Label Score_, _Min
Text Match Score_ and _Min Match Score_.
* __Min Matched Tokens__ _(enhancer.engines.linking.minFoundTokens)_
[1..*]::int: The minimum number of matching tokens. Only "matchable" tokens are
counted. For full matches (where all tokens of the Label do match tokens in the
text) this parameter is ignored.
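
The Minimum Chunk Match Score rule described above can be illustrated with a small sketch. This is not the Stanbol implementation (which is written in Java); the `Token` class and the functions `chunk_match_score` and `accept` are hypothetical names chosen for illustration, and the sketch assumes the score is simply the share of matchable chunk tokens covered by the matched label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A token of a chunk; `matchable` is False for e.g. adjectives (assumption)."""
    text: str
    matchable: bool


def chunk_match_score(chunk, matched_texts):
    """Fraction of the chunk's matchable tokens that the entity label matched."""
    matchable = [t for t in chunk if t.matchable]
    if not matchable:
        return 0.0
    matched = sum(1 for t in matchable if t.text in matched_texts)
    return matched / len(matchable)


def accept(chunk, matched_texts, min_chunk_match_score=0.51):
    """Accept the entity only if it reaches the configured minimum score."""
    return chunk_match_score(chunk, matched_texts) >= min_chunk_match_score


# The noun phrase "lovely Julia Roberts": only "Julia" and "Roberts" count.
chunk = [Token("lovely", False), Token("Julia", True), Token("Roberts", True)]

print(chunk_match_score(chunk, {"Julia"}))   # 0.5, below the default 0.51
print(accept(chunk, {"Julia"}))              # False
print(accept(chunk, {"Julia", "Roberts"}))   # True
```

With the default of `0.51`, a label must cover more than half of the matchable tokens of a chunk, which is why `Julia` alone is rejected within `lovely Julia Roberts`.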