[
https://issues.apache.org/jira/browse/STANBOL-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rupert Westenthaler resolved STANBOL-686.
-----------------------------------------
Resolution: Fixed
Assignee: Rupert Westenthaler
Fixed with revision 1360296.
> Make the "Minimum Token Match Factor" configurable for the
> KeywordLinkingEngine
> -------------------------------------------------------------------------------
>
> Key: STANBOL-686
> URL: https://issues.apache.org/jira/browse/STANBOL-686
> Project: Stanbol
> Issue Type: Improvement
> Components: Engine - KeywordExtraction
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor
>
> If a Token of the text is compared with a Token in the Label of an Entity, the
> similarity of the two is expressed as a value in the range [0..1]. This factor
> specifies the minimum similarity two Tokens must have to be considered a match.
> Lower values allow more Tokens to match (e.g. inflected forms of words)
> but may also result in false positives. Regardless of the configured value,
> the similarity influences the confidence of suggestions.
> BTW: currently the similarity is calculated by dividing the length of the
> longest matching section of the two tokens by the length of the longer of the
> two tokens.
> e.g. Austrian <-> Austria
> match: Austria -> 7
> max length: 8
> similarity: 7/8=0.875
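
The calculation described above can be sketched as follows. This is only an illustration, not the KeywordLinkingEngine code: it assumes that "longest matching section" means the longest common contiguous substring and that the comparison is case-sensitive. The function names and the threshold value 0.7 are hypothetical.

```python
def longest_common_substring_length(a: str, b: str) -> int:
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    # prev[j] holds the length of the common run ending at the previous
    # character of a and at b[j-1].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def token_similarity(text_token: str, label_token: str) -> float:
    """Longest matching section divided by the length of the longer token."""
    longer = max(len(text_token), len(label_token))
    if longer == 0:
        return 0.0  # assumption: two empty tokens never match
    return longest_common_substring_length(text_token, label_token) / longer


def tokens_match(text_token: str, label_token: str, min_match_factor: float) -> bool:
    """True if the similarity reaches the configured minimum (hypothetical helper)."""
    return token_similarity(text_token, label_token) >= min_match_factor


print(token_similarity("Austrian", "Austria"))   # 0.875
print(tokens_match("Austrian", "Austria", 0.7))  # True
print(tokens_match("Austrian", "Austria", 0.9))  # False
```

For "Austrian" and "Austria" the sketch yields 7/8 = 0.875, because "Austria" has seven characters and "Austrian" has eight. Raising the minimum factor above that value would stop the inflected form from matching.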