[ 
https://issues.apache.org/jira/browse/STANBOL-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rupert Westenthaler resolved STANBOL-686.
-----------------------------------------

    Resolution: Fixed
      Assignee: Rupert Westenthaler

Fixed with revision 1360296.
                
> Make the "Minimum Token Match Factor" configurable for the 
> KeywordLinkingEngine
> -------------------------------------------------------------------------------
>
>                 Key: STANBOL-686
>                 URL: https://issues.apache.org/jira/browse/STANBOL-686
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Engine - KeywordExtraction
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>
> When a Token of the text is compared with a Token in the Label of an Entity, 
> their similarity is expressed as a value in the range [0..1]. This factor 
> specifies the minimum similarity two Tokens must have to be considered a 
> match. Lower values allow more Tokens to match (e.g. inflected forms of 
> words) but may also produce false positives. Regardless of the configured 
> value, the similarity influences the confidence of suggestions.
> BTW: currently the similarity is calculated by dividing the length of the 
> longest matching section of the two tokens by the length of the longer of 
> the two tokens.
> e.g. Austrian <-> Austria
> match: Austria -> 7
> max length: 8
> similarity: 7/8=0.875
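
The calculation described above can be sketched roughly as follows. This is
only an illustration, not the KeywordLinkingEngine code: it assumes that the
"longest matching section" means the longest common contiguous substring, and
the names token_similarity, tokens_match and min_token_match_factor (as well
as the 0.7 default) are hypothetical. Counting characters, "Austria" is 7
long and "Austrian" is 8, so this sketch yields 7/8 = 0.875 for that pair.

```python
from difflib import SequenceMatcher


def token_similarity(text_token: str, label_token: str) -> float:
    """Length of the longest common substring divided by the longer token's length.

    Assumption: "longest matching section" is read as the longest common
    contiguous substring; the real engine may define it differently.
    """
    if not text_token or not label_token:
        return 0.0
    matcher = SequenceMatcher(None, text_token, label_token, autojunk=False)
    match = matcher.find_longest_match(0, len(text_token), 0, len(label_token))
    return match.size / max(len(text_token), len(label_token))


def tokens_match(text_token: str, label_token: str,
                 min_token_match_factor: float = 0.7) -> bool:
    """Two tokens match if their similarity reaches the configured minimum.

    The parameter name and its default value are hypothetical.
    """
    return token_similarity(text_token, label_token) >= min_token_match_factor


if __name__ == "__main__":
    print(token_similarity("Austrian", "Austria"))   # 7 / 8
    print(tokens_match("Austrian", "Austria", 0.8))  # above the threshold
    print(tokens_match("Austrian", "Austria", 0.9))  # below the threshold
```

With a lower factor the inflected form "Austrian" is accepted as a match for
"Austria"; raising the factor above the pair's similarity rejects it, which is
the trade-off between recall and false positives the issue describes.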

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira