[ 
https://issues.apache.org/jira/browse/STANBOL-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rupert Westenthaler resolved STANBOL-1252.
------------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.12.0

Implemented with http://svn.apache.org/r1557037 in trunk and merged back to 
0.12 with http://svn.apache.org/r1557044

> Add support for MIN_FOUND_TOKENS to the Lucene FST Linking Engine
> -----------------------------------------------------------------
>
>                 Key: STANBOL-1252
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1252
>             Project: Stanbol
>          Issue Type: Improvement
>    Affects Versions: 0.12.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>             Fix For: 0.12.0
>
>
> The FST linking engine already allows configuring, as a percentage, how 
> much of a processable chunk (typically a noun phrase) needs to match for a 
> suggestion to be accepted. This is done with the 
> "enhancer.engines.linking.minChunkMatchScore" property. The default is > 50%.
> While this kind of configuration works well for chunks created from 
> NamedEntityAnnotations, it is not always well suited to detected noun 
> phrases, as those may cover larger sections of a sentence. E.g. "goalie 
> Mathias Lange (Iserlohn Roosters)" will not match any Entity in a 
> vocabulary: it contains 5 matchable tokens, but the player "Mathias Lange" 
> and the team name "Iserlohn Roosters" each cover only two of them (2 of 5 = 
> 40%, which is below the default threshold).
> In such cases, configuring a fixed lower limit on the number of (matchable) 
> tokens that need to match within a chunk can be preferable.
> For this configuration the FST linking engine will use the "Min Matched 
> Tokens (enhancer.engines.linking.minFoundTokens)" property of the 
> EntityLinker configuration. The default will be "2".
> The FST linking engine will accept suggestions that satisfy either 
> "enhancer.engines.linking.minChunkMatchScore" or 
> "enhancer.engines.linking.minFoundTokens".
> NOTE: these settings apply only to tokens within a processable chunk 
> (typically a noun phrase).
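
The acceptance rule described above can be sketched roughly as follows. This 
is an illustration only, not the engine's actual (Java) code: the function 
name `accepts` and its parameter names are hypothetical, and the strict 
"greater than" comparison is an assumption based on the stated default of 
"> 50%".

```python
def accepts(matched_tokens, matchable_tokens,
            min_chunk_match_score=0.5, min_found_tokens=2):
    """Return True if a suggestion would pass either of the two limits.

    matched_tokens   -- matchable tokens of the chunk covered by the suggestion
    matchable_tokens -- all matchable tokens in the processable chunk
    The defaults mirror the values named in the issue (> 50% and 2).
    """
    if matchable_tokens <= 0:
        # Nothing in the chunk can be matched, so nothing can be accepted.
        return False
    score = matched_tokens / matchable_tokens
    # Assumption: the percentage limit is exclusive, the token limit inclusive.
    passes_score = score > min_chunk_match_score
    passes_count = matched_tokens >= min_found_tokens
    return passes_score or passes_count


# "goalie Mathias Lange (Iserlohn Roosters)": 5 matchable tokens,
# "Mathias Lange" covers 2 of them (score 0.4).
print(accepts(2, 5))                      # accepted via the token-count limit
print(accepts(2, 5, min_found_tokens=6))  # rejected: 0.4 is not > 0.5
```

With the default of 2 for minFoundTokens, the two-token match is accepted 
even though it covers only 40% of the chunk; with the count limit raised out 
of reach, only the percentage check remains and the same match is rejected.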



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
