Rupert Westenthaler created STANBOL-1282:
--------------------------------------------

             Summary: classify Pos "Gerund" as matchable for EntityLinking
                 Key: STANBOL-1282
                 URL: https://issues.apache.org/jira/browse/STANBOL-1282
             Project: Stanbol
          Issue Type: Improvement
          Components: Enhancement Engines
    Affects Versions: 0.12.0, 1.0.0
            Reporter: Rupert Westenthaler
            Assignee: Rupert Westenthaler
            Priority: Minor
             Fix For: 1.0.0, 0.12.1


While gerunds [1] are verbs, POS taggers often apply this tag to other words 
ending in "-ing" as well.

A typical example is "living", which can be used as a verb but also as a noun, 
such as in the sentence "A report about living conditions in South Africa".

Now assume we have "living conditions" in a vocabulary and all nouns are 
linked (a typical configuration for linking against thesauri). "living 
conditions" would not be found, because "living", tagged as Gerund (verb), is 
not considered for matching, and "conditions" alone will not match "living 
conditions". Adding Gerund as a matchable Pos will cause the linking engine to 
correctly link "living conditions", as the linkable token "conditions" will 
trigger a lookup and "living" will be considered for matching.

Genuinely verbal uses of words tagged as Gerund will not cause lookups, as they 
will not appear together with a linkable token, and classifying them as 
matchable does not by itself trigger vocabulary lookups.
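
The linkable/matchable distinction described above can be illustrated with a 
small toy model. This is only a sketch and not the actual EntityLinking engine 
code: the function name link, the tag names "NOUN", "GERUND" and "OTHER", the 
one-entry vocabulary and the window-based matching are all assumptions made up 
for illustration. Only linkable tokens trigger a lookup; matchable tokens next 
to them are merely allowed to take part in the match.

```python
# Toy model of linkable vs. matchable tokens. NOT Stanbol's implementation;
# all names and the matching strategy are hypothetical.

LINKABLE = {"NOUN"}  # POS classes that trigger a vocabulary lookup


def link(tokens, vocabulary, matchable):
    """Return (matched labels, number of lookups) for (word, pos) tokens."""
    matches, lookups = [], 0
    for i, (_, pos) in enumerate(tokens):
        if pos not in LINKABLE:
            continue  # matchable-only tokens never start a lookup
        lookups += 1
        # Grow a window around the linkable token over matchable neighbours.
        start = i
        while start > 0 and tokens[start - 1][1] in matchable:
            start -= 1
        end = i + 1
        while end < len(tokens) and tokens[end][1] in matchable:
            end += 1
        # Try every span inside the window that contains token i, longest first.
        spans = [(s, e) for s in range(start, i + 1)
                 for e in range(i + 1, end + 1)]
        spans.sort(key=lambda se: se[0] - se[1])
        for s, e in spans:
            label = " ".join(word.lower() for word, _ in tokens[s:e])
            if label in vocabulary:
                if label not in matches:
                    matches.append(label)
                break
    return matches, lookups


vocabulary = {"living conditions"}

# "living" is tagged as Gerund here, as a POS tagger might do.
sentence = [("A", "OTHER"), ("report", "NOUN"), ("about", "OTHER"),
            ("living", "GERUND"), ("conditions", "NOUN"), ("in", "OTHER"),
            ("South", "NOUN"), ("Africa", "NOUN")]

# A verbal use of "living" with no linkable token in the sentence.
verbal = [("They", "OTHER"), ("are", "OTHER"),
          ("living", "GERUND"), ("abroad", "OTHER")]

nouns_only = link(sentence, vocabulary, {"NOUN"})
with_gerund = link(sentence, vocabulary, {"NOUN", "GERUND"})
verbal_use = link(verbal, vocabulary, {"NOUN", "GERUND"})
```

In this sketch nouns_only finds nothing, with_gerund finds "living 
conditions" with the same number of lookups, and verbal_use performs no lookup 
at all, because no linkable token is present.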

To summarize:

While this is a workaround for POS taggers that tend to tag all words ending in 
"-ing" as Gerund even if they are nouns, it generally improves matching results 
without negative side effects.

[1] http://en.wikipedia.org/wiki/Gerund



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
