Rupert Westenthaler created STANBOL-1178:
--------------------------------------------
Summary: Remove 'Link Upper Case Tokens without POS tags' options
from the EntityLinker
Key: STANBOL-1178
URL: https://issues.apache.org/jira/browse/STANBOL-1178
Project: Stanbol
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
As stated in a comment of STANBOL-1049:
> As noted by Joseph M'Bimbi-Bene in
> http://markmail.org/message/erubqmhwytp7mxoa
>
> The property
>
> enhancer.engines.linking.linkOnlyUpperCaseTokensWithMissingPosTag
>
> interferes with the upper case parameter ('uc={NONE/MATCH/LINK}')
> supported by the Text Processing configuration.
>
> To avoid this it needs to be investigated if the functionality described by
> this issue can also be implemented by using the
> 'enhancer.engines.linking.minSearchTokenLength' property in combination
> with the value of the 'uc' parameter of the text processing configuration.
Because of this, the 'linkOnlyUpperCaseTokensWithMissingPosTag' option should be
removed and the existing 'uc' parameter should be changed to work similarly to
'linkOnlyUpperCaseTokensWithMissingPosTag'.
This will change the 'uc' parameter to a boolean switch. If enabled, it will
upgrade upper case tokens as follows:
* NONE -> MATCH
* MATCH -> LINK
By default the switch will be enabled for all languages other than German
(as in German all nouns are written with an initial upper case letter).
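
The proposed behaviour can be illustrated with a minimal sketch. This is not
Stanbol code: the names TokenState, uc_enabled_by_default and
promote_upper_case are hypothetical, "upper case token" is assumed to mean a
token whose first character is upper case, German is assumed to be identified
by the language code "de", and tokens that are already LINK are assumed to
stay unchanged.

```python
from enum import Enum


class TokenState(Enum):
    """Hypothetical processing states, named after the NONE/MATCH/LINK values."""
    NONE = 0
    MATCH = 1
    LINK = 2


def uc_enabled_by_default(language):
    """Assumed default: the switch is on for every language except German ("de")."""
    return language != "de"


def promote_upper_case(token, state, uc_enabled):
    """Promote an upper case token by one step: NONE -> MATCH, MATCH -> LINK."""
    # Assumption: a token counts as upper case if its first character is upper case.
    if not uc_enabled or not token[:1].isupper():
        return state
    if state == TokenState.NONE:
        return TokenState.MATCH
    if state == TokenState.MATCH:
        return TokenState.LINK
    # Assumption: tokens that are already LINK are left as they are.
    return state
```

With these assumptions, an upper case token in an English text that would
otherwise be ignored becomes matchable, while the same token in a German text
keeps its original state because the switch is off by default there.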
As this change will affect existing configurations, it will only take place
after the upcoming 0.12.0 release of Apache Stanbol.