[
https://issues.apache.org/jira/browse/STANBOL-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653650#comment-13653650
]
Rupert Westenthaler commented on STANBOL-1053:
----------------------------------------------
The solution discussed in the above comment is not feasible, because the
tokenizer would have unpredictable consequences for URIs that do contain
spaces.
As a next step I will test whether MLT requires a tokenizer to be configured
at all to correctly process the parsed Content Stream. If it does, one needs
to think of different solutions (such as URL-encoding all URLs passed to the
SolrYard).
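
To illustrate the URL-encoding idea, here is a minimal sketch. It is not
existing SolrYard code: the class UriFieldEncoder and its two methods are
hypothetical names. It only relies on java.net.URLEncoder and
java.net.URLDecoder from the JDK. Whether the encoded value survives analysis
as a single token still depends on the tokenizer configured for the field.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/**
 * Hypothetical helper (not part of Stanbol): encodes URIs before they are
 * written to an index field so that the stored value contains no whitespace.
 */
final class UriFieldEncoder {

    private static final String UTF8 = "UTF-8";

    private UriFieldEncoder() {
    }

    /** Form-encodes the URI; spaces become '+', reserved characters %XX. */
    static String encodeForIndex(String uri) {
        try {
            return URLEncoder.encode(uri, UTF8);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encodeForIndex so query results can be mapped back to URIs. */
    static String decodeFromIndex(String value) {
        try {
            return URLDecoder.decode(value, UTF8);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String uri = "http://example.org/resource/New York";
        String encoded = encodeForIndex(uri);
        System.out.println(encoded);
        System.out.println(decodeFromIndex(encoded));
    }
}
```

The same encoding would have to be applied to the values indexed and to the
values sent with the MLT request, otherwise the terms would not match.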
> Add disambiguation context fields to the default Solr schema of the Entityhub
> SolrYard
> --------------------------------------------------------------------------------------
>
> Key: STANBOL-1053
> URL: https://issues.apache.org/jira/browse/STANBOL-1053
> Project: Stanbol
> Issue Type: Sub-task
> Components: Enhancement Engines, Entityhub
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
>
> Currently the Disambiguation engine uses the full text search field for
> disambiguation. With the addition of explicit Solr fields that are configured
> for MLT queries to the default schema.xml used by the Entityhub it would be
> possible to use a more specialized field for disambiguation.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira