[jira] [Commented] (STANBOL-1053) Add disambiguation context fields to the default Solr schema of the Entityhub SolrYard

Rupert Westenthaler (JIRA) Fri, 10 May 2013 02:05:18 -0700

    [ 
https://issues.apache.org/jira/browse/STANBOL-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653650#comment-13653650
 ]


Rupert Westenthaler commented on STANBOL-1053:
----------------------------------------------

The solution discussed in the above comment is not feasible, because the 
tokenizer would have unpredictable consequences for URIs that contain spaces.

As a next step I will test whether MLT even requires a tokenizer to be 
configured in order to correctly process the passed Content Stream. If it 
does, a different solution is needed, such as URL-encoding all URLs passed to 
the SolrYard.
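
For illustration, the URL-encoding idea could look roughly like the sketch 
below. This is not the SolrYard implementation: the class and method names 
(UriFieldEncoding, encodeUri, decodeUri) and the example URI are hypothetical, 
and it assumes Java 10+ for the Charset overloads. It only shows that an 
encoded value no longer contains whitespace; whether a given Solr tokenizer 
then keeps it as a single token still depends on the analyzer configuration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Stanbol: encodes a URI before it is
// handed to the index so that the stored value contains no whitespace.
class UriFieldEncoding {

    // Form-encodes the URI; a space becomes '+', ':' and '/' become %XX.
    static String encodeUri(String uri) {
        return URLEncoder.encode(uri, StandardCharsets.UTF_8);
    }

    // Reverses encodeUri so the original URI can be restored on read.
    static String decodeUri(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String uri = "http://example.org/resource/New York"; // assumed example
        String encoded = encodeUri(uri);
        System.out.println(encoded);
        System.out.println(decodeUri(encoded).equals(uri));
    }
}
```

Any component that reads such values back would have to apply the matching 
decode step, so encoding and decoding would need to happen consistently at the 
SolrYard boundary.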
                
> Add disambiguation context fields to the default Solr schema of the Entityhub 
> SolrYard
> --------------------------------------------------------------------------------------
>
>                 Key: STANBOL-1053
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1053
>             Project: Stanbol
>          Issue Type: Sub-task
>          Components: Enhancement Engines, Entityhub
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Currently the Disambiguation engine uses the full text search field for 
> disambiguation. With the addition of explicit Solr fields that are configured 
> for MLT queries to the default schema.xml used by the Entityhub it would be 
> possible to use a more specialized field for disambiguation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
