[ 
https://issues.apache.org/jira/browse/JENA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709241#comment-16709241
 ] 

Code Ferret commented on JENA-1645:
-----------------------------------

It would be helpful to see example queries and how you have used the subject 
URI. 

I agree that the {{concreteSubject}} *should* create Lucene queries that 
include a term of the form:

{code}
... AND uri:http://example.org/data/resource/R0123
{code}
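
A rough sketch of how such a clause might be appended to an existing query string. The URI is wrapped in double quotes because characters such as {{:}} and {{/}} are special in Lucene's classic query syntax. The class and method names are hypothetical, and the field name {{uri}} is simply the one used in the example above; the actual field name depends on the entity definition.

```java
final class SubjectClauseSketch {
    private SubjectClauseSketch() {}

    /** Wraps the URI in double quotes, escaping backslashes and quotes. */
    static String quote(String uri) {
        return "\"" + uri.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Restricts an existing text query to a single subject URI. */
    static String withSubject(String textQuery, String uriField, String subjectUri) {
        return "(" + textQuery + ") AND " + uriField + ":" + quote(subjectUri);
    }

    public static void main(String[] args) {
        // prints: (label:invoice) AND uri:"http://example.org/data/resource/R0123"
        System.out.println(withSubject("label:invoice", "uri",
                "http://example.org/data/resource/R0123"));
    }
}
```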

Currently the code for {{concreteSubject}} collects results for all possible 
subjects and then, after the results are returned, selects just the ones 
corresponding to the provided {{subject}} and discards the rest. 
Quite inefficient!
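
To make the difference concrete, here is a toy model of the two strategies, assuming an in-memory map from subject URI to indexed text in place of a real Lucene index. All names are hypothetical and none of this is the actual {{TextQueryPF}} code; the counter only records how many hits the "index" hands back in each case.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConcreteSubjectSketch {
    /** Number of hits the toy index returned for the most recent search. */
    static int hitsReturnedByIndex;

    /** Stand-in for an index query: URIs whose text contains the word,
     *  optionally restricted to one subject (null means any subject). */
    static List<String> search(Map<String, String> index, String word, String subjectOrNull) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : index.entrySet()) {
            boolean subjectOk = subjectOrNull == null || subjectOrNull.equals(e.getKey());
            if (subjectOk && e.getValue().contains(word)) {
                hits.add(e.getKey());
            }
        }
        hitsReturnedByIndex = hits.size();
        return hits;
    }

    /** Behavior described above: fetch hits for all subjects, then discard. */
    static boolean matchesFilterAfter(Map<String, String> index, String word, String subject) {
        return search(index, word, null).contains(subject);
    }

    /** Proposed behavior: the subject is part of the query itself. */
    static boolean matchesConstrained(Map<String, String> index, String word, String subject) {
        return !search(index, word, subject).isEmpty();
    }

    static Map<String, String> sampleIndex() {
        Map<String, String> index = new LinkedHashMap<>();
        index.put("http://example.org/a", "quarterly report");
        index.put("http://example.org/b", "annual report");
        index.put("http://example.org/c", "meeting notes");
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> index = sampleIndex();
        matchesFilterAfter(index, "report", "http://example.org/a");
        System.out.println("filter-after hits: " + hitsReturnedByIndex); // 2
        matchesConstrained(index, "report", "http://example.org/a");
        System.out.println("constrained hits:  " + hitsReturnedByIndex); // 1
    }
}
```

Both strategies give the same answer for the subject; only the number of hits that has to be returned and then thrown away differs.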

Apart from performance, the current behavior is invisible to the user; however, 
if there is some reason to keep it, then the _new_ behavior can be 
handled by adding a {{boolean}} {{TextIndex}} option in the configuration: 
{{text:useConcreteSubject}}.

The implementation involves threading the subject into 
{{TextIndex.query(...)}}, which means adding a new query method to {{TextIndex}}, 
{{TextIndexLucene}} and {{TextIndexES}}. It should be rather straightforward.
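
One possible shape for that change, sketched with a deliberately simplified, hypothetical interface (this is not the real {{TextIndex}} signature, and the toy backend stands in for {{TextIndexLucene}} or {{TextIndexES}}). The subject-aware method has a default that falls back to filtering, so a backend only needs to override it once it can push the subject into its native query. The {{useConcreteSubject}} flag is assumed to mirror the proposed {{text:useConcreteSubject}} option.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubjectAwareQuerySketch {
    /** Hypothetical, simplified index interface (not Jena's real TextIndex). */
    interface Index {
        /** Existing style: URIs of all subjects matching the text. */
        List<String> query(String text);

        /** New method: only the given subject is considered. The default
         *  filters after the fact, so backends can adopt it gradually. */
        default List<String> query(String subjectUri, String text) {
            List<String> hits = new ArrayList<>();
            for (String uri : query(text)) {
                if (uri.equals(subjectUri)) hits.add(uri);
            }
            return hits;
        }
    }

    /** Toy backend that overrides the new method with a direct lookup. */
    static final class MapIndex implements Index {
        private final Map<String, String> textByUri = new LinkedHashMap<>();

        MapIndex add(String uri, String text) {
            textByUri.put(uri, text);
            return this;
        }

        @Override
        public List<String> query(String text) {
            List<String> hits = new ArrayList<>();
            textByUri.forEach((uri, t) -> { if (t.contains(text)) hits.add(uri); });
            return hits;
        }

        @Override
        public List<String> query(String subjectUri, String text) {
            String t = textByUri.get(subjectUri);
            return (t != null && t.contains(text)) ? List.of(subjectUri) : List.of();
        }
    }

    /** Caller side: the flag selects the old or the new code path. */
    static boolean subjectMatches(Index index, boolean useConcreteSubject,
                                  String subjectUri, String text) {
        return useConcreteSubject
                ? !index.query(subjectUri, text).isEmpty()
                : index.query(text).contains(subjectUri);
    }

    public static void main(String[] args) {
        Index index = new MapIndex()
                .add("http://example.org/a", "quarterly report")
                .add("http://example.org/b", "meeting notes");
        System.out.println(subjectMatches(index, true, "http://example.org/a", "report"));
        System.out.println(subjectMatches(index, false, "http://example.org/a", "report"));
    }
}
```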

> Poor performance with full text search (Lucene)
> -----------------------------------------------
>
>                 Key: JENA-1645
>                 URL: https://issues.apache.org/jira/browse/JENA-1645
>             Project: Apache Jena
>          Issue Type: Question
>          Components: Jena
>    Affects Versions: Jena 3.9.0
>            Reporter: Vasyl Danyliuk
>            Priority: Major
>
> Situation: half a million documents (emails, actually) indexed by Lucene; 
> searching for emails by sender/receiver and some text.
> If the text filter is placed at the start of the SPARQL query, it executes once, 
> but for very common words there are a lot of results (100,000+), which leads to 
> poor performance; limiting the result count may end up missing results.
> If the text search is placed as the last condition, it executes once for each 
> already found subject. That's completely OK, but the text search completely 
> ignores the subject URI.
> I found two methods in the TextQueryPF class: variableSubject(...) for the first 
> case, and concreteSubject(...) for the second one.
> The question is: why can't the subject URI be used as a constraint in the text 
> search?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
