[ https://issues.apache.org/jira/browse/JENA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709241#comment-16709241 ]
Code Ferret commented on JENA-1645:
-----------------------------------

It would be helpful to see example queries and how you have used the subject URI.

I agree that {{concreteSubject}} *should* create Lucene queries that include a term of the form:
{code}
... AND uri:http://example.org/data/resource/R0123
{code}
Currently, the code for {{concreteSubject}} collects results for all possible subjects and, only after the results are returned, selects the ones corresponding to the provided {{subject}} and discards the rest. Quite inefficient!

Apart from performance, this behavior is transparent to the user; however, if there is some reason to keep it, the _new_ behavior can be handled by adding a {{boolean}} {{TextIndex}} option to the configuration: {{text:useConcreteSubject}}.

The implementation involves threading the subject into {{TextIndex.query(...)}}, adding a new query method to {{TextIndex}}, {{TextIndexLucene}} and {{TextIndexES}}. It should be rather straightforward.

> Poor performance with full text search (Lucene)
> -----------------------------------------------
>
>                 Key: JENA-1645
>                 URL: https://issues.apache.org/jira/browse/JENA-1645
>             Project: Apache Jena
>          Issue Type: Question
>          Components: Jena
>    Affects Versions: Jena 3.9.0
>            Reporter: Vasyl Danyliuk
>            Priority: Major
>
> Situation: half a million documents indexed by Lucene (emails, actually), searching for emails by sender/receiver and some text.
> If the text filter is placed at the start of the SPARQL query, it executes once, but for very common words there are a lot of results (100 000+), which leads to poor performance; limiting the result count may end up with missed results.
> If the text search is placed as the last condition, it executes once for each already found subject. That's completely OK, but the text search completely ignores the subject URI.
> I found two methods in the TextQueryPF class: variableSubject(...) for the first case, and concreteSubject(...) for the second one.
> The question is: why can't the subject URI be used as a constraint in the text search?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
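
To make the difference described in the comment concrete, here is a minimal plain-Java sketch contrasting the two strategies: fetching hits for all subjects and discarding most of them afterwards, versus putting the subject into the query itself. It does not use Jena or Lucene classes. {{Doc}}, {{search}}, {{postFilter}}, {{constrained}}, {{withSubject}} and the sample data are hypothetical names invented for illustration, and quoting the URI in the query string is an assumption about how the special characters ({{:}} and {{/}}) might be protected. A real implementation would more likely build a term query or use Lucene's own escaping helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: an in-memory stand-in for the text index, not Jena code.
class SubjectConstraintSketch {

    // Hypothetical indexed document: a subject URI plus its text.
    static final class Doc {
        final String uri;
        final String text;
        Doc(String uri, String text) { this.uri = uri; this.text = text; }
    }

    // Stand-in for an index lookup. A null subjectUri means "any subject".
    static List<Doc> search(List<Doc> index, String word, String subjectUri) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : index) {
            boolean textMatch = d.text.contains(word);
            boolean subjectMatch = subjectUri == null || d.uri.equals(subjectUri);
            if (textMatch && subjectMatch) {
                hits.add(d);
            }
        }
        return hits;
    }

    // Behaviour described as current: fetch hits for every subject, then discard.
    static List<Doc> postFilter(List<Doc> index, String word, String subjectUri) {
        List<Doc> kept = new ArrayList<>();
        for (Doc d : search(index, word, null)) {
            if (d.uri.equals(subjectUri)) {
                kept.add(d);
            }
        }
        return kept;
    }

    // Behaviour described as desired: the subject is part of the lookup.
    static List<Doc> constrained(List<Doc> index, String word, String subjectUri) {
        return search(index, word, subjectUri);
    }

    // Appends a subject term to a text query string. Quoting the URI is an
    // assumption; only backslash and double quote are escaped inside the quotes.
    static String withSubject(String textQuery, String uriField, String subjectUri) {
        String escaped = subjectUri.replace("\\", "\\\\").replace("\"", "\\\"");
        return "(" + textQuery + ") AND " + uriField + ":\"" + escaped + "\"";
    }

    // Made-up sample data.
    static List<Doc> sampleIndex() {
        return Arrays.asList(
            new Doc("http://example.org/data/resource/R0123", "invoice for March"),
            new Doc("http://example.org/data/resource/R0456", "invoice overdue"),
            new Doc("http://example.org/data/resource/R0789", "invoice reminder"),
            new Doc("http://example.org/data/resource/R0999", "lunch on Friday"));
    }

    public static void main(String[] args) {
        List<Doc> index = sampleIndex();
        String subject = "http://example.org/data/resource/R0123";

        // Both strategies return the same answer; they differ in how many
        // hits the index has to hand back before the answer is known.
        System.out.println("fetched, post-filter: " + search(index, "invoice", null).size());
        System.out.println("fetched, constrained: " + search(index, "invoice", subject).size());
        System.out.println("kept, post-filter:    " + postFilter(index, "invoice", subject).size());
        System.out.println("kept, constrained:    " + constrained(index, "invoice", subject).size());
        System.out.println(withSubject("invoice", "uri", subject));
    }
}
```

With half a million documents and a common search word, the gap between the number of hits fetched and the number kept is what the reporter sees as poor performance when the text search runs once per already found subject.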