[ 
https://issues.apache.org/jira/browse/JENA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709812#comment-16709812
 ] 

Vasyl Danyliuk commented on JENA-1645:
--------------------------------------

The query is pretty straightforward:
{code:java}
PREFIX person: <http://person/>
PREFIX email: <http://email/>
PREFIX text: <http://jena.apache.org/text#>

SELECT DISTINCT ?emailId ?content
  WHERE {
    ?person1Id person:name "Person One" .
    ?person2Id person:name "Second Person" .
    {?person1Id email:sent ?emailId . ?person2Id email:received ?emailId .}
    UNION
    {?person2Id email:sent ?emailId . ?person1Id email:received ?emailId .}
    (?emailId ?score ?content) text:query (email:indexedContent "ext to search" 1 "highlight:s:<em class='hiLite'> | e:</em>") .
  }
{code}
Such cases are already covered by tests in the jena-text module.

I have created a pull request with the code changes for the Lucene index.
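
To illustrate the idea of using the subject URI as a constraint, here is a minimal sketch of how a text query could be combined with a required subject-URI clause in Lucene's classic query syntax. It is not the code from the pull request. The class SubjectConstrainedQuery, the field names "uri" and "indexedContent", and the URI http://email/42 are all hypothetical; the sketch also assumes that the subject URI is indexed as a single un-analyzed token.

```java
// Hypothetical helper, not part of jena-text: builds a Lucene classic-syntax
// query string that requires both a text match and one specific subject URI.
final class SubjectConstrainedQuery {

    // Escape backslashes and double quotes so the value can sit inside a quoted term.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // uriField is the index field assumed to hold the subject URI.
    // Both clauses are prefixed with '+', so a document must match both.
    static String build(String uriField, String subjectUri, String textQuery) {
        return "+" + uriField + ":" + quote(subjectUri) + " +(" + textQuery + ")";
    }

    public static void main(String[] args) {
        // Prints: +uri:"http://email/42" +(indexedContent:"text to search")
        System.out.println(build("uri", "http://email/42", "indexedContent:\"text to search\""));
    }
}
```

With a query of this shape, a text search executed once per already-bound subject would only have to consider the document for that subject, instead of collecting every hit for a common word and filtering afterwards.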

> Poor performance with full text search (Lucene)
> -----------------------------------------------
>
>                 Key: JENA-1645
>                 URL: https://issues.apache.org/jira/browse/JENA-1645
>             Project: Apache Jena
>          Issue Type: Question
>          Components: Jena
>    Affects Versions: Jena 3.9.0
>            Reporter: Vasyl Danyliuk
>            Priority: Major
>
> Situation: half a million documents (emails, actually) indexed by Lucene;
> searching for emails by sender/receiver and some text.
> If the text filter is placed at the start of the SPARQL query, it executes
> once, but for very common words there are a lot of results (100,000+), which
> leads to poor performance; limiting the result count may end up with missed
> results.
> If the text search is placed as the last condition, it executes once for each
> already found subject. That is completely OK, but the text search completely
> ignores the subject URI.
> I found two methods in the TextQueryPF class: variableSubject(...) for the
> first case, and concreteSubject(...) for the second one.
> The question is: why can't the subject URI be used as a constraint in the
> text search?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
