Hi Brian,

> On Sep 1, 2019, at 7:17 AM, Brian McBride <brian.mcbr...@epimorphics.com> 
> wrote:
> 
> It used to be the case that JenaText supported querying of a Lucene text 
> index where the index was created independently of Jena and then made 
> available to JenaText via the dataset configuration.  Is this still the case?

That should still be the case, with the proviso that currently the field names 
must be handled via RDF properties outside the query string.

As you noted, it has been documented since 3.6.0 
<https://jena.apache.org/documentation/query/text-query.html> that:

> No explicit use of Fields within the query string is supported.

This is based on the assumption that each indexed document contains only a 
single property field, and hence that a query involves only a single field 
corresponding to an RDF property. Evidently that was a poor assumption, not 
caught until now.


> Up until Jena 3.9.0 definitely, and I suspect 3.12.0 - I have not confirmed 
> this yet, it was possible to express text queries with field names and they 
> worked.

You’re correct: a change introduced 
<https://github.com/apache/jena/blob/519c129ab2dfcb5eb43f1a337c618a8e69f88acd/jena-text/src/main/java/org/apache/jena/query/text/TextIndexLucene.java#L744>
 in the 3.13.0 code breaks the previous behavior. I’m not able to explore 
fixing this for the next three weeks, but may take a look at “fixing” this then. 
The basic change would be to replace the referenced line with:

    qstring = qs;

and that should be it. The results handling (in simpleResults 
<https://github.com/apache/jena/blob/519c129ab2dfcb5eb43f1a337c618a8e69f88acd/jena-text/src/main/java/org/apache/jena/query/text/TextIndexLucene.java#L562>
 and highlightResults 
<https://github.com/apache/jena/blob/519c129ab2dfcb5eb43f1a337c618a8e69f88acd/jena-text/src/main/java/org/apache/jena/query/text/TextIndexLucene.java#L668>)
 should need no changes, since in Lucene:

    doc.get(null) 

just returns null, which is already handled. Evidently your application doesn’t 
use the

     (?s ?score ?lit) text:query … 

form; since there’s no information about which fields have been used in the 
query string, no bindings for ?lit can be made.
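
To illustrate that last point, here is a minimal sketch in plain Java. It is 
not the Jena code: the class and method names are made up for this example, 
and a HashMap stands in for a Lucene document (field name to stored value). It 
only shows the shape of the argument, namely that a lookup with no known field 
yields null, so there is nothing to bind to ?lit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in Jena or Lucene.
public class LiteralBindingSketch {

    // 'doc' is a stand-in for a Lucene document: field name -> stored value.
    // 'field' is null when the field names live inside the query string and
    // are therefore unknown to the results handling.
    static Optional<String> literalFor(Map<String, String> doc, String field) {
        // HashMap.get(null) returns null when no null key is present, so an
        // unknown field simply produces an empty result (no ?lit binding).
        return Optional.ofNullable(doc.get(field));
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("label", "Apache Jena");

        // Field known via an RDF property: a literal can be bound.
        System.out.println(literalFor(doc, "label").isPresent());

        // Field unknown (null): nothing to bind, but nothing breaks either.
        System.out.println(literalFor(doc, null).isPresent());
    }
}
```

The null case is handled by returning an empty result rather than failing, 
which is the behavior I'd expect the existing results handling to keep.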

> We needed an index where multiple properties of the same resource were 
> indexed as a single document.  I would be happy to discuss this further - why 
> the solution indicated in the JenaText documentation didn't work for us and 
> whether there is way to construct a general purpose JenaText solution that 
> would. 


More explanation would be interesting.

Sorry for the inconvenience,
Chris
