Hi Balduin,
We use jena-text for some projects at work and have some familiarity with
it, so I hope this is helpful.
I think a partial workaround could be to set the analyzer per field, as
described in the docs, so that different fields use different analyzers.
Some properties then use an exact match, while other fields use the default
analyzer. If the predicates can be grouped by analyzer, this should work,
and you can optionally group the queries by using property lists.
Something like this (pseudo config):
....
<#indexLucene> a text:TextIndexLucene ;
    ...
    text:defineAnalyzers (
        [ text:defineAnalyzer :whitespaceLowerCaseAsciiFoldingNGram ;
          text:analyzer [
              a text:ConfigurableAnalyzer ;
              text:tokenizer text:WhitespaceTokenizer ;
              text:filters ( text:ASCIIFoldingFilter text:LowerCaseFilter )
          ] ]
        . . . ) .

<#entMap> a text:EntityMap ;
    text:entityField "uri" ;
    text:defaultField "label" ;
    text:map (
        # these two fields use the analyzer defined above
        [ text:field "label" ; text:predicate rdfs:label ;
          text:analyzer [
              a text:DefinedAnalyzer ;
              text:useAnalyzer :whitespaceLowerCaseAsciiFoldingNGram ]
        ]
        [ text:field "identifier" ; text:predicate dct:identifier ;
          text:analyzer [
              a text:DefinedAnalyzer ;
              text:useAnalyzer :whitespaceLowerCaseAsciiFoldingNGram ]
        ]
        # uses default
        [ text:field "prefLabel" ; text:predicate skos:prefLabel ]
    )
...
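The optional grouping by property lists could look roughly like the sketch
below. It assumes the text:propLists syntax from the jena-text docs; the ex:
namespace, the two list names and the index directory are made up for
illustration and are not part of your setup.

```turtle
@prefix text: <http://jena.apache.org/text#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/textprops#> .   # hypothetical namespace

<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:lucene-index> ;           # assumed index location
    text:entityMap <#entMap> ;
    text:propLists (
        # predicates whose fields use the whitespace/ASCII-folding analyzer
        [ text:propListProp ex:foldedFields ;
          text:props ( rdfs:label dct:identifier ) ]
        # predicates whose fields use the default analyzer
        [ text:propListProp ex:defaultFields ;
          text:props ( skos:prefLabel ) ]
    ) .
```

A query would then pick a group by its list property, e.g.
?s text:query ( ex:foldedFields "test" ), rather than naming each predicate.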
If you want more fine-grained control, like different analyzers on the same
field in different contexts, you could try adding the property that needs
multiple analyzers several times, each time with a different field name and
a different analyzer.
E.g. [ text:field "label1" ; text:predicate rdfs:label ; ... ]
[ text:field "label2" ; text:predicate rdfs:label ; ... ]
In the best case it queries both fields, but I don't think you can choose
which one to use, since text:query expects a predicate (which may end up
being translated into a query on both fields). I'm not sure what will happen
at indexing and query time with such a setup.
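Spelled out, such an entity map might look like the sketch below. This is
untested: I don't know whether jena-text accepts the same predicate under
two field names, or which field a query would then hit. The : prefix IRI is
a placeholder, and the analyzer name refers to one defined elsewhere with
text:defineAnalyzers.

```turtle
@prefix text: <http://jena.apache.org/text#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://example.org/analyzers#> .   # placeholder namespace

<#entMap> a text:EntityMap ;
    text:entityField "uri" ;
    text:defaultField "label1" ;
    text:map (
        # rdfs:label indexed with the whitespace/ASCII-folding analyzer
        [ text:field "label1" ; text:predicate rdfs:label ;
          text:analyzer [ a text:DefinedAnalyzer ;
                          text:useAnalyzer :whitespaceLowerCaseAsciiFoldingNGram ] ]
        # the same predicate again under a second field name,
        # this time with Lucene's standard analyzer
        [ text:field "label2" ; text:predicate rdfs:label ;
          text:analyzer [ a text:StandardAnalyzer ] ]
    ) .
```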
Maybe a feature request for Jena could be to allow querying fields by the
name they have in the index, so that you can index a property multiple
times under different field names, using different analyzers.
Since jena-text currently queries using predicates (which are expanded to
fields), a reserved namespace combined with the field name could be an
addition to how it currently works.
A suggestion could be something like text:query ( X "test" ), where X is
text:fieldName, text:fieldFieldName or textProp:fieldName (textProp being a
new namespace for text field properties).
Best regards,
Øyvind Gjesdal
On Fri, Dec 12, 2025 at 4:23 PM Balduin Landolt <[email protected]>
wrote:
> Hi list,
>
> we're having some "issues" with how our Fuseki-backed application does
> full-text search:
> We have configured the lucene index to use a configurable analyzer
> ( [ a text:ConfigurableAnalyzer ;
> text:tokenizer text:WhitespaceTokenizer ;
> text:filters ( text:ASCIIFoldingFilter text:LowerCaseFilter)
> ] . )
> because in some contexts we really need the whitespace tokenizer and ASCII
> folding.
> However, in other contexts, this provides really unsatisfying search
> behaviour; and we'd actually be much happier with the default analyzer.
>
> I've had a look at
> https://jena.apache.org/documentation/query/text-query.html - am I
> understanding correctly that it is not possible to have multiple Lucene
> indices on a single dataset within Fuseki?
> Is there maybe a strategy I'm missing, which might solve my problem, while
> not involving maintaining a separate search index, and having to keep my DB
> in sync with said index?
>
> Thanks in advance!
>
> Best,
> Balduin
>
> --
> *Balduin Landolt *| He / him | Lead Software Engineer
>
> Universität Basel | DaSCH - Swiss National Data & Service Center for the
> Humanities
> Kornhausgasse 7 | CH-4051 Basel | Switzerland |
>