Github user xristy commented on the issue:
https://github.com/apache/jena/pull/436
@kinow I think the configuration and results reflect the `text:searchFor`
functionality; however, in the analyzer definition for the tag `jp`:
    text:analyzer [
        a text:GenericAnalyzer ;
        text:class "org.apache.lucene.analysis.ja.JapaneseAnalyzer" ;
        text:tokenizer <#tokenizer> ;
    ]
the `text:tokenizer <#tokenizer> ;` line has no effect. Tokenizer specs work
with `text:ConfigurableAnalyzer` and are ignored in `text:GenericAnalyzer`.
Perhaps a warning should be logged, but that would mean checking for the
presence of unsupported predicates?
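For comparison, here is a minimal sketch of a definition in which the tokenizer reference would be picked up. It assumes `<#tokenizer>` is defined elsewhere in the assembler file, and the `text:LowerCaseFilter` entry is only an illustrative choice of filter, not something taken from the configuration under review. Note that this no longer uses `JapaneseAnalyzer`, so it is not a drop-in replacement for the `jp` definition:

```turtle
# Sketch only: assumes <#tokenizer> is defined elsewhere in this assembler file.
text:analyzer [
    a text:ConfigurableAnalyzer ;
    text:tokenizer <#tokenizer> ;            # honored here, unlike in text:GenericAnalyzer
    text:filters ( text:LowerCaseFilter )    # illustrative filter choice (assumption)
]
```

If the intent is to keep `JapaneseAnalyzer`, the simpler option is to drop the `text:tokenizer` line from the `text:GenericAnalyzer` definition.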
Re:
> the complexity put on TextIndexLucene. A few methods are getting a
> boolean flag to change their behaviour. And when that happens too much,
> sometimes it may feel like the method has two behaviours, and writing tests or
> changing it may be challenging. Maybe it could extend it in some other way.
I'm not sure how to improve this. The flag in `highlightResults` affects
the value of `effectiveField` in the context of a larger method, and the
flag in `getQueryAnalyzer` determines whether any useful work is done at all. I
factored that out as a method rather than leaving it inline in `query$` to
reduce the clutter in that principal routine.
Re:
> it's not a batteries-included feature, if I understand correctly. You
> still need to prepare the other part of the solution, be it a tokenizer that
> gets a value such as "kinou", then searches some dictionary, and finally create
> tokens for :ex3 dc:title "昨日" and "きのう", or change the data a bit.
> Maybe this could be a separate project, or an extension of sorts.
I'm not sure what you are recommending here. The `text:searchFor` and
`text:auxIndex` functionalities are ways of configuring the _application_ of
appropriate analyzers that have been defined separately. So yes, the features
are not self-contained, in that the analyzers do have to be supplied.
---