[ https://issues.apache.org/jira/browse/JENA-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620630#comment-15620630 ]
Osma Suominen commented on JENA-1250:
-------------------------------------
I took a look at the test failure with 6.2.1. It seems that
AnalyzingQueryParser functionality has changed in Lucene 6.2.0/6.2.1 in a way
that breaks some of the jena-text custom analyzers. This was done in
LUCENE-7355 and this commit:
https://github.com/apache/lucene-solr/commit/7c2e7a0fb80a5bf733cf710aee6cbf01d02629eb
We can already upgrade to 6.1.0; you just need to set the VER setting in
TextIndexLucene to a version that it knows about (e.g. 6_1_0 works fine).
I think that our code could be made compatible with Lucene 6.2.1 by adding
normalize methods to our custom analyzers, similar to how all the analyzers in
Lucene were patched in LUCENE-7355.
Regarding your comment about encoding language in the field name: this isn't
really about storing the language in the field name (the lang field is still
used for that), but about keeping the different language values - that have
been analyzed differently - in separate fields so it is impossible to mix them
up by mistake.
Personally I'm not super happy about the way I had to [patch
in|https://github.com/jmvanel/jena/commit/072afffbf24d68575e1b9cf886a5998739cb5ca9#diff-48122cd8c64b1e120b37581f50fc55ceR330]
the language tag to the field name in TextIndexLucene.parseQuery but currently
query strings with field names are generated outside TextIndexLucene (in
TextQueryPF) so it's almost too late to switch the field name from within
TextIndexLucene. The replaceFirst hack works but there may be edge cases that
are broken; in any case it's very hackish. I may try to come up with a
different solution but it would involve shifting responsibilities somehow
between those two classes.
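To make the concern about edge cases concrete, here is a minimal sketch of what a replaceFirst-style rewrite does to a query string. It is not the code from the linked commit: the class name, the method names, the "_" separator in the localized field name and the sample queries are all assumptions made for illustration. Only String.replaceFirst, Pattern.quote and Matcher.quoteReplacement are real JDK calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; none of these names come from jena-text.
final class LangFieldRewriteSketch {

    // Hypothetical naming scheme: "label" + "en" -> "label_en".
    static String localizedField(String field, String lang) {
        return field + "_" + lang;
    }

    // Naive rewrite in the spirit of the replaceFirst hack: swap the first
    // "field:" occurrence found anywhere in the query string.
    static String rewriteNaive(String query, String field, String lang) {
        return query.replaceFirst(
                Pattern.quote(field + ":"),
                Matcher.quoteReplacement(localizedField(field, lang) + ":"));
    }

    // Stricter variant: rewrite only when the query starts with "field:".
    // This avoids touching quoted text but misses fields later in the query.
    static String rewriteAnchored(String query, String field, String lang) {
        String prefix = field + ":";
        return query.startsWith(prefix)
                ? localizedField(field, lang) + ":" + query.substring(prefix.length())
                : query;
    }

    public static void main(String[] args) {
        // Simple case: the field prefix is rewritten as intended.
        System.out.println(rewriteNaive("label:cat", "label", "en"));
        // Edge case: the first match sits inside a quoted phrase, so the
        // phrase is altered and the real field prefix is left untouched.
        System.out.println(rewriteNaive("\"see label:x\" AND label:cat", "label", "en"));
        // The anchored variant leaves this query unchanged instead.
        System.out.println(rewriteAnchored("\"see label:x\" AND label:cat", "label", "en"));
    }
}
```

Neither variant is robust, because both operate on the query string after it has been assembled. That is why moving the field selection to where the query string is generated (or passing the language into TextIndexLucene separately) looks like the cleaner option.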
> Upgrade text search to latest Lucene
> ------------------------------------
>
> Key: JENA-1250
> URL: https://issues.apache.org/jira/browse/JENA-1250
> Project: Apache Jena
> Issue Type: Improvement
> Components: Jena
> Reporter: Jean-Marc Vanel
>
> We are currently at Lucene 4.9.1,
> which is quite outdated compared to the latest Lucene, which is 6.2.1.
> Note that there is a project to add a simple completion feature in addition to
> the existing simple search.
> But it would be better to do that on an updated Lucene dependency.