[ 
https://issues.apache.org/jira/browse/JCR-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976424#action_12976424
 ] 

fabrizio giustina commented on JCR-2622:
----------------------------------------

I tried to trace back the change that broke index analyzers in Jackrabbit > 
2.0.x, and it turned out to be the optimization from JCR-2505 (rev. 915718).

The patch in JCR-2505 added a "reusableTokenStream" method to 
org.apache.jackrabbit.core.query.lucene.JackrabbitAnalyzer, which seemed to 
speed up tests but in fact breaks index analyzers :/
I'm not sure such an optimization is even needed, since the abstract base 
class org.apache.lucene.analysis.Analyzer already implements 
reusableTokenStream as an alias for the default tokenStream method.
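
To illustrate the concern, here is a minimal, self-contained Java sketch. It 
does not use the real Lucene or Jackrabbit classes: Analyzer, 
AccentFoldingAnalyzer, DelegatingWrapper and BypassingWrapper are hypothetical 
stand-ins that map a String to a String instead of producing a TokenStream. 
It only shows the general hazard, assuming a reusableTokenStream override 
that builds its own chain instead of going through the configured analyzer's 
tokenStream. Whether that is exactly what happens in JackrabbitAnalyzer is 
not confirmed here.

```java
import java.text.Normalizer;
import java.util.Locale;

final class ReusableTokenStreamSketch {

    /** Hypothetical stand-in for an analyzer base class (String in, String out). */
    static abstract class Analyzer {
        abstract String tokenStream(String fieldName, String text);

        // Default: reusableTokenStream is just an alias for tokenStream.
        String reusableTokenStream(String fieldName, String text) {
            return tokenStream(fieldName, text);
        }
    }

    /** Custom analyzer that overrides only tokenStream: lowercases and strips accents. */
    static class AccentFoldingAnalyzer extends Analyzer {
        @Override
        String tokenStream(String fieldName, String text) {
            String lower = text.toLowerCase(Locale.ROOT);
            String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
            return decomposed.replaceAll("\\p{M}+", ""); // drop combining marks
        }
    }

    /** Wrapper that keeps the inherited alias, so the configured analyzer is always used. */
    static class DelegatingWrapper extends Analyzer {
        private final Analyzer configured;

        DelegatingWrapper(Analyzer configured) {
            this.configured = configured;
        }

        @Override
        String tokenStream(String fieldName, String text) {
            return configured.tokenStream(fieldName, text);
        }
    }

    /** Wrapper with an assumed "optimized" override that skips the configured analyzer. */
    static class BypassingWrapper extends DelegatingWrapper {
        BypassingWrapper(Analyzer configured) {
            super(configured);
        }

        @Override
        String reusableTokenStream(String fieldName, String text) {
            // Assumption for illustration: a built-in chain that only lowercases.
            return text.toLowerCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        Analyzer custom = new AccentFoldingAnalyzer();
        String input = "T\u00e8st";
        System.out.println(new DelegatingWrapper(custom).reusableTokenStream("f", input));
        System.out.println(new BypassingWrapper(custom).reusableTokenStream("f", input));
    }
}
```

With the inherited alias the accent filter is applied on both code paths. With 
the bypassing override it is silently skipped on the reusable path, even 
though tokenStream on the same wrapper still calls the configured analyzer.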

The attached patch fixes the bug on trunk by simply removing the 
"optimization". The patch also contains a test case that shows the issue.

Could someone please commit the patch to trunk and to the 2.1/2.2 branches?


> Configured index analizer doesn't really work in 2.1.0?
> -------------------------------------------------------
>
>                 Key: JCR-2622
>                 URL: https://issues.apache.org/jira/browse/JCR-2622
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0
>            Reporter: fabrizio giustina
>            Priority: Critical
>         Attachments: JCR-2622-tests_and_patch.diff
>
>
> I just tried migrating an existing project which was using jackrabbit 2.0.0 
> to 2.1.0.
> We have an index analyzer configured which filters accented chars: 
> {code}
> public class ItalianSnowballAnalyzer extends StandardAnalyzer
> {
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader)
>     {
>         return new ISOLatin1AccentFilter(
>             new LowerCaseFilter(super.tokenStream(fieldName, reader)));
>     }
> }
> {code}
> The project has a good number of unit tests: an XML document is loaded into 
> a memory-only Jackrabbit repository and several queries are checked against 
> expected results.
> After migrating to 2.1.0, none of the tests that rely on the index analyzer 
> work anymore; for example, searching for "test" no longer finds nodes 
> containing "tèst".
> Upgrading to Jackrabbit 2.1.0 is the only change made (no changes to the 
> configuration, code, or other libraries at all). Rolling back to the 2.0.0 
> dependency is enough to make all the tests work again.
> I've checked the changes in 2.1 but couldn't find any apparently related 
> change. Also note that I was already using the patch from JCR-2504 before 
> (configuration loading works fine in the unpatched 2.1). Another point is 
> that the configured IndexAnalyzer still actually gets called during our tests 
> (checked in debug mode).
> Any ideas?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
