[ https://issues.apache.org/jira/browse/JCR-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
fabrizio giustina updated JCR-2622:
-----------------------------------
    Summary: Index analizers that extends StandardAnalyzer need to implement reusableTokenStream() since jackrabbit 2.1  (was: Configured index analizers not working in jackrabbit 2.1 and 2.2)

Looks like I spoke too soon: after a deeper analysis I found that the problem can be fixed in the analyzer class and doesn't require a fix in Jackrabbit itself.

The change in JCR-2505 actually broke index analyzers that don't implement the reusableTokenStream() method properly: any analyzer that extends org.apache.lucene.analysis.standard.StandardAnalyzer worked properly in Jackrabbit 2.0, which used the tokenStream() method only. Since Jackrabbit 2.1, such analyzers cannot rely on the superclass implementation of reusableTokenStream() and have to implement that method themselves.

The correct solution is probably not to extend StandardAnalyzer anymore (the reusableTokenStream() method is not overridable due to its use of private fields) but to extend a plain org.apache.lucene.analysis.Analyzer and reimplement the tokenStream() method from scratch.

So the problem looks like a bug in all the analyzers I was using, but in a part that had never been used by Jackrabbit before the change in version 2.1... The issue can be closed.

> Index analizers that extends StandardAnalyzer need to implement reusableTokenStream() since jackrabbit 2.1
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: JCR-2622
>                 URL: https://issues.apache.org/jira/browse/JCR-2622
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0
>            Reporter: fabrizio giustina
>            Priority: Critical
>         Attachments: JCR-2622-tests_and_patch.diff
>
>
> I just tried migrating an existing project which was using Jackrabbit 2.0.0 to 2.1.0.
> We have an index analyzer configured which filters accented chars:
> {code}
> public class ItalianSnowballAnalyzer extends StandardAnalyzer
> {
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader)
>     {
>         return new ISOLatin1AccentFilter(new LowerCaseFilter(super.tokenStream(fieldName, reader)));
>     }
> }
> {code}
> The project has a good number of unit tests: an XML file is loaded into a memory-only Jackrabbit repository and several queries are checked against expected results.
> After migrating to 2.1.0, none of the tests that relied on the index analyzer work anymore; for example, searching for "test" no longer finds nodes containing "tèst".
> Upgrading to Jackrabbit 2.1.0 is the only change made (no changes in the configuration, code, or other libraries at all). Rolling back to the 2.0.0 dependency is enough to make all the tests work again.
> I've checked the changes in 2.1 but I couldn't find any apparently related change. Also note that I was already using the patch in JCR-2504 before (configuration loading works fine in the unpatched 2.1). Another point is that the configured IndexAnalyzer still actually gets called during our tests (checked in debug mode).
> Any idea?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
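
For anyone hitting the same problem, the dispatch issue described in the update can be modelled without Lucene at all. The sketch below is only an illustration: every class in it is a hypothetical stand-in (plain strings instead of token streams), not the real Lucene or Jackrabbit API. It assumes, as the comment implies, that the plain Analyzer's reusableTokenStream() falls back to tokenStream(), while the StandardAnalyzer-like class builds its own chain and never calls the overridden tokenStream().

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical model of the problem; these are NOT the Lucene classes.
public class AnalyzerOverrideSketch {

    /** Stand-in for a plain Analyzer: the reusable path delegates to tokenStream(). */
    public static class PlainAnalyzer {
        public String tokenStream(String text) {
            return text;
        }

        public String reusableTokenStream(String text) {
            return tokenStream(text); // assumed default behaviour
        }
    }

    /** Stand-in for StandardAnalyzer: the reusable path has its own implementation. */
    public static class StandardLikeAnalyzer extends PlainAnalyzer {
        @Override
        public String tokenStream(String text) {
            return text.trim();
        }

        @Override
        public String reusableTokenStream(String text) {
            return text.trim(); // never calls tokenStream(), so subclass overrides are skipped
        }
    }

    /** Lower-cases and strips accents, standing in for the two filters in the report. */
    static String fold(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return Normalizer.normalize(lower, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    /** The pattern from the report: only tokenStream() is overridden. */
    public static class BrokenAnalyzer extends StandardLikeAnalyzer {
        @Override
        public String tokenStream(String text) {
            return fold(super.tokenStream(text));
        }
    }

    /** The suggested fix: extend the plain analyzer and build the whole chain here. */
    public static class FixedAnalyzer extends PlainAnalyzer {
        @Override
        public String tokenStream(String text) {
            return fold(text.trim());
        }
    }

    public static void main(String[] args) {
        String input = " T\u00e8st ";
        PlainAnalyzer broken = new BrokenAnalyzer();
        PlainAnalyzer fixed = new FixedAnalyzer();

        // Caller using tokenStream() only (the 2.0 situation): folding is applied.
        System.out.println("test".equals(broken.tokenStream(input)));
        // Caller using reusableTokenStream() (the 2.1 situation): the override is bypassed.
        System.out.println("test".equals(broken.reusableTokenStream(input)));
        // Same caller with the fixed analyzer: folding is applied again.
        System.out.println("test".equals(fixed.reusableTokenStream(input)));
    }
}
```

With a real analyzer the same idea would mean building the tokenizer and filter chain directly inside tokenStream() instead of wrapping super.tokenStream(); the exact Lucene classes to use depend on the Lucene version bundled with Jackrabbit and are not shown here.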