[
https://issues.apache.org/jira/browse/LUCENE-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698504#comment-14698504
]
Robert Muir commented on LUCENE-6742:
-------------------------------------
We should also send this fix as a PR to snowball, I think its important since
it fixes problems with correctness. If you don't have time, let me know and I
can try to do it.
> Fix SnowballFilter for Lovins & Finnish (and others that use reflection)
> ------------------------------------------------------------------------
>
> Key: LUCENE-6742
> URL: https://issues.apache.org/jira/browse/LUCENE-6742
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6742.patch, LUCENE-6742.patch
>
>
> While reviewing the warnings (LUCENE-6740) I noticed the following:
> The LovinsStemmer and the FinnishStemmer from the snowball package cannot
> work at all. Indeed Robert Muir added a comment to the test referring to:
> http://article.gmane.org/gmane.comp.search.snowball/1139
> The bug is the following: The Among class looks up the method to call via
> reflection. But the stemmer compiler creates all these methods as private. As
> AccessibleObject#setAccessible() is called nowhere (we also don't want
> this), the SnowballProgram main class cannot call the method. Unfortunately
> it completely supresses all reflection errors and returns false.
> SnowballProgram then thinks, the called {{private boolean r_A()}} method
> returned false and so the whole snowball algorithm breaks.
> Also the snowball stemmer classes had a bug: methodObject is a static
> instance of the Stemmer, and Among calls the private method then on wrong
> object. So when fixing the above, this would fail again (because the stemmer
> changes a static singleton object, not itsself!). We also need to change the
> object in SnowballProgram to use "this" when calling the method.
> There are several possibilities to solve the private methods problem:
> - call {{AccessibleObject.setAccessible(true)}} -> we don't want this! Never
> Ever! :-)
> - patch the whole Snowball-generated classes to make those methods "public".
> This is a huge patch and shows many internals => no way
> - Since Java 7 we can use a cool trick, my favourite: MethodHandles
> The MethodHandles trick can be done the following way:
> - Change the Among class to get a {{MethodHandles.Lookup}} in its ctor
> (instead of the broken static {{SnowballProgram}} instance {{methodObject}}).
> This lookup is used to lookup the method by name. This Lookup Object must be
> private to our implementing class (the one that defines the private methods).
> If this is the case, we can get a method handle and call the method (from
> anywhere, because we "own" it).
> - replace the (already broken) {{methodObject}} in every stemmer to instead
> be the correct {{MethodHandles.Lookup}} of the implementing class containing
> the private method. This has 2 effects: The class to get the lookup gets all
> access rights to the class calling it (so we can access private methods) and
> also sees all methods.
> The second part is done by a new Ant task: {{ant patch-snowball}}. Whenever
> you add a new Snowball stemmer, copy the java file generated by the snowball
> compiler to the ext directory and call the ant task. It will patch the
> {{methodObject}} declaration (and also add a {{@SuppressWarnings("unused")}}
> for convenience).
> I reenabled the dictionary tests and the whole stemmer works. There are also
> no issues with security managers, because we do nothing security sensitive
> (all is our own stuff, no setAccessible).
> And finally: LovinsStemmer and FinnishStemmer got insanely fast (and correct,
> of course).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]