[ 
https://issues.apache.org/jira/browse/OPENNLP-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790462#comment-17790462
 ] 

Jon Marius Venstad commented on OPENNLP-1520:
---------------------------------------------

Hmm, obtaining that sample wouldn't be easy, since the IllegalAccessException
is caught inside the OpenNLP library and only logged, as shown in the
description. Without modifying the library, the only way to find out when this
happens, and with what input, is to turn on external logging of all input and
correlate it with this log message, which I think is unacceptable. I'd be
happy to check logs for occurrences of this error before and after an
attempted fix, though.
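
For reference, the access failure named in the description can be reproduced
in isolation. The following is a minimal sketch, not the OpenNLP or Snowball
code: ParentProgram, ChildStemmer, r_check, callViaReflection, callViaHandle
and checkHandle are hypothetical names chosen for illustration. It assumes
the parent class does the reflective call and swallows the exception, as the
library appears to do. The MethodHandle variant is one possible way to avoid
the failure and to call the routine on a specific instance; it is not claimed
to be what the linked Snowball commit does.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical stand-in for SnowballProgram: the parent class makes the call.
class ParentProgram {

    // Looks up a no-arg method on the target's class and invokes it via
    // java.lang.reflect. Returns "ok" or the simple name of the exception,
    // mirroring a library that catches the failure instead of propagating it.
    String callViaReflection(Object target, String name) {
        try {
            Method m = target.getClass().getDeclaredMethod(name);
            m.invoke(target);
            return "ok";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Invokes a handle that the subclass created with its own Lookup, on the
    // given instance rather than on a shared static object.
    boolean callViaHandle(MethodHandle handle, Object target) throws Throwable {
        return (boolean) handle.invoke(target);
    }
}

// Hypothetical stand-in for a generated stemmer with a private routine.
class ChildStemmer extends ParentProgram {
    int calls = 0;

    private boolean r_check() {
        calls++;
        return true;
    }

    // MethodHandles.lookup() is called inside ChildStemmer, so the resulting
    // handle is allowed to reach the private method.
    static MethodHandle checkHandle() throws ReflectiveOperationException {
        return MethodHandles.lookup().findVirtual(
                ChildStemmer.class, "r_check", MethodType.methodType(boolean.class));
    }
}

class ReflectiveAccessSketch {
    public static void main(String[] args) throws Throwable {
        ChildStemmer stemmer = new ChildStemmer();
        // Reflective call from the parent class to the private method.
        System.out.println(stemmer.callViaReflection(stemmer, "r_check"));
        // Same routine through a handle created by the subclass.
        System.out.println(stemmer.callViaHandle(ChildStemmer.checkHandle(), stemmer));
        System.out.println(stemmer.calls);
    }
}
```

The classes are deliberately top-level rather than nested: on recent JDKs,
classes nested in the same outer class may access each other's private
members reflectively, which would hide the problem. The first call should
report IllegalAccessException without running r_check, so the counter only
reflects the call made through the handle.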

> Generated Java code for stemmers is broken, and should be re-generated
> ----------------------------------------------------------------------
>
>                 Key: OPENNLP-1520
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1520
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Stemmer
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Jon Marius Venstad
>            Assignee: Martin Wiesner
>            Priority: Major
>             Fix For: 2.3.2
>
>
> The recursive stemming, which seems hard to actually trigger but is the
> intended use of {{methodObject}} and {{method}} in the {{Among}} class
> (called reflectively), is completely broken. First, it tries to invoke a
> private method from outside the class (from a parent class,
> {{SnowballProgram}}), which fails with an illegal access exception. Second,
> even if that worked, it would invoke _all_ such method calls on the _same,
> shared, static object_, not on the relevant stemmer instance.
> This was fixed 8 years ago, but it looks like the generated code in
> opennlp-tools is 10 years old. I would urge you to re-generate that code.
>  
> Commit that fixed the Java code generation: 
> [https://github.com/snowballstem/snowball/commit/0f9d3d64ab965447a7f638b8ededc924f3efca75]
>  
> Relevant sample stemmer with broken Java:
> [https://github.com/apache/opennlp/blob/main/opennlp-tools/src/main/java/opennlp/tools/stemmer/snowball/finnishStemmer.java]
>  
> Stack trace showing illegal reflection access:
>  
> {noformat}
> 2023-10-26 23:21:44.200 class opennlp.tools.stemmer.snowball.SnowballProgram cannot access a member of class opennlp.tools.stemmer.snowball.finnishStemmer with modifiers "private"
> exception=java.lang.IllegalAccessException: class opennlp.tools.stemmer.snowball.SnowballProgram cannot access a member of class opennlp.tools.stemmer.snowball.finnishStemmer with modifiers "private"
>   at java.base/jdk.internal.reflect.Reflection.newIllegalAccessException(Reflection.java:392)
>   at java.base/java.lang.reflect.AccessibleObject.checkAccess(AccessibleObject.java:674)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:560)
>   at opennlp.tools.stemmer.snowball.SnowballProgram.find_among_b(SnowballProgram.java:353)
>   at opennlp.tools.stemmer.snowball.finnishStemmer.r_case_ending(finnishStemmer.java:480)
>   at opennlp.tools.stemmer.snowball.finnishStemmer.stem(finnishStemmer.java:1003)
>   at opennlp.tools.stemmer.snowball.SnowballStemmer.stem(SnowballStemmer.java:131)
>   at com.yahoo.language.opennlp.OpenNlpTokenizer.processToken(OpenNlpTokenizer.java:64)
>   at com.yahoo.language.opennlp.OpenNlpTokenizer.lambda$tokenize$0(OpenNlpTokenizer.java:54)
>   at com.yahoo.language.simple.SimpleTokenizer.tokenize(SimpleTokenizer.java:74)
>   at com.yahoo.language.opennlp.OpenNlpTokenizer.tokenize(OpenNlpTokenizer.java:54)
>   at com.yahoo.vespa.indexinglanguage.linguistics.LinguisticsAnnotator.annotate(LinguisticsAnnotator.java:76)
> ...
> {noformat}
>  
>  
> Best, Jon Marius Venstad, developer at [vespa.ai|http://vespa.ai/]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
