[ https://issues.apache.org/jira/browse/OPENNLP-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693659#comment-17693659 ]
Martin Wiesner edited comment on OPENNLP-1229 at 2/26/23 3:40 PM:
------------------------------------------------------------------

Actually, "*thi*" is the expected outcome for "this" for every {{PorterStemmer}} implementation, so this is not a real "bug". For context, see FAQ No. 3 at [https://tartarus.org/martin/PorterStemmer/].

You can cross-check this with [NLTK|http://text-processing.com/demo/stem/], which yields the same stem "thi" for "this".

Hint: {{SnowballStemmer}} stems "this" -> "this".

> stem function giving wrong output
> ---------------------------------
>
> Key: OPENNLP-1229
> URL: https://issues.apache.org/jira/browse/OPENNLP-1229
> Project: OpenNLP
> Issue Type: Bug
> Components: Stemmer
> Environment: Ubuntu-18.04, JDK-8
> Reporter: Divya Rani
> Assignee: Martin Wiesner
> Priority: Minor
>
> As OpenNLP uses PorterStemmer for stemming, PorterStemmer seems to be
> stemming "this" -> "thi".
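
For illustration, the difference for "this" can be traced to the first plural-removal step (step 1a) of each algorithm. The following is only a minimal sketch of that single step, not the OpenNLP or NLTK code. It assumes that {{SnowballStemmer}} refers to the Snowball English (Porter2) algorithm, it ignores Porter2's special handling of "y", and the function names are hypothetical.

```python
VOWELS = set("aeiouy")


def porter_step_1a(word: str) -> str:
    """Step 1a of the original Porter algorithm (sketch)."""
    if word.endswith("sses"):
        return word[:-2]  # "sses" -> "ss"
    if word.endswith("ies"):
        return word[:-2]  # "ies" -> "i"
    if word.endswith("ss"):
        return word       # "ss" is kept
    if word.endswith("s"):
        return word[:-1]  # a plain trailing "s" is always dropped
    return word


def porter2_step_1a(word: str) -> str:
    """Step 1a of the Snowball English (Porter2) algorithm (simplified sketch)."""
    if word.endswith("sses"):
        return word[:-2]  # "sses" -> "ss"
    if word.endswith(("ied", "ies")):
        # "i" if preceded by more than one letter, otherwise "ie"
        return word[:-2] if len(word) > 4 else word[:-1]
    if word.endswith(("us", "ss")):
        return word       # left unchanged
    if word.endswith("s"):
        # Drop the "s" only if a vowel occurs somewhere before the
        # letter that immediately precedes it.
        if any(ch in VOWELS for ch in word[:-2]):
            return word[:-1]
    return word


print(porter_step_1a("this"))   # original Porter rule
print(porter2_step_1a("this"))  # Porter2 rule
```

Under these assumptions, the original Porter rule removes the trailing "s" unconditionally, whereas the Porter2 rule keeps it because the only vowel in "this" sits directly before the "s".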