[ 
https://issues.apache.org/jira/browse/OPENNLP-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693659#comment-17693659
 ] 

Martin Wiesner edited comment on OPENNLP-1229 at 2/26/23 3:40 PM:
------------------------------------------------------------------

Actually, "{*}thi{*}" is the expected output for "this" from every 
{{PorterStemmer}} implementation, so this is not a real bug. For context, see 
FAQ No. 3 at [https://tartarus.org/martin/PorterStemmer/].

You can cross-check this with [NLTK|http://text-processing.com/demo/stem/]. It 
yields the same stem "thi" for "this".

Hint: {{SnowballStemmer}} stems "this" -> "this".
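
To illustrate where the difference comes from, here is a minimal, self-contained sketch of the plural-stripping step (step 1a) as I understand it from the published algorithm descriptions. It is not the OpenNLP source: the class and method names are made up for this comment, the Snowball variant covers only the rule for a plain trailing "s", and it ignores Snowball's special handling of "y". The original Porter rule strips a final "s" unconditionally, while the Snowball English rule strips it only if the word contains a vowel somewhere before the letter that directly precedes the "s".

```java
// Hypothetical illustration only; names are not part of OpenNLP.
final class Step1aSketch {

    // Original Porter algorithm, step 1a: purely suffix-based, longest match first.
    static String porterStep1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2); // sses -> ss
        if (w.endsWith("ies"))  return w.substring(0, w.length() - 2); // ies  -> i
        if (w.endsWith("ss"))   return w;                              // ss   -> ss
        if (w.endsWith("s"))    return w.substring(0, w.length() - 1); // s    -> (removed)
        return w;
    }

    // Snowball English, step 1a, restricted to the plain trailing "s" rule.
    // Simplification: "y" always counts as a vowel here.
    static String snowballEnglishTrailingS(String w) {
        // Suffixes covered by other step 1a rules are left untouched in this sketch.
        if (!w.endsWith("s") || w.endsWith("ss") || w.endsWith("us")
                || w.endsWith("sses") || w.endsWith("ies")) {
            return w;
        }
        // Delete the "s" only if a vowel occurs before the letter preceding it.
        for (int i = 0; i < w.length() - 2; i++) {
            if ("aeiouy".indexOf(w.charAt(i)) >= 0) {
                return w.substring(0, w.length() - 1);
            }
        }
        return w;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"this", "gas", "gaps", "kiwis"}) {
            System.out.println(word + ": porter=" + porterStep1a(word)
                    + ", snowball=" + snowballEnglishTrailingS(word));
        }
    }
}
```

With this sketch, "this" and "gas" lose the "s" under the Porter rule ("thi", "ga") but keep it under the Snowball rule, while "gaps" and "kiwis" become "gap" and "kiwi" under both. None of the later Porter steps apply to "thi", so it is also the final stem.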


> stem function giving wrong output
> ---------------------------------
>
>                 Key: OPENNLP-1229
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1229
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Stemmer
>         Environment: Ubuntu-18.04, JDK-8
>            Reporter: Divya Rani
>            Assignee: Martin Wiesner
>            Priority: Minor
>
> OpenNLP uses PorterStemmer for stemming, and PorterStemmer seems to be 
> stemming "this" -> "thi".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
