Sachin Kumar Saini created OPENNLP-1235:
-------------------------------------------
Summary: lemmatizer decodeLemmas method bug
Key: OPENNLP-1235
URL: https://issues.apache.org/jira/browse/OPENNLP-1235
Project: OpenNLP
Issue Type: Bug
Components: Lemmatizer
Affects Versions: 1.8.4, 1.9.0, 1.9.1
Reporter: Sachin Kumar Saini
Attachments: bugshot.png
I got the following permutation for the text "Obstruction":
R10ojR9baR8smD7tD6rD5uD4cD3tD2iD1oD0n
I then checked the method decodeShortestEditScript in class
opennlp.tools.util.StringUtil, which is used to decode the word with
permutations.
There, the permIndex variable is initialized to 0 and increased by one to get
the next permutation letter.
This goes wrong when the index in the permutation has more than one digit. For
example, in the given permutation R10 means R and 10, but because permIndex
only increases by one, it is read as R and 1.
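
To illustrate the expected behaviour, here is a minimal sketch of reading the
index as a full run of digits. It is not the OpenNLP code: the class
EditScriptSketch and the method tokenize are hypothetical names, and the token
layout (an operation letter, a decimal index, then two characters for R and one
character otherwise) is only an assumption inferred from the permutation above.

```java
import java.util.ArrayList;
import java.util.List;

class EditScriptSketch {

    // Hypothetical helper, not part of OpenNLP: splits an edit script into
    // tokens rendered as "<op>@<index>:<chars>".
    static List<String> tokenize(String script) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < script.length()) {
            char op = script.charAt(i++);

            // Consume every consecutive digit, so "10" is read as 10, not 1.
            int start = i;
            while (i < script.length() && Character.isDigit(script.charAt(i))) {
                i++;
            }
            if (start == i) {
                throw new IllegalArgumentException("missing index after " + op);
            }
            int index = Integer.parseInt(script.substring(start, i));

            // Assumption from the example: R carries two characters
            // (old and new), any other operation carries one.
            int payloadLength = (op == 'R') ? 2 : 1;
            if (i + payloadLength > script.length()) {
                throw new IllegalArgumentException("truncated script at " + i);
            }
            String payload = script.substring(i, i + payloadLength);
            i += payloadLength;

            tokens.add(op + "@" + index + ":" + payload);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The first token printed is R@10:oj, i.e. index 10 rather than 1.
        System.out.println(tokenize("R10ojR9baR8smD7tD6rD5uD4cD3tD2iD1oD0n"));
    }
}
```

Note that this sketch cannot tell an index digit from a digit that is itself
part of the word, so a real fix would have to deal with that case as well.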