Nishant Shrivastava created OPENNLP-1736:
--------------------------------------------
Summary: NGramLanguageModel - Allow choice of smoothing/discounting algorithm
Key: OPENNLP-1736
URL: https://issues.apache.org/jira/browse/OPENNLP-1736
Project: OpenNLP
Issue Type: Wish
Components: language model
Affects Versions: 2.5.4
Reporter: Nishant Shrivastava
Currently, NGramLanguageModel uses stupid backoff to handle “zero
probability n-grams” (see https://issues.apache.org/jira/browse/OPENNLP-986).
It would be useful to refactor it so that the smoothing/discounting logic can
be passed in from outside.
This would allow us to add implementations of other smoothing/discounting
techniques (e.g. those linked below) in the future.
https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation
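
To make the request concrete, here is a rough sketch of what a pluggable
smoothing strategy could look like. Everything in it is an assumption for
illustration only: SmoothingStrategy, NGramCounts, StupidBackoff,
AddOneSmoothing and PluggableNGramModel are hypothetical names, not existing
OpenNLP classes, and the sketch does not touch the real NGramLanguageModel.
Add-one (Laplace) smoothing stands in as the second strategy only because it
is short; Kneser-Ney or Good-Turing would be further implementations of the
same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical extension point; not part of the current OpenNLP API. */
interface SmoothingStrategy {
    /** Scores the last token of the n-gram given the tokens before it. */
    double score(List<String> ngram, NGramCounts counts);
}

/** Minimal count store used only by this sketch. */
class NGramCounts {
    private final Map<List<String>, Integer> counts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private long totalTokens = 0;

    /** Counts every n-gram of order 1..maxOrder in the token sequence. */
    void add(List<String> tokens, int maxOrder) {
        for (int n = 1; n <= maxOrder; n++) {
            for (int i = 0; i + n <= tokens.size(); i++) {
                counts.merge(new ArrayList<>(tokens.subList(i, i + n)), 1, Integer::sum);
            }
        }
        vocabulary.addAll(tokens);
        totalTokens += tokens.size();
    }

    int count(List<String> ngram) { return counts.getOrDefault(ngram, 0); }
    long totalTokens() { return totalTokens; }
    int vocabularySize() { return vocabulary.size(); }
}

/** Stupid backoff: relative frequency if seen, else alpha times the shorter n-gram's score. */
class StupidBackoff implements SmoothingStrategy {
    private final double alpha;

    StupidBackoff(double alpha) { this.alpha = alpha; }

    @Override
    public double score(List<String> ngram, NGramCounts counts) {
        if (ngram.isEmpty() || counts.totalTokens() == 0) return 0.0;
        if (ngram.size() == 1) {
            return (double) counts.count(ngram) / counts.totalTokens();
        }
        int seen = counts.count(ngram);
        if (seen > 0) {
            // A seen n-gram implies its context (the leading n-1 tokens) was also counted.
            return (double) seen / counts.count(ngram.subList(0, ngram.size() - 1));
        }
        // Unseen: drop the oldest token and discount by alpha.
        return alpha * score(ngram.subList(1, ngram.size()), counts);
    }
}

/** Add-one (Laplace) smoothing, used here only as a simple second strategy. */
class AddOneSmoothing implements SmoothingStrategy {
    @Override
    public double score(List<String> ngram, NGramCounts counts) {
        if (ngram.isEmpty() || counts.vocabularySize() == 0) return 0.0;
        long contextCount = ngram.size() == 1
                ? counts.totalTokens()
                : counts.count(ngram.subList(0, ngram.size() - 1));
        return (counts.count(ngram) + 1.0) / (contextCount + counts.vocabularySize());
    }
}

/** Hypothetical model that receives its smoothing logic from outside. */
class PluggableNGramModel {
    private final NGramCounts counts = new NGramCounts();
    private final int order;
    private final SmoothingStrategy smoothing;

    PluggableNGramModel(int order, SmoothingStrategy smoothing) {
        this.order = order;
        this.smoothing = smoothing;
    }

    void add(String... tokens) { counts.add(Arrays.asList(tokens), order); }

    double score(String... ngram) { return smoothing.score(Arrays.asList(ngram), counts); }
}
```

With this shape, a caller would pick the algorithm at construction time, for
example new PluggableNGramModel(2, new StupidBackoff(0.4)) to keep a
backoff-style default, or new PluggableNGramModel(2, new AddOneSmoothing())
to swap it out. The backoff factor 0.4 is just a commonly quoted value and is
also an assumption here.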
--
This message was sent by Atlassian Jira
(v8.20.10#820010)