[
https://issues.apache.org/jira/browse/OPENNLP-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nishant Shrivastava updated OPENNLP-1736:
-----------------------------------------
Priority: Minor (was: Major)
> NGramLanguageModel - Allow choice of smoothing/discounting algorithm
> --------------------------------------------------------------------
>
> Key: OPENNLP-1736
> URL: https://issues.apache.org/jira/browse/OPENNLP-1736
> Project: OpenNLP
> Issue Type: Improvement
> Components: language model
> Affects Versions: 2.5.4
> Reporter: Nishant Shrivastava
> Priority: Minor
>
> Currently, NGramLanguageModel uses stupid backoff to deal with “zero
> probability n-grams” (see https://issues.apache.org/jira/browse/OPENNLP-986).
> It would be useful to refactor it so that the smoothing/discounting logic
> can be supplied from outside.
> This would allow implementations of other smoothing/discounting techniques
> (e.g. those linked below; a sketch follows the links) to be added in the
> future.
> https://en.wikipedia.org/wiki/Kneser%E2%80%93Ney_smoothing
> https://en.wikipedia.org/wiki/Good%E2%80%93Turing_frequency_estimation
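>
> A minimal sketch of what a pluggable strategy could look like. The names
> and signatures below are hypothetical (they are not part of the current
> OpenNLP API), and the model is a simplified bigram counter used only to
> show how the probability calculation could delegate to an injected
> smoothing implementation:
> {code:java}
> import java.util.HashMap;
> import java.util.HashSet;
> import java.util.Map;
> import java.util.Set;
>
> // Hypothetical strategy interface for pluggable smoothing/discounting.
> interface SmoothingStrategy {
>     double probability(long ngramCount, long historyCount, long vocabularySize);
> }
>
> // Add-one (Laplace) smoothing as one simple implementation; Kneser-Ney or
> // Good-Turing estimators would implement the same interface.
> class LaplaceSmoothing implements SmoothingStrategy {
>     @Override
>     public double probability(long ngramCount, long historyCount, long vocabularySize) {
>         return (ngramCount + 1.0) / (historyCount + vocabularySize);
>     }
> }
>
> // Simplified bigram model that delegates probability estimation to the
> // injected strategy instead of hard-coding one backoff scheme.
> class PluggableBigramModel {
>     private final Map<String, Long> bigramCounts = new HashMap<>();
>     private final Map<String, Long> historyCounts = new HashMap<>();
>     private final Set<String> vocabulary = new HashSet<>();
>     private final SmoothingStrategy smoothing;
>
>     PluggableBigramModel(SmoothingStrategy smoothing) {
>         this.smoothing = smoothing;
>     }
>
>     void add(String w1, String w2) {
>         bigramCounts.merge(w1 + " " + w2, 1L, Long::sum);
>         historyCounts.merge(w1, 1L, Long::sum);
>         vocabulary.add(w1);
>         vocabulary.add(w2);
>     }
>
>     double probability(String w1, String w2) {
>         long bigram = bigramCounts.getOrDefault(w1 + " " + w2, 0L);
>         long history = historyCounts.getOrDefault(w1, 0L);
>         return smoothing.probability(bigram, history, vocabulary.size());
>     }
> }
>
> public class SmoothingStrategyDemo {
>     public static void main(String[] args) {
>         PluggableBigramModel model = new PluggableBigramModel(new LaplaceSmoothing());
>         model.add("the", "cat");
>         model.add("the", "dog");
>         // Unseen bigram still gets a non-zero probability: (0 + 1) / (2 + 3) = 0.2
>         System.out.println(model.probability("the", "mouse"));
>     }
> }
> {code}
> An NGramLanguageModel constructor overload accepting such a strategy
> (defaulting to the existing stupid backoff) would keep backward
> compatibility while making the estimation logic replaceable.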
--
This message was sent by Atlassian Jira
(v8.20.10#820010)