[ https://issues.apache.org/jira/browse/OPENNLP-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nishant Shrivastava updated OPENNLP-1745:
-----------------------------------------
Issue Type: Improvement (was: Test)
> SentenceDetector - Add documentation and tests for useTokenEnd = false
> ----------------------------------------------------------------------
>
> Key: OPENNLP-1745
> URL: https://issues.apache.org/jira/browse/OPENNLP-1745
> Project: OpenNLP
> Issue Type: Improvement
> Components: Sentence Detector
> Reporter: Nishant Shrivastava
> Priority: Trivial
>
> SentenceDetector can be configured with useTokenEnd = false.
> With this setting, the detector correctly identifies the first character of
> the next sentence even when no whitespace separates it from the end of the
> previous sentence.
> Example (German text):
> *Input:*
> "Träume sind eine Verbindung von Gedanken.Verschiedene Gedanken sind während
> der Traumformation aktiv."
> *Actual output from SentenceDetectorTool (with useTokenEnd = false):*
> Sentence-1 : Träume sind eine Verbindung von Gedanken.
> Sentence-2 : Verschiedene Gedanken sind während der Traumformation aktiv.
>
> It would be useful to add tests and documentation for the correct usage of
> the 'useTokenEnd' property.
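
To make the difference described above concrete, here is a minimal, self-contained Java sketch. It is not OpenNLP code and does not use the OpenNLP API: the class `UseTokenEndSketch` and the helper `split` are hypothetical, and the sketch assumes that every '.' ends a sentence, whereas the real detector decides this with a trained model. It only illustrates where the boundary lands under the two settings, assuming that useTokenEnd = true moves the boundary to the end of the whitespace-delimited token that contains the period.

```java
import java.util.ArrayList;
import java.util.List;

public class UseTokenEndSketch {

    // Hypothetical helper, not part of OpenNLP.
    // Assumption: every '.' is treated as a sentence end.
    // useTokenEnd = true  -> the boundary moves to the end of the
    //                        whitespace-delimited token containing the '.'
    // useTokenEnd = false -> the boundary sits directly after the '.'
    static List<String> split(String text, boolean useTokenEnd) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i < text.length()) {
            if (text.charAt(i) == '.') {
                int end = i + 1;
                if (useTokenEnd) {
                    // Skip ahead to the next whitespace (or the end of the text).
                    while (end < text.length()
                            && !Character.isWhitespace(text.charAt(end))) {
                        end++;
                    }
                }
                String sentence = text.substring(start, end).trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
                start = end;
                i = end;
            } else {
                i++;
            }
        }
        // Keep any trailing text that has no closing '.'.
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Träume sind eine Verbindung von Gedanken.Verschiedene "
                + "Gedanken sind während der Traumformation aktiv.";
        System.out.println(split(text, false));
        System.out.println(split(text, true));
    }
}
```

In this sketch, calling `split` with useTokenEnd = false on the German input yields the two sentences listed in the issue, while useTokenEnd = true attaches "Verschiedene" to the first sentence because the boundary is pushed to the next whitespace.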
--
This message was sent by Atlassian Jira
(v8.20.10#820010)