Nishant Shrivastava created OPENNLP-1745:
--------------------------------------------
Summary: SentenceDetector - Add documentation and tests for useTokenEnd = false
Key: OPENNLP-1745
URL: https://issues.apache.org/jira/browse/OPENNLP-1745
Project: OpenNLP
Issue Type: Test
Components: Sentence Detector
Reporter: Nishant Shrivastava
SentenceDetector can be configured with useTokenEnd = false.
With this setting, the first character of the next sentence is identified
correctly even when the new sentence does not start with a blank space, i.e.
when no whitespace follows the end of the previous sentence.
Example with German text:
*Input:*
"Träume sind eine Verbindung von Gedanken.Verschiedene Gedanken sind während
der Traumformation aktiv."
*Actual Output from SentenceDetectorTool (setting useTokenEnd = false):*
Sentence-1 : Träume sind eine Verbindung von Gedanken.
Sentence-2 : Verschiedene Gedanken sind während der Traumformation aktiv.
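The effect of the flag on the example above can be illustrated with a small standalone sketch. It is not the OpenNLP implementation: the class and method names are hypothetical, no trained model is involved, and every '.' is naively treated as a sentence end. The only thing it models is the assumed difference in where the next sentence starts: directly after the end-of-sentence character when useTokenEnd = false, and after the end of the whitespace-delimited token when useTokenEnd = true.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not OpenNLP code.
class TokenEndSketch {

    // Index of the first whitespace character at or after 'from' (or s.length()).
    static int firstWhitespace(String s, int from) {
        int i = from;
        while (i < s.length() && !Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    // Index of the first non-whitespace character at or after 'from' (or s.length()).
    static int firstNonWhitespace(String s, int from) {
        int i = from;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    // Naive splitter: every '.' counts as a sentence end (assumption, no model).
    static List<String> split(String text, boolean useTokenEnd) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            // Skip non-EOS characters and periods already consumed by a previous sentence.
            if (text.charAt(i) != '.' || i < start) continue;
            int next = useTokenEnd
                    // boundary moves to the end of the current whitespace-delimited token
                    ? firstNonWhitespace(text, firstWhitespace(text, i + 1))
                    // boundary sits directly after the end-of-sentence character
                    : firstNonWhitespace(text, i + 1);
            String sentence = text.substring(start, next).trim();
            if (!sentence.isEmpty()) sentences.add(sentence);
            start = next;
        }
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) sentences.add(rest);
        return sentences;
    }

    public static void main(String[] args) {
        // Umlauts are written as Unicode escapes so the source stays ASCII-only.
        String text = "Tr\u00e4ume sind eine Verbindung von Gedanken.Verschiedene "
                + "Gedanken sind w\u00e4hrend der Traumformation aktiv.";
        System.out.println(split(text, false));
        System.out.println(split(text, true));
    }
}
```

In this sketch, useTokenEnd = false yields the two sentences listed above, while useTokenEnd = true keeps "Gedanken.Verschiedene" together as the end of the first sentence, because the boundary is only placed after the next whitespace.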
It would be useful to add tests and documentation covering the correct usage
of the 'useTokenEnd' property.