Nishant Shrivastava created OPENNLP-1745:
--------------------------------------------

             Summary: SentenceDetector - Add documentation and tests for 
useTokenEnd = false
                 Key: OPENNLP-1745
                 URL: https://issues.apache.org/jira/browse/OPENNLP-1745
             Project: OpenNLP
          Issue Type: Test
          Components: Sentence Detector
            Reporter: Nishant Shrivastava


SentenceDetector can be configured with useTokenEnd = false.


With this setting, the detector correctly identifies the first character of the 
next sentence in cases where the new sentence is not separated from the end of 
the previous sentence by a blank space.

Example (German text):

*Input:*
"Träume sind eine Verbindung von Gedanken.Verschiedene Gedanken sind während 
der Traumformation aktiv."

*Actual output from SentenceDetectorTool (with useTokenEnd = false):*
  Sentence-1 : Träume sind eine Verbindung von Gedanken.
  Sentence-2 : Verschiedene Gedanken sind während der Traumformation aktiv.
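
The difference between the two settings can be illustrated with a small, 
self-contained sketch. This is not OpenNLP's implementation. It is a simplified 
model that assumes useTokenEnd = true moves the sentence boundary to the end of 
the whitespace-delimited token containing the end-of-sentence character, while 
useTokenEnd = false places it directly after that character. The class and 
helper names are hypothetical, and the end-of-sentence position is found with 
indexOf rather than predicted by a trained model.

```java
// Simplified illustration only; names are hypothetical, not OpenNLP API.
final class UseTokenEndSketch {

    // Index of the first whitespace at or after 'from', or s.length() if none.
    static int firstWhitespace(String s, int from) {
        int i = from;
        while (i < s.length() && !Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    // Index of the first non-whitespace at or after 'from', or s.length() if none.
    static int firstNonWhitespace(String s, int from) {
        int i = from;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    // Start index of the next sentence, given the index of the
    // end-of-sentence character (assumed behaviour of the two modes).
    static int nextSentenceStart(String s, int eosIndex, boolean useTokenEnd) {
        int boundary = useTokenEnd
                ? firstWhitespace(s, eosIndex + 1) // skip to the end of the current token
                : eosIndex + 1;                    // split right after the EOS character
        return firstNonWhitespace(s, boundary);
    }

    public static void main(String[] args) {
        String text = "Träume sind eine Verbindung von Gedanken.Verschiedene "
                + "Gedanken sind während der Traumformation aktiv.";
        int eos = text.indexOf('.');

        for (boolean useTokenEnd : new boolean[] {false, true}) {
            int start = nextSentenceStart(text, eos, useTokenEnd);
            System.out.println("useTokenEnd = " + useTokenEnd);
            System.out.println("  Sentence-1 : " + text.substring(0, start).trim());
            System.out.println("  Sentence-2 : " + text.substring(start));
        }
    }
}
```

Under these assumptions, useTokenEnd = false reproduces the split shown above, 
whereas useTokenEnd = true keeps "Gedanken.Verschiedene" together as one token 
and starts the second sentence at "Gedanken sind ...".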

 

It would be useful to add tests and documentation for the correct usage of the 
'useTokenEnd' property.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
