[jira] [Updated] (OPENNLP-1745) SentenceDetector - Add documentation and tests for useTokenEnd = false

Nishant Shrivastava (Jira) Mon, 16 Jun 2025 11:45:21 -0700


     [ 
https://issues.apache.org/jira/browse/OPENNLP-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Nishant Shrivastava updated OPENNLP-1745:
-----------------------------------------
    Issue Type: Improvement  (was: Test)

> SentenceDetector - Add documentation and tests for useTokenEnd = false
> ----------------------------------------------------------------------
>
>                 Key: OPENNLP-1745
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1745
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Sentence Detector
>            Reporter: Nishant Shrivastava
>            Priority: Trivial
>
> SentenceDetector can be configured with useTokenEnd = false.
> With this setting, the detector correctly identifies the first character of 
> the next sentence when no whitespace separates it from the end of the 
> previous sentence.
> Example (German text):
> *Input:*
> "Träume sind eine Verbindung von Gedanken.Verschiedene Gedanken sind während 
> der Traumformation aktiv."
> *Actual output from SentenceDetectorTool (with useTokenEnd = false):*
>   Sentence-1 : Träume sind eine Verbindung von Gedanken.
>   Sentence-2 : Verschiedene Gedanken sind während der Traumformation aktiv.
>  
> It would be useful to add tests and documentation for the correct usage of 
> the 'useTokenEnd' property.
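
The effect described in the issue can be illustrated with a small, self-contained sketch. This is not OpenNLP code: the class TokenEndSketch and its helper methods are hypothetical, every '.' is treated as a sentence break (the real detector asks a trained model), and the handling of useTokenEnd = true (skipping to the end of the whitespace-delimited token before looking for the next sentence start) is an assumption about how the flag behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the useTokenEnd flag; hypothetical, not the OpenNLP implementation.
public class TokenEndSketch {

    // Index of the first whitespace character at or after pos, or s.length().
    static int firstWhitespace(String s, int pos) {
        while (pos < s.length() && !Character.isWhitespace(s.charAt(pos))) pos++;
        return pos;
    }

    // Index of the first non-whitespace character at or after pos, or s.length().
    static int firstNonWhitespace(String s, int pos) {
        while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) pos++;
        return pos;
    }

    // Splits text at every '.', which is a deliberate simplification.
    static List<String> split(String text, boolean useTokenEnd) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (i < start || text.charAt(i) != '.') continue;
            // Assumed behaviour: with useTokenEnd the next sentence starts after
            // the token containing the '.', otherwise directly after the '.'.
            int next = useTokenEnd
                    ? firstNonWhitespace(text, firstWhitespace(text, i + 1))
                    : firstNonWhitespace(text, i + 1);
            if (next >= text.length()) break; // '.' ends the text, nothing follows
            sentences.add(text.substring(start, next).trim());
            start = next;
        }
        if (start < text.length()) sentences.add(text.substring(start).trim());
        return sentences;
    }

    public static void main(String[] args) {
        // Input from the issue; umlauts are written as \u00e4 to keep the source ASCII.
        String input = "Tr\u00e4ume sind eine Verbindung von Gedanken.Verschiedene "
                + "Gedanken sind w\u00e4hrend der Traumformation aktiv.";
        System.out.println("useTokenEnd = false: " + split(input, false));
        System.out.println("useTokenEnd = true:  " + split(input, true));
    }
}
```

In this sketch, useTokenEnd = false reproduces the two sentences listed in the issue, while useTokenEnd = true keeps "Gedanken.Verschiedene" together as one token and starts the second sentence one word too late. A test against the real SentenceDetector would need a trained model and should not rely on this simplification.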



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
