This is an automated email from the ASF dual-hosted git repository.
mawiesne pushed a commit to branch opennlp-2.x
in repository https://gitbox.apache.org/repos/asf/opennlp.git
The following commit(s) were added to refs/heads/opennlp-2.x by this push:
new 5fb0530d OPENNLP-1745: SentenceDetector - Add Junit test for useTokenEnd = false (OpenNLP 2.x) (#809)
5fb0530d is described below
commit 5fb0530d45d9377ecbcf2ca710375061a19404fa
Author: Martin Wiesner <[email protected]>
AuthorDate: Mon Jul 7 18:35:22 2025 +0200
OPENNLP-1745: SentenceDetector - Add Junit test for useTokenEnd = false (OpenNLP 2.x) (#809)
- adapts PR #792 for OpenNLP 2.x
---
opennlp-docs/src/docbkx/sentdetect.xml | 207 +++++++++++----------
.../sentdetect/SentenceDetectorTrainerTool.java | 8 +-
.../tools/cmdline/sentdetect/TrainingParams.java | 5 +
.../sentdetect/SentenceDetectorMEGermanTest.java | 79 ++++++--
4 files changed, 173 insertions(+), 126 deletions(-)
diff --git a/opennlp-docs/src/docbkx/sentdetect.xml b/opennlp-docs/src/docbkx/sentdetect.xml
index 11b047d3..51861a33 100644
--- a/opennlp-docs/src/docbkx/sentdetect.xml
+++ b/opennlp-docs/src/docbkx/sentdetect.xml
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
-"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
-]>
+ "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+ ]>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
@@ -28,99 +28,99 @@ under the License.
<section id="tools.sentdetect.detection">
<title>Sentence Detection</title>
<para>
- The OpenNLP Sentence Detector can detect that a punctuation character
- marks the end of a sentence or not. In this sense a sentence is defined
- as the longest white space trimmed character sequence between two punctuation
- marks. The first and last sentence make an exception to this rule. The first
- non whitespace character is assumed to be the start of a sentence, and the
- last non whitespace character is assumed to be a sentence end.
- The sample text below should be segmented into its sentences.
- <screen>
+ The OpenNLP Sentence Detector can detect that a punctuation character
+ marks the end of a sentence or not. In this sense a sentence is defined
+ as the longest white space trimmed character sequence between two punctuation
+ marks. The first and last sentence make an exception to this rule. The first
+ non whitespace character is assumed to be the start of a sentence, and the
+ last non whitespace character is assumed to be a sentence end.
+ The sample text below should be segmented into its sentences.
+ <screen>
<![CDATA[
Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is
chairman of Elsevier N.V., the Dutch publishing group. Rudolph Agnew, 55 years
old and former chairman of Consolidated Gold Fields PLC, was named a director of this
British industrial conglomerate.]]>
- </screen>
- After detecting the sentence boundaries each sentence is written in its own line.
- <screen>
+ </screen>
+ After detecting the sentence boundaries each sentence is written in its own line.
+ <screen>
<![CDATA[
Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC,
was named a director of this British industrial conglomerate.]]>
- </screen>
- Usually Sentence Detection is done before the text is tokenized and that's the way the pre-trained models on the website are trained,
- but it is also possible to perform tokenization first and let the Sentence Detector process the already tokenized text.
- The OpenNLP Sentence Detector cannot identify sentence boundaries based on the contents of the sentence. A prominent example is the first sentence in an article where the title is mistakenly identified to be the first part of the first sentence.
- Most components in OpenNLP expect input which is segmented into sentences.
+ </screen>
+ Usually Sentence Detection is done before the text is tokenized and that's the way the pre-trained models on the website are trained,
+ but it is also possible to perform tokenization first and let the Sentence Detector process the already tokenized text.
+ The OpenNLP Sentence Detector cannot identify sentence boundaries based on the contents of the sentence. A prominent example is the first sentence in an article where the title is mistakenly identified to be the first part of the first sentence.
+ Most components in OpenNLP expect input which is segmented into sentences.
</para>
-
+
<section id="tools.sentdetect.detection.cmdline">
- <title>Sentence Detection Tool</title>
- <para>
- The easiest way to try out the Sentence Detector is the command line tool. The tool is only intended for demonstration and testing.
- Download the english sentence detector model and start the Sentence Detector Tool with this command:
- <screen>
- <![CDATA[
+ <title>Sentence Detection Tool</title>
+ <para>
+ The easiest way to try out the Sentence Detector is the command line tool. The tool is only intended for demonstration and testing.
+ Download the english sentence detector model and start the Sentence Detector Tool with this command:
+ <screen>
+ <![CDATA[
$ opennlp SentenceDetector opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin]]>
- </screen>
- Just copy the sample text from above to the console. The Sentence Detector will read it and echo one sentence per line to the console.
- Usually the input is read from a file and the output is redirected to another file. This can be achieved with the following command.
- <screen>
- <![CDATA[
+ </screen>
+ Just copy the sample text from above to the console. The Sentence Detector will read it and echo one sentence per line to the console.
+ Usually the input is read from a file and the output is redirected to another file. This can be achieved with the following command.
+ <screen>
+ <![CDATA[
$ opennlp SentenceDetector opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin < input.txt > output.txt]]>
- </screen>
- For the english sentence model from the website the input text should not be tokenized.
- </para>
+ </screen>
+ For the english sentence model from the website the input text should not be tokenized.
+ </para>
</section>
<section id="tools.sentdetect.detection.api">
- <title>Sentence Detection API</title>
- <para>
- The Sentence Detector can be easily integrated into an application via its API.
- To instantiate the Sentence Detector the sentence model must be loaded first.
- <programlisting language="java">
- <![CDATA[
+ <title>Sentence Detection API</title>
+ <para>
+ The Sentence Detector can be easily integrated into an application via its API.
+ To instantiate the Sentence Detector the sentence model must be loaded first.
+ <programlisting language="java">
+ <![CDATA[
try (InputStream modelIn = new FileInputStream("opennlp-en-ud-ewt-sentence-1.2-2.5.0.bin")) {
SentenceModel model = new SentenceModel(modelIn);
}]]>
- </programlisting>
- After the model is loaded the SentenceDetectorME can be instantiated.
- <programlisting language="java">
- <![CDATA[
+ </programlisting>
+ After the model is loaded the SentenceDetectorME can be instantiated.
+ <programlisting language="java">
+ <![CDATA[
SentenceDetectorME sentenceDetector = new SentenceDetectorME(model);]]>
- </programlisting>
- The Sentence Detector can output an array of Strings, where each String is one sentence.
+ </programlisting>
+ The Sentence Detector can output an array of Strings, where each String is one sentence.
<programlisting language="java">
- <![CDATA[
+ <![CDATA[
String[] sentences = sentenceDetector.sentDetect(" First sentence. Second sentence. ");]]>
- </programlisting>
- The result array now contains two entries. The first String is "First sentence." and the
- second String is "Second sentence." The whitespace before, between and after the input String is removed.
- The API also offers a method which simply returns the span of the sentence in the input string.
- <programlisting language="java">
- <![CDATA[
+ </programlisting>
+ The result array now contains two entries. The first String is "First sentence." and the
+ second String is "Second sentence." The whitespace before, between and after the input String is removed.
+ The API also offers a method which simply returns the span of the sentence in the input string.
+ <programlisting language="java">
+ <![CDATA[
Span[] sentences = sentenceDetector.sentPosDetect(" First sentence. Second sentence. ");]]>
- </programlisting>
- The result array again contains two entries. The first span beings at index 2 and ends at
- 17. The second span begins at 18 and ends at 34. The utility method Span.getCoveredText can be used to create a substring which only covers the chars in the span.
- </para>
+ </programlisting>
+ The result array again contains two entries. The first span beings at index 2 and ends at
+ 17. The second span begins at 18 and ends at 34. The utility method Span.getCoveredText can be used to create a substring which only covers the chars in the span.
+ </para>
</section>
</section>
<section id="tools.sentdetect.training">
<title>Sentence Detector Training</title>
<para/>
<section id="tools.sentdetect.training.tool">
- <title>Training Tool</title>
- <para>
- OpenNLP has a command line tool which is used to train the models available from the model
- download page on various corpora. The data must be converted to the OpenNLP Sentence Detector
- training format. Which is one sentence per line. An empty line indicates a document boundary.
- In case the document boundary is unknown, it's recommended to have an empty line every few ten
- sentences. Exactly like the output in the sample above.
- Usage of the tool:
- <screen>
- <![CDATA[
+ <title>Training Tool</title>
+ <para>
+ OpenNLP has a command line tool which is used to train the models available from the model
+ download page on various corpora. The data must be converted to the OpenNLP Sentence Detector
+ training format. Which is one sentence per line. An empty line indicates a document boundary.
+ In case the document boundary is unknown, it's recommended to have an empty line every few ten
+ sentences. Exactly like the output in the sample above.
+ Usage of the tool:
+ <screen>
+ <![CDATA[
$ opennlp SentenceDetectorTrainer
Usage: opennlp SentenceDetectorTrainer[.namefinder|.conllx|.pos] [-abbDict path] \
        [-params paramsFile] [-iterations num] [-cutoff num] -model modelFile \
@@ -142,17 +142,20 @@ Arguments description:
-data sampleData
data to be used, usually a file name.
-encoding charsetName
- encoding for reading and writing text, if absent the system default is used.]]>
- </screen>
- To train an English sentence detector use the following command:
- <screen>
- <![CDATA[
+ encoding for reading and writing text, if absent the system default is used.
+ -useTokenEnd boolean flag
+ set to false when the next sentence in the test dataset doesn't start with a blank space post completion of
+ the previous sentence. If absent, it is defaulted to true.]]>
+ </screen>
+ To train an English sentence detector use the following command:
+ <screen>
+ <![CDATA[
$ opennlp SentenceDetectorTrainer -model en-custom-sent.bin -lang en -data en-custom-sent.train -encoding UTF-8
]]>
- </screen>
- It should produce the following output:
- <screen>
- <![CDATA[
+ </screen>
+ It should produce the following output:
+ <screen>
+ <![CDATA[
Indexing events using cutoff of 5
Computing event counts... done. 4883 events
@@ -184,28 +187,28 @@ Performing 100 iterations.
Wrote sentence detector model.
Path: en-custom-sent.bin
]]>
- </screen>
- </para>
+ </screen>
+ </para>
</section>
<section id="tools.sentdetect.training.api">
- <title>Training API</title>
- <para>
- The Sentence Detector also offers an API to train a new sentence detection model.
- Basically three steps are necessary to train it:
- <itemizedlist>
- <listitem>
- <para>The application must open a sample data stream</para>
- </listitem>
- <listitem>
- <para>Call the SentenceDetectorME.train method</para>
- </listitem>
- <listitem>
- <para>Save the SentenceModel to a file or directly use it</para>
- </listitem>
- </itemizedlist>
- The following sample code illustrates these steps:
- <programlisting language="java">
- <![CDATA[
+ <title>Training API</title>
+ <para>
+ The Sentence Detector also offers an API to train a new sentence detection model.
+ Basically three steps are necessary to train it:
+ <itemizedlist>
+ <listitem>
+ <para>The application must open a sample data stream</para>
+ </listitem>
+ <listitem>
+ <para>Call the SentenceDetectorME.train method</para>
+ </listitem>
+ <listitem>
+ <para>Save the SentenceModel to a file or directly use it</para>
+ </listitem>
+ </itemizedlist>
+ The following sample code illustrates these steps:
+ <programlisting language="java">
+ <![CDATA[
ObjectStream<String> lineStream =
  new PlainTextByLineStream(new MarkableFileInputStreamFactory(new File("en-custom-sent.train")), StandardCharsets.UTF_8);
@@ -220,8 +223,8 @@ try (ObjectStream<SentenceSample> sampleStream = new SentenceSampleStream(lineSt
try (OutputStream modelOut = new BufferedOutputStream(new FileOutputStream(modelFile))) {
model.serialize(modelOut);
}]]>
- </programlisting>
- </para>
+ </programlisting>
+ </para>
</section>
</section>
<section id="tools.sentdetect.eval">
@@ -231,9 +234,9 @@ try (OutputStream modelOut = new BufferedOutputStream(new FileOutputStream(model
<section id="tools.sentdetect.eval.tool">
<title>Evaluation Tool</title>
<para>
- The command shows how the evaluator tool can be run:
- <screen>
- <![CDATA[
+ The command shows how the evaluator tool can be run:
+ <screen>
+ <![CDATA[
$ opennlp SentenceDetectorEvaluator -model en-custom-sent.bin -data en-custom-sent.eval -encoding UTF-8
Loading model ... done
@@ -242,8 +245,8 @@ Evaluating ... done
Precision: 0.9465737514518002
Recall: 0.9095982142857143
F-Measure: 0.9277177006260672]]>
- </screen>
- The en-custom-sent.eval file has the same format as the training data.
+ </screen>
+ The en-custom-sent.eval file has the same format as the training data.
</para>
</section>
</section>
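Note: the updated manual text above documents the new -useTokenEnd training parameter, but the example training command does not exercise it. A minimal sketch of a training run that disables it, reusing the file names from the manual (this exact invocation is not part of the commit):

$ opennlp SentenceDetectorTrainer -model en-custom-sent.bin -lang en \
    -data en-custom-sent.train -encoding UTF-8 -useTokenEnd false

Omitting -useTokenEnd keeps the previous behaviour, since the parameter defaults to true (see the @OptionalParameter annotation added to TrainingParams.java below).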
diff --git a/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/SentenceDetectorTrainerTool.java b/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/SentenceDetectorTrainerTool.java
index 933895bf..85f9c656 100644
--- a/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/SentenceDetectorTrainerTool.java
+++ b/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/SentenceDetectorTrainerTool.java
@@ -38,7 +38,7 @@ import opennlp.tools.sentdetect.SentenceSampleStream;
import opennlp.tools.util.model.ModelUtil;
public final class SentenceDetectorTrainerTool
- extends AbstractTrainerTool<SentenceSample, TrainerToolParams> {
+ extends AbstractTrainerTool<SentenceSample, TrainerToolParams> {
interface TrainerToolParams extends TrainingParams, TrainingToolParams {
}
@@ -83,7 +83,7 @@ public final class SentenceDetectorTrainerTool
char[] eos = null;
if (params.getEosChars() != null) {
String eosString = SentenceSampleStream.replaceNewLineEscapeTags(
- params.getEosChars());
+ params.getEosChars());
eos = eosString.toCharArray();
}
@@ -92,9 +92,9 @@ public final class SentenceDetectorTrainerTool
try {
Dictionary dict = loadDict(params.getAbbDict());
SentenceDetectorFactory sdFactory = SentenceDetectorFactory.create(
- params.getFactory(), params.getLang(), true, dict, eos);
+ params.getFactory(), params.getLang(), params.getUseTokenEnd(), dict, eos);
model = SentenceDetectorME.train(params.getLang(), sampleStream,
- sdFactory, mlParams);
+ sdFactory, mlParams);
} catch (IOException e) {
throw createTerminationIOException(e);
}
diff --git a/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/TrainingParams.java b/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/TrainingParams.java
index 476f929a..37cb7115 100644
--- a/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/TrainingParams.java
+++ b/opennlp-tools/src/main/java/opennlp/tools/cmdline/sentdetect/TrainingParams.java
@@ -44,4 +44,9 @@ interface TrainingParams extends BasicTrainingParams {
description = "A sub-class of SentenceDetectorFactory where to get
implementation and resources.")
@OptionalParameter
String getFactory();
+
+ @ParameterDescription(valueName = "useTokenEnd",
+ description = "A boolean parameter to detect the start index of the next
sentence in the test data.")
+ @OptionalParameter(defaultValue = "true")
+ Boolean getUseTokenEnd();
}
diff --git a/opennlp-tools/src/test/java/opennlp/tools/sentdetect/SentenceDetectorMEGermanTest.java b/opennlp-tools/src/test/java/opennlp/tools/sentdetect/SentenceDetectorMEGermanTest.java
index a520ed27..97d15076 100644
--- a/opennlp-tools/src/test/java/opennlp/tools/sentdetect/SentenceDetectorMEGermanTest.java
+++ b/opennlp-tools/src/test/java/opennlp/tools/sentdetect/SentenceDetectorMEGermanTest.java
@@ -20,12 +20,16 @@ package opennlp.tools.sentdetect;
import java.io.IOException;
import java.util.Locale;
-import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import opennlp.tools.dictionary.Dictionary;
+import static org.junit.jupiter.api.Assertions.assertAll;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.fail;
+
/**
* Tests for the {@link SentenceDetectorME} class.
* <p>
@@ -42,22 +46,32 @@ import opennlp.tools.dictionary.Dictionary;
public class SentenceDetectorMEGermanTest extends AbstractSentenceDetectorTest {
private static final char[] EOS_CHARS = {'.', '?', '!'};
-
- private static SentenceModel sentdetectModel;
+ private static Dictionary abbreviationDict;
+ private SentenceModel sentdetectModel;
@BeforeAll
- public static void prepareResources() throws IOException {
- Dictionary abbreviationDict = loadAbbDictionary(Locale.GERMAN);
- SentenceDetectorFactory factory = new SentenceDetectorFactory(
- "deu", true, abbreviationDict, EOS_CHARS);
- sentdetectModel = train(factory, Locale.GERMAN);
- Assertions.assertNotNull(sentdetectModel);
- Assertions.assertEquals("deu", sentdetectModel.getLanguage());
+ static void loadResources() throws IOException {
+ abbreviationDict = loadAbbDictionary(Locale.GERMAN);
+ }
+
+ private void prepareResources(boolean useTokenEnd) {
+ try {
+ SentenceDetectorFactory factory = new SentenceDetectorFactory(
+ "deu", useTokenEnd, abbreviationDict, EOS_CHARS);
+ sentdetectModel = train(factory, Locale.GERMAN);
+
+ assertAll(() -> assertNotNull(sentdetectModel),
+ () -> assertEquals("deu", sentdetectModel.getLanguage()));
+ } catch (IOException ex) {
+ fail("Couldn't train the SentenceModel using test data. Exception: " +
ex.getMessage());
+ }
}
// Example taken from 'Sentences_DE.txt'
@Test
void testSentDetectWithInlineAbbreviationsEx1() {
+ prepareResources(true);
+
final String sent1 = "Ein Traum, zu dessen Bildung eine besonders starke
Verdichtung beigetragen, " +
"wird für diese Untersuchung das günstigste Material sein.";
// Here we have two abbreviations "S. = Seite" and "ff. = folgende (Plural)"
@@ -66,40 +80,65 @@ public class SentenceDetectorMEGermanTest extends AbstractSentenceDetectorTest {
SentenceDetectorME sentDetect = new SentenceDetectorME(sentdetectModel);
String sampleSentences = sent1 + " " + sent2;
String[] sents = sentDetect.sentDetect(sampleSentences);
- Assertions.assertEquals(2, sents.length);
- Assertions.assertEquals(sent1, sents[0]);
- Assertions.assertEquals(sent2, sents[1]);
double[] probs = sentDetect.getSentenceProbabilities();
- Assertions.assertEquals(2, probs.length);
+
+ assertAll(() -> assertEquals(2, sents.length),
+ () -> assertEquals(sent1, sents[0]),
+ () -> assertEquals(sent2, sents[1]),
+ () -> assertEquals(2, probs.length));
}
// Reduced example taken from 'Sentences_DE.txt'
@Test
void testSentDetectWithInlineAbbreviationsEx2() {
+ prepareResources(true);
+
// Here we have three abbreviations: "S. = Seite", "vgl. = vergleiche", and "f. = folgende (Singular)"
final String sent1 = "Die farbige Tafel, die ich aufschlage, " +
"geht (vgl. die Analyse S. 185 f.) auf ein neues Thema ein.";
SentenceDetectorME sentDetect = new SentenceDetectorME(sentdetectModel);
String[] sents = sentDetect.sentDetect(sent1);
- Assertions.assertEquals(1, sents.length);
- Assertions.assertEquals(sent1, sents[0]);
double[] probs = sentDetect.getSentenceProbabilities();
- Assertions.assertEquals(1, probs.length);
+
+ assertAll(() -> assertEquals(1, sents.length),
+ () -> assertEquals(sent1, sents[0]),
+ () -> assertEquals(1, probs.length));
}
// Modified example deduced from 'Sentences_DE.txt'
@Test
void testSentDetectWithInlineAbbreviationsEx3() {
+ prepareResources(true);
+
// Here we have two abbreviations "z. B. = zum Beispiel" and "S. = Seite"
final String sent1 = "Die farbige Tafel, die ich aufschlage, " +
"geht (z. B. die Analyse S. 185) auf ein neues Thema ein.";
SentenceDetectorME sentDetect = new SentenceDetectorME(sentdetectModel);
String[] sents = sentDetect.sentDetect(sent1);
- Assertions.assertEquals(1, sents.length);
- Assertions.assertEquals(sent1, sents[0]);
double[] probs = sentDetect.getSentenceProbabilities();
- Assertions.assertEquals(1, probs.length);
+
+ assertAll(() -> assertEquals(1, sents.length),
+ () -> assertEquals(sent1, sents[0]),
+ () -> assertEquals(1, probs.length));
+ }
+
+ @Test
+ void testSentDetectWithUseTokenEndFalse() {
+ prepareResources(false);
+
+ final String sent1 = "Träume sind eine Verbindung von Gedanken.";
+ final String sent2 = "Verschiedene Gedanken sind während der Traumformation aktiv.";
+
+ SentenceDetectorME sentDetect = new SentenceDetectorME(sentdetectModel);
+ //There is no blank space before start of the second sentence.
+ String[] sents = sentDetect.sentDetect(sent1 + sent2);
+ double[] probs = sentDetect.getSentenceProbabilities();
+
+ assertAll(() -> assertEquals(2, sents.length),
+ () -> assertEquals(sent1, sents[0]),
+ () -> assertEquals(sent2, sents[1]),
+ () -> assertEquals(2, probs.length));
}
}
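
Note: the new test above relies on helpers from AbstractSentenceDetectorTest. The following self-contained sketch shows the same useTokenEnd = false behaviour outside the test harness; the file names de-custom-sent.train and abb_DE.xml are assumptions for illustration and are not part of this commit.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.sentdetect.SentenceDetectorFactory;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.sentdetect.SentenceSample;
import opennlp.tools.sentdetect.SentenceSampleStream;
import opennlp.tools.util.MarkableFileInputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public class UseTokenEndFalseExample {

  public static void main(String[] args) throws IOException {
    // Abbreviation dictionary in OpenNLP's XML dictionary format (path is an assumption).
    Dictionary abbDict;
    try (InputStream dictIn = new FileInputStream("abb_DE.xml")) {
      abbDict = new Dictionary(dictIn);
    }

    // useTokenEnd = false: the sentence end is placed directly after the EOS character,
    // so the following sentence does not have to start with a blank space.
    SentenceDetectorFactory factory = new SentenceDetectorFactory(
        "deu", false, abbDict, new char[] {'.', '?', '!'});

    // One sentence per line, empty line = document boundary (training file is an assumption).
    ObjectStream<String> lines = new PlainTextByLineStream(
        new MarkableFileInputStreamFactory(new File("de-custom-sent.train")),
        StandardCharsets.UTF_8);

    SentenceModel model;
    try (ObjectStream<SentenceSample> samples = new SentenceSampleStream(lines)) {
      model = SentenceDetectorME.train("deu", samples, factory,
          TrainingParameters.defaultParams());
    }

    SentenceDetectorME detector = new SentenceDetectorME(model);
    // No blank between the two sentences, mirroring testSentDetectWithUseTokenEndFalse.
    String[] sentences = detector.sentDetect(
        "Träume sind eine Verbindung von Gedanken."
            + "Verschiedene Gedanken sind während der Traumformation aktiv.");
    for (String s : sentences) {
      System.out.println(s);
    }
  }
}

With useTokenEnd disabled, the detected sentence end sits right after the end-of-sentence character, which is why the second sentence is still recognised even though it follows the first one without a blank space.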