[ 
https://issues.apache.org/jira/browse/OPENNLP-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Zowalla reassigned OPENNLP-1810:
----------------------------------------

    Assignee: Richard Zowalla

> SentenceDetector fails to detect multiple identical abbreviations in the same sentence
> --------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-1810
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1810
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Sentence Detector
>    Affects Versions: 2.5.7, 3.0.0-M1
>            Reporter: Richard Zowalla
>            Assignee: Richard Zowalla
>            Priority: Major
>             Fix For: 2.5.8, 3.0.0-M2
>
>
> This test case shows the problem:
> {code:java}
> // Edge case: The same abbreviation appears twice in a single sentence segment.
> @Test
> void testSentDetectWithDuplicateAbbreviationInSameSegment() {
>   prepareResources(true);
>   final String sent1 = "Lt. Vertrag und lt. Bescheid gelten andere Bedingungen.";
>   String[] sents = sentenceDetector.sentDetect(sent1);
>   double[] probs = sentenceDetector.probs();
>   assertAll(
>       () -> assertEquals(1, sents.length),
>       () -> assertEquals(sent1, sents[0]),
>       () -> assertEquals(1, probs.length));
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
