[ https://issues.apache.org/jira/browse/OPENNLP-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805655#comment-17805655 ]

ASF GitHub Bot commented on OPENNLP-1543:
-----------------------------------------

kinow commented on code in PR #585:
URL: https://github.com/apache/opennlp/pull/585#discussion_r1449025396


##########
opennlp-tools/src/test/resources/opennlp/tools/sentdetect/origin-training-data.txt:
##########
@@ -23,6 +23,12 @@ available here:
 https://issues.apache.org/jira/browse/OPENNLP-1163
 ################
 ################
+Sentences_PL.txt:
+- Zygmunt Freuda - Interpretacja marzeń sennych
+- Rozdział VI: Praca nad snem | sekcja A "Praca nad kondensacją"
+automatically translated from the German version, see Sentences_DE.txt

Review Comment:
   Woah, great job! :clap: 





> Add Polish abbreviation dictionary
> ----------------------------------
>
>                 Key: OPENNLP-1543
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1543
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Sentence Detector, Tokenizer
>    Affects Versions: 2.3.1
>            Reporter: Martin Wiesner
>            Assignee: Martin Wiesner
>            Priority: Minor
>             Fix For: 2.3.2
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Similar to the additions in OPENNLP-570 and OPENNLP-1531, an abbreviation
> dictionary for Polish sentence detection and tokenization might be beneficial.
> Aims:
> - Create and add a new file abb_PL.xml to opennlp-tools/lang/pl
> - Add a basic set of test cases



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
