[ https://issues.apache.org/jira/browse/OPENNLP-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804580#comment-17804580 ]
Martin Wiesner commented on OPENNLP-1543:
-----------------------------------------
Helpful resources:
* https://www.atozserwisplus.com/blog/Do-you-know-these-Polish-acronyms-and-abbreviations
* https://polishforums.com/language/abbreviations-days-18037/
* https://web.library.yale.edu/cataloging/months
> Add Polish abbreviation dictionary
> ----------------------------------
>
> Key: OPENNLP-1543
> URL: https://issues.apache.org/jira/browse/OPENNLP-1543
> Project: OpenNLP
> Issue Type: Improvement
> Components: Sentence Detector, Tokenizer
> Affects Versions: 2.3.1
> Reporter: Martin Wiesner
> Assignee: Martin Wiesner
> Priority: Minor
> Fix For: 2.3.2
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Similar to the addition in OPENNLP-570 and OPENNLP-1531, an abbreviation
> dictionary for Polish sentence detection and tokenization might be beneficial.
> Aims:
> - Create and add a new file abb_PL.xml to opennlp-tools/lang/pl
> - Add a basic set of test cases
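
For illustration only, the kind of file named in the aims above could be sketched as follows. This is a minimal sketch, assuming abb_PL.xml follows a layout of a `dictionary` root with one `entry`/`token` pair per abbreviation; that layout, the helper names `build_dictionary_xml` and `read_tokens`, and the five sample abbreviations are assumptions made for this example and are not taken from the issue or from the committed file.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; NOT the contents of the real abb_PL.xml.
SAMPLE_ABBREVIATIONS = ["np.", "tzn.", "godz.", "ul.", "prof."]


def build_dictionary_xml(abbreviations, case_sensitive=False):
    """Serialize abbreviations into the assumed dictionary/entry/token layout."""
    root = ET.Element(
        "dictionary", {"case_sensitive": str(case_sensitive).lower()}
    )
    # Deduplicate and sort so the generated file is stable across runs.
    for abbreviation in sorted(set(abbreviations)):
        entry = ET.SubElement(root, "entry")
        token = ET.SubElement(entry, "token")
        token.text = abbreviation
    return ET.tostring(root, encoding="unicode")


def read_tokens(xml_text):
    """Parse the XML back and return the token strings in document order."""
    root = ET.fromstring(xml_text)
    return [token.text for token in root.iter("token")]


xml_text = build_dictionary_xml(SAMPLE_ABBREVIATIONS)
print(xml_text)
```

A round trip through `read_tokens` is one simple way to phrase a basic test case: every abbreviation written to the file should be read back unchanged.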
--
This message was sent by Atlassian Jira
(v8.20.10#820010)