[ https://issues.apache.org/jira/browse/OPENNLP-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Wiesner closed OPENNLP-571.
----------------------------------
    Resolution: Abandoned

> Tokenizer Model for german text
> -------------------------------
>
>                 Key: OPENNLP-571
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-571
>             Project: OpenNLP
>          Issue Type: Wish
>          Components: Tokenizer
>    Affects Versions: 1.6.0
>            Reporter: Andreas Niekler
>            Priority: Minor
>
> I created a tokenizer model with proper detokenization rules for different
> sorts of quotes. The model is based on 300,000 example sentences from the
> German part of the Leipzig Corpora Collection. I don't know whether there
> might be a copyright issue, because those sentences were crawled from the
> web. If the content of the model is not in a form that would enable one to
> reconstruct the sentences, everything is fine. Please comment on these
> thoughts. If everything is OK, I will contribute the model for further
> testing by the OpenNLP team.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)