Hello,

It seems I'm doing something wrong, but the documentation and the forum haven't been helpful in my case. My goal is to add abbreviations to the SentenceDetector, but I haven't succeeded.

I'm trying to use this constructor overload:

    public SentenceModel(String languageCode, opennlp.model.AbstractModel sentModel, boolean useTokenEnd, Dictionary abbreviations)

(Dictionary: <http://opennlp.apache.org/documentation/1.5.3/apidocs/opennlp-tools/opennlp/tools/dictionary/Dictionary.html>)

together with a trivial model from the OpenNLP repository. Here is a code example (it's a C# port via IKVM, so don't be confused by the syntax):

    var abbreviations = new Dictionary();
    abbreviations.put(new StringList("corp."));
    var modelPath = @"....\sent.model"; // path to the file extracted from "en-sent.bin"
    var dataStream = new DataInputStream(new FileInputStream(modelPath));
    var sentenceModel = new BinaryGISModelReader(dataStream).getModel();
    var abbreviatedSentenceModel = new SentenceModel("en", sentenceModel, true, abbreviations);
    .............................
    var sentenceSplitter = new SentenceDetectorME(abbreviatedSentenceModel);
    sentenceSplitter.sentDetect(text);

The result of executing this is the same as if there were no abbreviations dictionary at all. So I suppose that either there is another way to do this, or it's a bug.

Could you help, please?

Thanks in advance,
Siarhei.
