Hello,

I seem to be doing something wrong, but the documentation and the forum
haven't been helpful in my case.
My goal is to add abbreviations to SentenceDetector, but I can't get it to work.
I'm trying to use this constructor overload:

public SentenceModel(String languageCode,
                     opennlp.model.AbstractModel sentModel,
                     boolean useTokenEnd,
                     Dictionary abbreviations)

(Dictionary here is opennlp.tools.dictionary.Dictionary:
<http://opennlp.apache.org/documentation/1.5.3/apidocs/opennlp-tools/opennlp/tools/dictionary/Dictionary.html>)

and a standard model from the OpenNLP repository.

Here is a code example (it's a C# port via IKVM, so don't be confused):

var abbreviations = new Dictionary();
abbreviations.put(new StringList("corp."));

var modelPath = @"....\sent.model"; // path to the file extracted from "en-sent.bin"
var dataStream = new DataInputStream(new FileInputStream(modelPath));
var sentenceModel = new BinaryGISModelReader(dataStream).getModel();
var abbreviatedSentenceModel = new SentenceModel("en", sentenceModel, true, abbreviations);
.............................

var sentenceSplitter = new SentenceDetectorME(abbreviatedSentenceModel);
sentenceSplitter.sentDetect(text);

The result is the same as if there were no abbreviations dictionary at all.
So I suppose that either there is another way to do this, or it's a bug.
Could you help, please?

Thanks in advance,
Siarhei.
