Hello,

This looks like a bug that only appears when the last sentence is not terminated with an end-of-sentence character. Can you confirm that? Please check whether the result is correct if you append a period to your last sentence.
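A rough sketch of how that check could be done, in case it helps. The class and helper names (SentenceProbCheck, ensureTerminated, pair) are made up for illustration, and the set of end-of-sentence characters is an assumption, since the real set depends on the model. Only sentDetect() and getSentenceProbabilities() are the actual SentenceDetectorME methods, and they are shown in comments because they need OpenNLP and a trained sentence model on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

class SentenceProbCheck {

    // Assumed end-of-sentence characters; the trained model may use a different set.
    private static final String EOS_CHARS = ".!?";

    /** Appends a period if the trimmed text does not already end with an EOS character. */
    public static String ensureTerminated(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        char last = trimmed.charAt(trimmed.length() - 1);
        return EOS_CHARS.indexOf(last) >= 0 ? trimmed : trimmed + ".";
    }

    /** Pairs each sentence with its probability, or "n/a" if no probability was returned for it. */
    public static List<String> pair(String[] sentences, double[] probs) {
        List<String> rows = new ArrayList<String>();
        for (int i = 0; i < sentences.length; i++) {
            String p = i < probs.length ? String.valueOf(probs[i]) : "n/a";
            rows.add(sentences[i] + " -> " + p);
        }
        return rows;
    }

    public static void main(String[] args) {
        String input = "2.1.4 About JAVA package";
        // Show the text that would be passed to the detector for the second run.
        System.out.println(ensureTerminated(input));

        // With OpenNLP available and a SentenceModel loaded into 'model':
        //   SentenceDetectorME detector = new SentenceDetectorME(model);
        //   String[] sentences = detector.sentDetect(ensureTerminated(input));
        //   double[] probs = detector.getSentenceProbabilities();
        //   for (String row : pair(sentences, probs)) System.out.println(row);
    }
}
```

Running the commented part once with the original text and once with the terminated text should show whether the number of probabilities matches the number of sentences in each case. Any row printed with "n/a" is a sentence for which no probability came back.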
It would be nice if you could open a bug report.

Jörn

On Wed, Dec 8, 2010 at 5:34 AM, Miao Chen <[email protected]> wrote:
> Hello,
>
> I am working on an NLP project and using OpenNLP for preprocessing. When
> using the OpenNLP sentence detector to get sentences out of text, we'd like
> to get a confidence score for the detected sentences, namely the
> probability of an identified sentence being a real sentence. I found in the
> API a method called getSentenceProbabilities() in the SentenceDetectorME
> class, which seems to provide the sentence probability I need. However, I
> ran into a problem when using this method, which I will illustrate with an
> example. My input text for the sentence detector is:
> "2.1.4 About JAVA package"
>
> The sentence detector returns two sentences: "2.1.4" and "About JAVA
> package" (which I can accept :)
> But getSentenceProbabilities() returns only one probability (which is
> 0.9924560093213692), and I don't know which "sentence" this probability is
> for. This method is supposed to return an array of probabilities, but in
> this case only one probability is returned.
>
> Can anyone tell me how to explain this, or how to get the probabilities of
> the identified sentences? I'd really appreciate any help. Thank you so
> much!
>
> Best,
> Miao
>
> --
> Miao Chen
>
> Ph.D. Candidate
> School of Information Studies
> Syracuse University
