Hello,

I am working on an NLP project and using OpenNLP for preprocessing. When we use the OpenNLP sentence detector to split text into sentences, we would like a confidence score for each detected sentence, namely the probability that an identified sentence is a real sentence. In the API I found a method called getSentenceProbabilities() in the SentenceDetectorME class, which seems to provide the probability I need. However, I ran into a problem when using this method, which I will illustrate with an example.

My input text for the sentence detector is: "2.1.4 About JAVA package"

The sentence detector returns two sentences: "2.1.4" and "About JAVA package" (which I can accept :). But getSentenceProbabilities() returns only one probability (0.9924560093213692), and I don't know which "sentence" this probability is for. The method is supposed to return an array of probabilities, but in this case only one probability is returned.

Can anyone explain this, or tell me how to get the probabilities of the identified sentences? I'd really appreciate any help. Thank you so much!

Best,
Miao

--
Miao Chen
Ph.D. Candidate
School of Information Studies
Syracuse University
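
The mismatch described above can be shown in isolation with a minimal sketch like the one below. It is not OpenNLP code: the two arrays are hardcoded from the example in this message, whereas in a real run they would come from sentDetect() and getSentenceProbabilities() on a SentenceDetectorME instance. The class name SentenceProbabilityCheck, the helpers pair and countMissing, and the assumption that probability i belongs to sentence i are all hypothetical and used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class SentenceProbabilityCheck {

    // Number of sentences that have no probability at the same index.
    static int countMissing(String[] sentences, double[] probs) {
        return Math.max(0, sentences.length - probs.length);
    }

    // Pairs sentence i with probability i.
    // ASSUMPTION: index alignment between the two arrays is not confirmed
    // by anything in this message; it is only the most obvious guess.
    static List<String> pair(String[] sentences, double[] probs) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < sentences.length; i++) {
            String score = i < probs.length
                    ? String.valueOf(probs[i])
                    : "(no probability returned)";
            rows.add(sentences[i] + " -> " + score);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Values copied from the example in this message. With OpenNLP they
        // would come from detector.sentDetect(text) and
        // detector.getSentenceProbabilities() respectively.
        String[] sentences = {"2.1.4", "About JAVA package"};
        double[] probs = {0.9924560093213692};

        for (String row : pair(sentences, probs)) {
            System.out.println(row);
        }
        System.out.println("sentences without a probability: "
                + countMissing(sentences, probs));
    }
}
```

Running it prints one line per sentence and reports that one sentence has no value at its index, which is the ambiguity in question: two sentences, but only one probability to assign.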
