Hi Chris,
Precision (P) and recall (R) are well-defined evaluation metrics that
apply to many statistical evaluations, including sentence-detection,
but there is nothing special about sentence-detection. If you understand
what P & R mean in an NER or a POS-tagging context, then it is the same
thing for sentence-detection.
For example, say you have a predictive model M. You train it on some data
X and you test it on some data Y.
-P is concerned with 'what proportion of the retrieved data are true
positives' (i.e. were correctly classified as relevant). In
sentence-detection, that would translate to 'what proportion of the
recognised sentences are actually correct?'
-R is concerned with 'what proportion of all the relevant data has been
retrieved'. In sentence-detection this translates to 'out of all the
correct sentences, how many did the model retrieve?'
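To make the two definitions concrete, here is a minimal Python sketch. It
is not OpenNLP's code, just an illustration. It assumes a predicted
sentence counts as correct only when both its start and end offsets match
a reference sentence exactly; the function name precision_recall and the
character offsets are made up for the example.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two collections of (start, end) spans.

    Assumption: a predicted span is a true positive only if an identical
    span appears in the reference (gold) annotation.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: share of the predicted sentences that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of the reference sentences that were found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical character offsets: the gold data has four sentences, and the
# model wrongly merges the second and third into a single sentence.
reference = [(0, 12), (13, 30), (31, 50), (51, 70)]
predicted = [(0, 12), (13, 50), (51, 70)]

p, r = precision_recall(predicted, reference)
print(f"P = {p:.2f}, R = {r:.2f}")  # P = 0.67, R = 0.50
```

Two of the three predicted sentences are correct (P = 2/3), and two of the
four reference sentences were found (R = 2/4), so a single missed boundary
hurts both numbers.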
I've always found the picture in [1] quite helpful.
[1] https://en.wikipedia.org/wiki/Precision_and_recall
HTH,
Jim
On 05/08/13 12:29, Christopher Kotfila wrote:
Good morning!
I'm trying to get a better sense of how precision and recall are calculated
for the sentence detection module. The manual online does not seem to have
a thorough discussion of the topic, and while I've begun looking through the
source, I am not an experienced Java programmer and so am having some
difficulty divining the theory behind the numbers. Citations welcome!
Thanks!
Chris