How to automatically evaluate the quality of the text extraction result by PDFBox?

Qingchao Kong Sat, 10 May 2014 22:35:26 -0700

Hi, I am using PDFBox to extract text from PDF files.
For some PDF files, PDFBox produces errors in the extracted text.
My question is: is there a way to automatically evaluate the quality
of the text extraction result? Or can PDFBox offer a confidence score
for the extracted text?
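
For illustration, a crude quality heuristic of the kind asked about could be
sketched as below. This is not a PDFBox feature. The class name
ExtractionQuality and the method plausibleCharRatio are hypothetical, and the
sketch assumes that a bad extraction shows up as replacement characters,
control characters, or private-use code points in the output string. It says
nothing about wrong word order or missing text.

```java
public final class ExtractionQuality {
    private ExtractionQuality() {}

    /**
     * Returns a value in [0, 1]: the share of code points that look like
     * ordinary text. Hypothetical heuristic, not a PDFBox API.
     */
    public static double plausibleCharRatio(String text) {
        if (text == null || text.isEmpty()) {
            return 0.0; // assumption: an empty extraction is the worst case
        }
        int total = 0;
        int plausible = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            total++;
            if (isPlausible(cp)) {
                plausible++;
            }
        }
        return (double) plausible / total;
    }

    private static boolean isPlausible(int cp) {
        if (cp == 0xFFFD) {
            return false; // Unicode replacement character
        }
        if (Character.isWhitespace(cp)) {
            return true; // spaces, tabs, and line breaks are expected
        }
        if (Character.isISOControl(cp)) {
            return false; // other control characters are suspicious
        }
        int type = Character.getType(cp);
        // Private-use, unassigned, and lone surrogate code points often
        // indicate a font without a usable Unicode mapping.
        return type != Character.PRIVATE_USE
                && type != Character.UNASSIGNED
                && type != Character.SURROGATE;
    }
}
```

The input would be the string returned by PDFTextStripper.getText(document).
A score well below 1.0 could flag a file for manual review; any threshold
would have to be tuned on your own documents.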


Regards,
