PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com _____________________________________________________________
Stan,
You can count the words in a PDF document with a script, using the getPageNumWords method.
If the result is 0 words, the PDF document has not been OCRed; this is the result you get when the whole document is a scanned image.
Otherwise, either the document has been OCRed or it is a "normal" PDF that already contains text.
You can use the example on page 122 of the "Acrobat JavaScript Object Specification", as follows:
Example:
// count the number of words in a document
var cnt = 0;
for (var p = 0; p < this.numPages; p++)
    cnt += this.getPageNumWords(p);
console.println("There are " + cnt + " words in this document.");
Depending on your context, the script can be placed at folder level, document level, or field level, or in a batch sequence.
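If you want the check itself rather than just the count, the same loop can be wrapped in small functions. This is only a sketch: the function names (countDocWords, hasTextLayer, pagesWithoutText) are mine, not part of the Acrobat API, and "doc" is assumed to be an Acrobat Doc object such as "this" in a document-level script. The only Acrobat members used are numPages and getPageNumWords. Keep in mind that a word count above 0 cannot distinguish an OCRed scan from a "normal" PDF.

```javascript
// Hypothetical helper: total number of words in the document.
// "doc" is assumed to expose numPages and getPageNumWords(pageIndex).
function countDocWords(doc) {
    var total = 0;
    for (var p = 0; p < doc.numPages; p++) {
        total += doc.getPageNumWords(p);
    }
    return total;
}

// Hypothetical helper: true if at least one word is found,
// i.e. the document was OCRed or is a "normal" PDF.
function hasTextLayer(doc) {
    return countDocWords(doc) > 0;
}

// Hypothetical helper: zero-based indexes of pages with 0 words,
// useful when only some pages of a document are scanned images.
function pagesWithoutText(doc) {
    var pages = [];
    for (var p = 0; p < doc.numPages; p++) {
        if (doc.getPageNumWords(p) === 0) {
            pages.push(p);
        }
    }
    return pages;
}

// Possible use inside Acrobat (assumes "this" is the current document):
// console.println(hasTextLayer(this) ? "Text found." : "No text: probably a scan.");
```

The per-page variant is there because a document can mix scanned pages and text pages, in which case the total alone would report it as containing text.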
Regards,
Michel Lausseur
Aalto Conseil
Stan Guzik wrote:
Does anyone know of a tool that will check whether a PDF is OCR’d? I don’t need it to OCR the PDF, only to check whether it is.
Thanks, Stan
