PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com

Stan,
You can count the words in a PDF document with a script (method getPageNumWords).
If the result is 0 words, the PDF document has not been OCRed (or the OCR returned no words because the whole document is an image).
Otherwise, either the document has been OCRed, or it was a "normal" PDF that already contained text.


You can use the example in the "Acrobat JavaScript Object Specification", page 122, as follows:

Example:
// count the number of words in a document
var cnt = 0;
for (var p = 0; p < this.numPages; p++) {
    // getPageNumWords returns the number of words on page p (0-based)
    cnt += this.getPageNumWords(p);
}
console.println("There are " + cnt + " words in this document.");

Depending on your context, the script can be placed at folder level, document level, or field level, or in a batch sequence.
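
If only some pages are scanned images, a per-page check may be more useful than a single total. Here is a minimal sketch of that idea; the helper name pagesWithoutText is my own invention (it is not part of the Acrobat API), and it assumes only the numPages property and the getPageNumWords method used above. Keep in mind that a non-zero count still cannot tell an OCRed page from a "normal" text page.

```javascript
// Hypothetical helper (not part of the Acrobat API): returns the 0-based
// numbers of the pages on which getPageNumWords reports no words.
function pagesWithoutText(doc) {
    var empty = [];
    for (var p = 0; p < doc.numPages; p++) {
        if (doc.getPageNumWords(p) === 0) {
            empty.push(p);
        }
    }
    return empty;
}

// Inside Acrobat, pass the current document, for example:
//   var empty = pagesWithoutText(this);
//   console.println(empty.length + " of " + this.numPages + " pages have no text.");
```

An empty list means every page has at least one word; a list containing every page means the whole document is probably still an un-OCRed image.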

Regards
Michel Lausseur
Aalto Conseil

Stan Guzik wrote:

Does anyone know of a tool that will check whether a PDF is OCR’d? I don’t need it to OCR the PDF, only to check whether it has been.

Thanks,
Stan


