[iText-questions] how can i simply check if a PDF contains Searchable Text?

Aaron J Weber Tue, 24 Jan 2006 07:01:05 -0800

I need to work around a known Acrobat Capture issue: it fails to OCR a PDF (even after it's converted to TIFF?) if the PDF already contains text (i.e., it is not an image-only PDF).

Since I'm batch-processing, I don't have the luxury of opening each document and trying to select some text from Acrobat to test. I need an automated method of pre-validating the PDF so it doesn't cause Capture to hang.

Thus my question: Does anyone have a snippet that "peeks" at a PDF's contents and checks to see if it's PDF+Text (of some kind)? Since I don't have a standard "search string" to check for, I guess I'd be content just checking if ANY text is readable/exists in the PDF or if the PDF is "Image Only".
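To make the kind of "peek" described above concrete, here is a minimal, dependency-free Java sketch. It is not iText code and not a full PDF parser: it is a heuristic that walks the raw file, inflates each Flate-compressed stream it can, and looks for a text object (BT) together with a text-showing operator (Tj or TJ). The class name PdfTextProbe and its method names are hypothetical. It assumes an unencrypted PDF whose content streams are either uncompressed or Flate-compressed, and it skips streams whose dictionary declares /Subtype /Image so that pixel data is not mistaken for operators.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper: heuristic check for text-showing operators in a PDF.
final class PdfTextProbe {
    // "BT" opens a text object; "(...) Tj", "<...> Tj" and "[...] TJ" show text.
    private static final Pattern BEGIN_TEXT = Pattern.compile("\\bBT\\b");
    private static final Pattern SHOW_TEXT = Pattern.compile("[)>\\]]\\s*T[jJ]\\b");

    static boolean containsTextOperators(byte[] pdf) {
        // ISO-8859-1 maps every byte to one char, so offsets and bytes round-trip.
        String s = new String(pdf, StandardCharsets.ISO_8859_1);
        int from = 0;
        while (true) {
            int pos = s.indexOf("stream", from);
            if (pos < 0) return false;
            int dataStart = pos + "stream".length();
            // The keyword is followed by CRLF or LF before the data begins.
            if (dataStart < s.length() && s.charAt(dataStart) == '\r') dataStart++;
            if (dataStart < s.length() && s.charAt(dataStart) == '\n') dataStart++;
            int end = s.indexOf("endstream", dataStart);
            if (end < 0) return false;
            // Stream dictionary: text between the object's "obj" keyword and "stream".
            String header = s.substring(Math.max(s.lastIndexOf("obj", pos), 0), pos);
            boolean isImage = header.replace(" ", "").contains("/Subtype/Image");
            if (!isImage) {
                byte[] raw = s.substring(dataStart, end).getBytes(StandardCharsets.ISO_8859_1);
                String content = new String(inflateOrRaw(raw), StandardCharsets.ISO_8859_1);
                if (BEGIN_TEXT.matcher(content).find() && SHOW_TEXT.matcher(content).find()) {
                    return true;
                }
            }
            from = end + "endstream".length();
        }
    }

    // Tries zlib/Flate decompression; returns the input unchanged if that fails.
    private static byte[] inflateOrRaw(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                out.write(buf, 0, n);
            }
            return out.size() > 0 ? out.toByteArray() : data;
        } catch (DataFormatException e) {
            return data; // not Flate data (uncompressed or another filter)
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(containsTextOperators(pdf) ? "PDF+Text" : "Image Only (probably)");
    }
}
```

In a batch job, a file for which containsTextOperators returns true would be kept away from Capture. The check can be wrong in both directions: it misses text in encrypted files or in streams that use other filters, and it still reports text when the only text is whitespace or drawn invisibly. A real PDF library that parses the page tree would be more robust than this byte-level scan.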

Thanks in advance,
