[tesseract-ocr] Pros and cons of .tiff vs .png

2020-01-28 Thread teksts
Hi all, I'm fairly new to Tesseract (and to programming work in general), and am trying to get my bearings. Almost everything I have seen recommends or assumes that I feed .tiff files into Tesseract to be OCR'd, but I recently came across some posts suggesting that .png is less finicky, and might

[tesseract-ocr] Simplest way to automate pdf to tif?

2020-01-14 Thread teksts
Hi all, I have a very large number of PDFs to convert to .tif files to be processed by Tesseract. While I've been getting acquainted with Tesseract, I've just been converting them manually through Adobe Acrobat, but I'd like to automate the process. Any advice? Thanks
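
One possible way to script the conversion asked about above, offered only as a rough sketch: it assumes the `pdftoppm` tool from Poppler is installed and on the PATH, which the thread does not mention. The function names, the folder names, and the 300 dpi default are hypothetical choices for illustration. `pdftoppm` writes one TIFF per page, with file names built from the output prefix.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, out_dir, dpi=300):
    """Build a pdftoppm command line for one PDF (hypothetical helper).

    Assumes Poppler's pdftoppm is available; -tiff selects TIFF output
    and -r sets the resolution in dpi.
    """
    pdf_path = Path(pdf_path)
    # Output prefix: pdftoppm appends the page number and extension itself.
    prefix = Path(out_dir) / pdf_path.stem
    return ["pdftoppm", "-tiff", "-r", str(dpi), str(pdf_path), str(prefix)]


def convert_folder(pdf_dir, out_dir, dpi=300):
    """Convert every *.pdf in pdf_dir to per-page TIFFs in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if pdftoppm fails on a file.
        subprocess.run(build_command(pdf, out, dpi), check=True)


if __name__ == "__main__":
    # Example folder names; adjust to the actual layout.
    convert_folder("pdfs", "tiffs")
```

The command is built in a separate function so it can be inspected or printed before anything is run on a large batch.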