RE: Extract Text from a TIFF image

2016-07-19 Thread Gordon Schneider
I installed Tesseract on my PC and ran it on its own using the following command:

tesseract.exe x:/java/PDFBox/Maxfield-1.tiff x:/java/PDFBox/Maxfield-1

The results are in the attached file. They are not as clean as the results Timothy got. I am closer to where I want to get to but obviously I am

RE: Extract Text from a TIFF image

2016-07-19 Thread Allison, Timothy B.
You might want to experiment with different -psm values; we use 1 by default. Also, which version of Tesseract are you running? I think I got mine from https://github.com/UB-Mannheim/tesseract/wiki, version: tesseract 3.05.00dev leptonica-1.73 libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.6.
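
One way to try several -psm values against the same image is a small script along these lines. This is only a sketch: the helper names build_psm_commands and run_all are made up for illustration, the values 1, 3 and 6 are just examples, and it assumes the tesseract executable is on the PATH. The -psm spelling matches the 3.x versions discussed in this thread; newer Tesseract releases spell the flag --psm, so it is left as a parameter.

```python
import shutil
import subprocess


def build_psm_commands(image, output_base, psm_values, psm_flag="-psm", exe="tesseract"):
    """Return one tesseract argument list per page segmentation mode.

    Hypothetical helper: follows the usual
    `tesseract <image> <outputbase> -psm <N>` command-line layout.
    """
    commands = []
    for psm in psm_values:
        # Give each run its own output base so the results can be compared.
        out = f"{output_base}-psm{psm}"
        commands.append([exe, image, out, psm_flag, str(psm)])
    return commands


def run_all(commands):
    """Run each command; return False without running if tesseract is missing."""
    if not commands or shutil.which(commands[0][0]) is None:
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    cmds = build_psm_commands(
        "x:/java/PDFBox/Maxfield-1.tiff",
        "x:/java/PDFBox/Maxfield-1",
        [1, 3, 6],  # example values only
    )
    for cmd in cmds:
        print(" ".join(cmd))
    run_all(cmds)
```

Each run writes to a separate output base (for example Maxfield-1-psm1), so the text files can be compared side by side to see which mode suits the scan best.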

RE: Extract Text from a TIFF image

2016-07-19 Thread Gordon Schneider
I found version 3.02.02. I will download a more current version.

Thanks,
Gord

From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: July 19, 2016 9:58 AM
To: user@tika.apache.org
Subject: RE: Extract Text from a TIFF image

You might want to experiment with different -psm values, we use 1