Re: [tesseract-ocr] Can tesseract be used to read a PDF and OCR it to text?

2020-01-14 Thread JB Data31
OCRmyPDF does the job. It is Linux-native, but a Windows install is available: https://ocrmypdf.readthedocs.io/en/latest/installation.html#installing-on-windows 2020-01-13 7:49 UTC+01:00, 'pjfarley3' via tesseract-ocr: > On Sunday, January 12, 2020 at 8:52:51 PM UTC-5, shree wrote: >> Tesseract reads only
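
For anyone landing on this thread: a minimal sketch of driving OCRmyPDF from Python to get plain text out of a PDF. It assumes the `ocrmypdf` command is installed and on the PATH, and it relies on the `--sidecar` option to write the recognized text to a separate file. The function names and file names are hypothetical and only serve the illustration.

```python
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, sidecar_txt):
    """Build the argument list for one OCRmyPDF run.

    --sidecar asks OCRmyPDF to also write the recognized text to a
    plain-text file next to the searchable PDF it produces.
    """
    return [
        "ocrmypdf",
        "--sidecar", str(sidecar_txt),
        str(input_pdf),
        str(output_pdf),
    ]


def ocr_pdf_to_text(input_pdf, output_pdf, sidecar_txt):
    """Run OCRmyPDF and return the text from the sidecar file.

    Assumes the ocrmypdf executable is installed; raises
    CalledProcessError if the run fails.
    """
    subprocess.run(
        build_ocrmypdf_command(input_pdf, output_pdf, sidecar_txt),
        check=True,
    )
    with open(sidecar_txt, encoding="utf-8") as handle:
        return handle.read()
```

Calling `ocr_pdf_to_text("scan.pdf", "scan_ocr.pdf", "scan.txt")` would produce a searchable PDF plus the text file, under the assumptions above.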

[tesseract-ocr] Training Tesseract 5.0.0 to recognize digital handwriting

2020-01-14 Thread 'Fabio Lugli' via tesseract-ocr
Hello everyone, I'm trying to train Tesseract on handwriting, knowing that it's not the best option, using the latest version available for Windows. I have access to a huge number of .tif files (lines of handwritten text), and I'm able to obtain the .box files, which I later edit to be compliant to t

[tesseract-ocr] Simplest way to automate pdf to tif?

2020-01-14 Thread teksts
Hi all, I have a very large number of PDFs to convert to .tif files to be processed by Tesseract. While I've been getting acquainted with Tesseract, I've just been converting them manually through Adobe Acrobat, but I'd like to automate the process. Any advice? Thanks

Re: [tesseract-ocr] Simplest way to automate pdf to tif?

2020-01-14 Thread Marco Atzeri
On 14.01.2020 at 19:13, teksts wrote: Hi all, I have a very large number of PDFs to convert to .tif files to be processed by Tesseract. While I've been getting acquainted with Tesseract, I've just been converting them manually through Adobe Acrobat, but I'd like to automate the process. Any ad
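
One possible way to automate the conversion asked about here, sketched in Python. This is not taken from the reply above, which is cut off; it assumes Ghostscript is installed and reachable as `gs` (on Windows the console executable is usually `gswin64c`). The `tiffgray` device and the 300 dpi default are assumed choices, and the function and directory names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_gs_command(pdf_path, out_dir, dpi=300, gs_exe="gs"):
    """Return the Ghostscript argument list that renders one PDF to TIFF.

    All pages of the PDF go into a single multi-page .tif named after
    the PDF. gs_exe is an assumption; adjust it for your platform.
    """
    pdf_path = Path(pdf_path)
    out_file = Path(out_dir) / (pdf_path.stem + ".tif")
    return [
        gs_exe,
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=tiffgray",          # 8-bit grayscale TIFF (assumed choice)
        f"-r{dpi}",                   # rendering resolution in dpi
        f"-sOutputFile={out_file}",
        str(pdf_path),
    ]


def convert_all(pdf_dir, out_dir, dpi=300, gs_exe="gs"):
    """Convert every *.pdf in pdf_dir, writing the .tif files to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True stops the batch on the first failed conversion.
        subprocess.run(build_gs_command(pdf, out_dir, dpi, gs_exe), check=True)
```

Running `convert_all("pdfs", "tifs")` would then leave one .tif per PDF in `tifs`, ready to pass to Tesseract.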

[tesseract-ocr] Re: Retrieve HUD text from a video game screenshot

2020-01-14 Thread 'Richard' via tesseract-ocr
Thank you very much, Daniel. This is very handy for sure, I can see every step of image modification in PyCharm now :) I tried transforming the image in different orders (crop first, gray, thresh; gray first, crop, then thresh, etc.) but there is still no good result. Do you have some interesting material