[tesseract-ocr] Tesseract makebox config with known lines of text

a.f...@sheffield.ac.uk Wed, 15 Jul 2020 03:38:20 -0700

I'm using a loop around "tesseract $X $X batch.nochop makebox" to produce 
box files to be corrected and re-used for training, and have two questions.
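
For reference, the loop described above might look roughly like the sketch below. The function name make_boxes and the TESSERACT override variable are illustrative assumptions, not part of Tesseract itself; only the "batch.nochop makebox" invocation comes from the message.

```shell
#!/bin/sh
# Sketch only: run Tesseract's makebox config over each image given as an argument.
# TESSERACT is a hypothetical override for the binary; it defaults to "tesseract".
make_boxes() {
    for X in "$@"; do
        # The output base is the image name itself, so foo.png yields foo.png.box.
        "${TESSERACT:-tesseract}" "$X" "$X" batch.nochop makebox || return 1
    done
}

# Example: make_boxes *.png
```

Stopping at the first failure (|| return 1) is a design choice for the sketch; a batch run over many images might prefer to log the error and continue.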


Is there a way to make it produce the line-by-line box format (rather than 
character-by-character) that newer versions of Tesseract accept as 
training data? (I'm using Tesseract 4.0.0 in a Docker container.)

I have a TSV file (which I could transform into some other format) with the 
correct string for the text in each image file, but it does not have the 
pixel locations. Is there any way to tell tesseract makebox to use those 
strings and "make them fit" the image?
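
As an illustration of the kind of transformation mentioned above, a TSV could be split into one plain-text file per image with something like the sketch below. It assumes a two-column layout (image filename, a tab, then the correct text). The function name split_tsv and the .gt.txt suffix are hypothetical choices. This only rearranges the known strings; it does not supply pixel locations.

```shell
#!/bin/sh
# Sketch only: split a two-column TSV (image filename <TAB> correct text)
# into one text file per image. The .gt.txt suffix is an assumed naming choice.
split_tsv() {
    tab=$(printf '\t')
    # The "|| [ -n "$img" ]" part keeps a final line that lacks a trailing newline.
    while IFS="$tab" read -r img text || [ -n "$img" ]; do
        [ -n "$img" ] || continue   # skip blank lines
        # Strip the image extension: line1.png -> line1.gt.txt
        printf '%s\n' "$text" > "${img%.*}.gt.txt"
    done < "$1"
}

# Example: split_tsv truth.tsv
```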

Thanks,
Adam

