tesstrain.sh is set up to process files in batches of 8 simultaneously. Are
you allowing the script to run to completion?
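
To make the batching idea concrete, here is a rough sketch of how a shell
script can run jobs in batches of 8. This is not the actual tesstrain code:
process_one and run_in_batches are hypothetical names, and the batch size of
8 is only taken from the description above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT copied from tesstrain.sh or tesstrain_utils.sh.

# process_one is a hypothetical stand-in for the per-file work
# (for example, rendering an image or extracting features).
process_one() {
    echo "done $1"
}

# Run process_one over every argument, at most par_factor jobs at a time.
run_in_batches() {
    local par_factor=8   # assumed batch size, per the description above
    local counter=0
    local item
    for item in "$@"; do
        process_one "$item" &          # launch the job in the background
        counter=$((counter + 1))
        # Once a full batch has been launched, block until all of it finishes.
        if [ $((counter % par_factor)) -eq 0 ]; then
            wait
        fi
    done
    # Wait for the final, possibly partial, batch.
    wait
}
```

With a structure like this, output from different files can arrive out of
order, and interrupting the script before the last wait returns would leave
some inputs unprocessed, which may look like a failure partway through.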

On Fri, 4 Jan 2019, 11:27 <[email protected]> wrote:

> Hey all,
>
> I'm currently working on a program that explores the handwritten OCR
> capabilities of Tesseract.
>
> I have ~1400 images with ~8 handwritten text lines per image, with
> accompanying BOX files. Additionally, I've got a couple of handwritten
> fonts that I'm using to bootstrap the training process.
>
> One problem I'm having is that when I invoke tesstrain.sh, it will
> consistently fail at some point (mostly around Phase E) when more than 7
> box/tif pairs or fonts are provided as input. I've tried combinations where
> all the inputs are font files, all inputs are handwritten tif/box pairs,
> and inputs as a mix of the two.
>
> I had originally tried using Shree's modified boxtrain files but was
> receiving an error that had to do with failing to read in a unicharset
> file. So, I modified tesstrain.sh and tesstrain_utils.sh (referencing
> Shree's modified scripts) myself to work with my own provided tif/box pairs.
>
> Is there a limit to the number of inputs to tesstrain.sh that should be
> followed or should I confidently be able to give tesstrain.sh all 1400 of
> my images no problem?
>
> Thanks,
> Tim Snyder
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/dba86440-e325-4156-bfc7-85a1a680c63e%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/dba86440-e325-4156-bfc7-85a1a680c63e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

