tesstrain.sh is set up to process files in parallel, in batches of 8. Are you allowing the script to run to completion?
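
For illustration, the batching described above can be pictured roughly as follows. This is only a minimal sketch of the general "start N jobs, then wait" pattern, not a copy of the actual script: `process_one`, `run_in_batches`, and the `par_factor` variable are hypothetical names chosen for this example, and the real per-file work is replaced by an `echo`.

```shell
#!/usr/bin/env bash
# Sketch of batch-of-N parallelism. All names here are illustrative
# assumptions, not functions taken from tesstrain.sh.
par_factor=8

process_one() {
  # Placeholder for the real per-file work: just report the input.
  echo "processed $1"
}

run_in_batches() {
  local counter=0
  local f
  for f in "$@"; do
    process_one "$f" &            # start the job in the background
    counter=$((counter + 1))
    # After every par_factor jobs, block until the whole batch is done.
    if [ $((counter % par_factor)) -eq 0 ]; then
      wait
    fi
  done
  wait                            # collect the final, partial batch
}
```

Calling `run_in_batches a.tif b.tif c.tif` would start all three jobs and then wait for them. In a loop like this, a job that fails in the background does not stop the remaining jobs by itself, so it is worth letting the run finish and reading the full log before drawing conclusions.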
On Fri, 4 Jan 2019, 11:27 <[email protected]> wrote:

> Hey all,
>
> I'm currently working on a program that explores the handwritten OCR
> capabilities of Tesseract.
>
> I have ~1400 images with ~8 handwritten text lines per image, with
> accompanying BOX files. Additionally, I've got a couple of handwritten
> fonts that I'm using to bootstrap the training process.
>
> One problem I'm having is that when I invoke tesstrain.sh, it will
> consistently fail at some point (mostly around Phase E) when more than 7
> box/tif pairs or fonts are provided as input. I've tried combinations where
> all the inputs are font files, all inputs are handwritten tif/box pairs,
> and inputs as a mix of the two.
>
> I had originally tried using Shree's modified boxtrain files but was
> receiving an error that had to do with failing to read in a unicharset
> file. So, I modified tesstrain.sh and tesstrain_utils.sh (referencing
> Shree's modified scripts) myself to work with my own provided tif/box pairs.
>
> Is there a limit to the number of inputs to tesstrain.sh that should be
> followed, or should I confidently be able to give tesstrain.sh all 1400 of
> my images with no problem?
>
> Thanks,
> Tim Snyder
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/dba86440-e325-4156-bfc7-85a1a680c63e%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

