Going out on a limb here, but does '-l eng' on its own deliver any text for
you?
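
For what it's worth, here is a rough sketch of that check as a script. It assumes the `tesseract` binary is on your PATH; `sample.tif` in the note below is just a placeholder for one of your own test images that contains digits.

```python
import subprocess


def tesseract_cmd(image, lang):
    """Build the command line for one OCR run that prints its text to stdout."""
    return ["tesseract", image, "stdout", "-l", lang]


def ocr_text(image, lang):
    """Run tesseract and return the recognized text, stripped of whitespace.

    An empty string means the chosen language produced no text at all.
    """
    proc = subprocess.run(tesseract_cmd(image, lang), capture_output=True, text=True)
    return proc.stdout.strip()
```

Calling `ocr_text("sample.tif", "eng")` and getting an empty string back would suggest the problem sits with the 'eng' data itself rather than with the combination.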

The next thing I would look into, if I were you, is whether your 'eng'
traineddata has the same engine support (LSTM aka v4, I suppose) listed as
your gdt traineddata. I've seen it happen that those do not align.

There's a tesseract tool to list the traineddata engine features (forgot
the name/cli args, sorry) and one to merge traineddata files
(combine_something, but I have to look it up, so you'll be as fast as me
with Google + doc search), but my *hunch* is that you won't need the combine
tool; what I've seen so far is that tesseract picks an engine (the --oem
setting drives this, IIRC) and then pumps the image through all loaded
languages on a segment-by-segment basis. (IIRC, so YMMV ;-) )
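
One candidate for the listing step is `combine_tessdata -d <file>`, which prints the components packed into a traineddata file. The sketch below wraps that call and checks which engines a file appears to support. Treat the details as assumptions: the entry format `<index>:<name>:size...`, and the idea that an `lstm` component means LSTM support while `inttemp` means legacy-engine support. The helper names are made up for illustration.

```python
import re
import subprocess

# Assumed shape of one listing entry: "<index>:<name>:size<bytes>, ..."
ENTRY = re.compile(r"^\s*\d+:([A-Za-z0-9_-]+):size", re.MULTILINE)


def component_names(listing):
    """Extract the component names from a combine_tessdata -d listing."""
    return set(ENTRY.findall(listing))


def engine_support(listing):
    """Guess engine support from the components present (assumption, see above)."""
    names = component_names(listing)
    return {"lstm": "lstm" in names, "legacy": "inttemp" in names}


def list_components(traineddata_path):
    """Run combine_tessdata -d and return everything it printed.

    stdout and stderr are joined because the tool may log to either one.
    """
    proc = subprocess.run(
        ["combine_tessdata", "-d", traineddata_path],
        capture_output=True,
        text=True,
    )
    return proc.stdout + proc.stderr
```

Running `engine_support(list_components(path))` once for eng.traineddata and once for Common_gdt.traineddata, then comparing the two results, should show whether the engine support lines up.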

(The bit I'm wondering about now myself is: there was some sort of
criterion in there, in the code, for deciding when to try? or use? multiple
lang results; it just /might/ be that's causing trouble, but I would have
to dig deep into the code for that and it doesn't rate above "wild crazy
guess" anyway, so better take the sane route and check that your installed
'eng' database is doing what it's supposed to, on its own, first.)

The next sane thing to try is flipping them around, i.e. "eng+gdt" instead
of "gdt+eng", to see if results change and /how/, as that might give us all
a hint about what's going on in there.
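
To make the "/how/" part visible, something along these lines could run both orders on the same image and diff the text. It is only a sketch: it assumes `tesseract` is on your PATH and that both traineddata files are where tesseract looks for them; the function names and the `sample.tif` image in the note below are placeholders.

```python
import difflib
import subprocess


def lang_orders(first, second):
    """Return the two -l values to compare, e.g. 'A+B' and 'B+A'."""
    return [f"{first}+{second}", f"{second}+{first}"]


def ocr_text(image, lang):
    """Run tesseract on the image with the given -l value and return its stdout."""
    proc = subprocess.run(
        ["tesseract", image, "stdout", "-l", lang],
        capture_output=True,
        text=True,
    )
    return proc.stdout


def diff_outputs(text_a, text_b, label_a, label_b):
    """Return unified-diff lines between two OCR results (empty list if identical)."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=label_a,
            tofile=label_b,
            lineterm="",
        )
    )
```

For example, take `a, b = lang_orders("Common_gdt", "eng")` and print `diff_outputs(ocr_text("sample.tif", a), ocr_text("sample.tif", b), a, b)`. An empty diff would mean the order makes no difference for that image.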





On Mon, 20 Nov 2023, 09:23 Simon, <smong5...@gmail.com> wrote:

> Hello everybody,
>
> right now I am working with tesseract to train it on new symbols. For that I
> used TIF images containing only the desired symbol. I trained with the
> tesstrain repository and about 4000 training images. At the end of the
> procedure I got the traineddata file for my model, Common_gdt.
> Besides the symbol(s) I trained in the Common_gdt model, numbers should
> also be recognized. Obviously, if I only use Common_gdt, Tesseract only
> recognizes the symbols it was trained for, but no numbers.
> To solve this problem I used -l Common_gdt+eng, which should use both
> traineddata files. But when I use the files like this, it is as if "eng"
> doesn't do anything. The results are the same as when I used only Common_gdt.
>
> Does anyone have an idea how traineddata files can be combined?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/9ee1df96-eef7-4f93-b93a-2c7914ab52c9n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/9ee1df96-eef7-4f93-b93a-2c7914ab52c9n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAFP60fp1%2B%2BO-wLr99yPkX3uJjG61zdDXR7_ygwZ82jvEee6Aww%40mail.gmail.com.
