What is `Tesseract 4 XD`? And what do you mean by `I then found out the hard
way that ...`?

Zdenko


On Sun, 6 Jul 2025 at 16:18, Alessandro Griseta <[email protected]> wrote:

> I tried manually adding files I needed from
> https://github.com/tesseract-ocr/tessdata_best (`equ.traineddata`,
> `osd.traineddata`, `ita.traineddata`) inside
> `/usr/share/tesseract-ocr/5/tessdata`: unfortunately I then found out the
> hard way that these only work on Tesseract 4 XD.
>
> 1. It seems funny though: does that really mean I'll get better results by
> downgrading so that I can actually use these files?
>
> I understand the performance loss, but I'm particularly interested in
> getting the best out of `equ.traineddata`, which, to my understanding,
> interprets math characters. Those are often a challenge for OCR engines, so
> I was trying to get the absolute best scan possible for them.
>
> 2. Also, I wasn't able to specify `-l equ`, as the error told me Tesseract
> is supposed to deal with that on its own. If that's the case, is `equ`
> installed by default with `sudo apt-get install tesseract-ocr`? (I couldn't
> find it in the `tessdata` folder and don't know where else to look for it.)
>
> 3. I also tested the Docker image: if I put `equ.traineddata` and
> `osd.traineddata` inside the `tessdata` folder, will those manually chosen
> files actually be used?
>
> Hope this all makes sense, don't be afraid to ask :)
> Alessandro
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/tesseract-ocr/789d7514-bded-49e4-95ed-44cfb0049ad1n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/789d7514-bded-49e4-95ed-44cfb0049ad1n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
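
On question 3, one way to check which manually copied models are present, and
to point Tesseract at that folder explicitly, is sketched below. This is only
an illustration, not something taken from the thread: the helper names
`available_languages` and `build_tesseract_command` are hypothetical, and the
sketch assumes the standard `tesseract` command-line options `--tessdata-dir`
and `-l`. It only lists files and builds the argument list; it does not prove
that a given model loads correctly.

```python
from pathlib import Path


def available_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    # "ita.traineddata" -> "ita"; files with other extensions are ignored.
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def build_tesseract_command(image, output_base, languages, tessdata_dir):
    """Build an argument list that names the tessdata folder explicitly."""
    return [
        "tesseract", str(image), str(output_base),
        "--tessdata-dir", str(tessdata_dir),  # folder holding the chosen models
        "-l", "+".join(languages),            # e.g. "ita" or "ita+eng"
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Running
`tesseract --list-langs` on the command line should give a comparable listing
of the languages Tesseract itself can see.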

