Tesseract supports uzn files [1] with psm 4. Search the forum for more details.

[1] https://github.com/OpenGreekAndLatin/greek-dev/wiki/uzn-format
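
A rough sketch of how the ALTO blocks could be turned into a uzn file, in Python. This is only an illustration under several assumptions: the function name alto_to_uzn and the label "Text" are made up for the example, the ALTO coordinates are assumed to already be in pixels of the image you feed to Tesseract (check MeasurementUnit in your files and rescale if not), and each uzn line is assumed to be "left top width height label" as described in [1].

```python
import xml.etree.ElementTree as ET


def alto_to_uzn(alto_xml, label="Text"):
    """Return uzn text with one 'left top width height label' line per ALTO TextBlock.

    Assumes HPOS/VPOS/WIDTH/HEIGHT are pixel values for the same image
    that will be passed to Tesseract; no unit conversion is done here.
    """
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        # ALTO files use different namespace versions, so compare only
        # the local part of the tag name.
        if elem.tag.rsplit("}", 1)[-1] != "TextBlock":
            continue
        left = round(float(elem.get("HPOS")))
        top = round(float(elem.get("VPOS")))
        width = round(float(elem.get("WIDTH")))
        height = round(float(elem.get("HEIGHT")))
        lines.append(f"{left} {top} {width} {height} {label}")
    return "".join(line + "\n" for line in lines)
```

As far as I know, the uzn file has to sit next to the image with the same base name (for example page.uzn for page.png, both names hypothetical), and then you run something like "tesseract page.png out --psm 4". Please verify this against your Tesseract version.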


Zdenko


On Fri, 23 Sep 2022 at 17:20, Vincent Sarbach-Pulicani <[email protected]>
wrote:

> Hello,
> I'm working on historical newspapers from the interwar period written in 3
> different languages: Corsican, French and Italian.
> After many tries, Tesseract seems to be the best OCR for me, but the layout
> analysis of a newspaper is complex.
> However, using the API of Gallica (the French national library), I can get
> access to an OCR output (bad quality) and usable ALTO files.
> My question is: can I use those ALTO files to make Tesseract follow the
> same segmentation as the basic OCR?
> I don't know if my question makes sense.
> Thanks a lot,
> Vincent Sarbach-Pulicani
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/334be2c9-a194-46ee-adcb-ab48b712e3b8n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/334be2c9-a194-46ee-adcb-ab48b712e3b8n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

