Hey Tom, I am working on Bengali OCR text extraction using Tesseract. How can I train or fine-tune Tesseract OCR for Bengali? Can I contribute to the Tesseract open-source project by adding my dataset to it, or in some other way?
On Wed, Jun 11, 2025, 8:43 PM Tom Morris <[email protected]> wrote:

> Tesseract will be unlikely to perform well for Rashi script
> <https://en.wikipedia.org/wiki/Rashi_script> without retraining, but you
> might want to check out the language files from this project:
> https://gitlab.com/pninim.org/tessdata_heb_rashi
>
> Tom
>
> On Tuesday, June 10, 2025 at 6:58:54 AM UTC-4 הברנש wrote:
>
>> Would someone be able to help me with OCR for this book in Rashi script?
>> Here is the book
>> <https://drive.google.com/file/d/1LYf4GpOyk0ltCHranPWOmLjmaIK6h-JI/view?usp=sharing>,
>> thanks!

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/tesseract-ocr/CALW6Jrj0bw66HEdikTfVixoNNuUH9e46YiDnVbqZt9_m5r8oVg%40mail.gmail.com.

