[tesseract-ocr] tesseract osd retraining and script vs language text extraction

Omesharma Sat, 31 Oct 2020 02:10:05 -0700

#Hey
---------------------
##I am using Tesseract OCR to extract text from images:


-------------------------

--------------------
##I would appreciate your suggestions on the points below.
-------------------------
- How can I retrain the osd.traineddata file to add Ethiopic and other 
scripts? The current osd.traineddata file is unable to detect a few scripts 
(e.g. Ethiopic, Gujarati, Gurmukhi), even though script files for them are 
available in the script directory.
------------------------
---------------------
- Which is more accurate for text extraction: [LANGUAGE TRAINEDDATA FILES] 
or [SCRIPT TRAINEDDATA FILES]?
---------------
------------------
- Does it make any difference to text extraction accuracy if I use the 
script traineddata file instead of the language traineddata file?
-----------------------
---------------------------
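For concreteness, the comparison asked about above could be set up roughly 
like this. It is only a sketch: the image name sample.png is hypothetical, 
and the model names amh and script/Ethiopic assume that the matching 
traineddata files are installed, with the script models in a script/ 
subdirectory of tessdata. The helper build_tesseract_cmd is an illustrative 
name, not part of Tesseract.

```python
import os
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_cmd(image: str, model: Optional[str] = None, psm: int = 3) -> List[str]:
    """Build a tesseract CLI call that writes its result to stdout."""
    cmd = ["tesseract", image, "stdout", "--psm", str(psm)]
    if model is not None:
        # -l selects the traineddata file: a language or a script model.
        cmd += ["-l", model]
    return cmd


def run(cmd: List[str]) -> str:
    """Run the command and return its stdout (needs tesseract on PATH)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


IMAGE = "sample.png"  # hypothetical input image

# --psm 0 requests orientation and script detection only (osd.traineddata).
osd_cmd = build_tesseract_cmd(IMAGE, psm=0)
# Same image, once with a language model and once with a script model.
# Both model names are assumptions about what is installed.
lang_cmd = build_tesseract_cmd(IMAGE, model="amh")
script_cmd = build_tesseract_cmd(IMAGE, model="script/Ethiopic")

if __name__ == "__main__" and shutil.which("tesseract") and os.path.exists(IMAGE):
    for name, cmd in [("osd", osd_cmd), ("language", lang_cmd), ("script", script_cmd)]:
        print(f"== {name} ==")
        print(run(cmd))
```

Comparing the language and script outputs against a hand-typed ground truth 
for the same image would give a direct accuracy figure for each model.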
Please share your views on the points above, based on your experience with 
Tesseract. It will be very helpful for my final-year project.

------------------------------
Contact: sharmaome...@gmail.com .
---------------------------------------

Thanks and regards,
Omesh Sharma

