Devanagari.unicharset, Latin.unicharset and radical-stroke.txt:

The script unicharset files are used to set character properties. For most scripts they are already available in langdata_lstm. I don't think they are mandatory for LSTM training, but copying them once avoids the warning messages.
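If it helps, here is a minimal sketch for checking that these files are in place before starting training. The function name `missing_langdata_files` and the default directory name `langdata` are my own placeholders, not part of tesseract; the file names are the ones discussed in this thread, so adjust them for your script.

```python
from pathlib import Path

# File names taken from this thread; change them for other scripts.
REQUIRED = ["Devanagari.unicharset", "Latin.unicharset", "radical-stroke.txt"]


def missing_langdata_files(langdata_dir, required=REQUIRED):
    """Return the names in `required` that are not files in `langdata_dir`."""
    root = Path(langdata_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # "langdata" is only an assumed default; pass your own directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else "langdata"
    for name in missing_langdata_files(directory):
        print(f"missing: {name}")
```

Anything it prints can be copied over from langdata_lstm.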
radical-stroke.txt is used only for CJK languages, but tesseract checks for it during the training process, so you need to make it available.

For Chhattisgarhi written in Devanagari, I suggest training from script/Devanagari.traineddata rather than from English. Please note that if you are training from scratch, you don't need a starting traineddata; if you use one, you are fine-tuning.

Finally, you need to use the correct mode for Indic languages with unicharset_extractor. Your unicharset should contain Unicode codepoints, not aksharas (consonant plus vowel sign combinations).
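To illustrate the codepoint versus akshara distinction, here is a small sketch that lists the distinct codepoints in a piece of training text. It does not read or write tesseract's unicharset format; the helper names `codepoints` and `describe` are mine. As far as I know the relevant unicharset_extractor option is `--norm_mode` (with 2 for Indic scripts), but please confirm against the tool's help output.

```python
import unicodedata


def codepoints(text):
    """Return the distinct non-whitespace codepoints in `text`, in first-seen order."""
    seen = []
    for ch in text:
        if not ch.isspace() and ch not in seen:
            seen.append(ch)
    return seen


def describe(ch):
    """Format one codepoint as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


if __name__ == "__main__":
    # Two aksharas (ka + vowel sign i, ka + vowel sign ii) share the letter ka,
    # so only three distinct codepoints are listed.
    for ch in codepoints("कि की"):
        print(describe(ch))
```

The two aksharas in the sample break down into the consonant and two separate vowel signs, which is the kind of entry the unicharset should contain.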

