The extender character is not in ara.punc.
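One way to double-check this, as a rough sketch rather than a tested recipe: unpack the model, dump its punctuation DAWG to a text file, and look for U+0640 (the "ـ" extender, called tatweel in Unicode). The sketch assumes `combine_tessdata` and `dawg2wordlist` from the Tesseract training tools are on PATH, and that the unpacked model has LSTM components named `ara.lstm-unicharset` and `ara.lstm-punc-dawg` (a legacy model would have `ara.unicharset` and `ara.punc-dawg` instead). The helper names and the `ara.punc.txt` output file are made up for illustration. Note that `combine_tessdata -d` only lists the components; `-u` is the option that unpacks them.

```python
import subprocess
from pathlib import Path

TATWEEL = "\u0640"  # ARABIC TATWEEL, the extender character discussed in this thread


def patterns_with_tatweel(lines):
    """Return the punctuation patterns that contain the tatweel character."""
    return [line.rstrip("\n") for line in lines if TATWEEL in line]


def dump_punc_patterns(traineddata="ara.traineddata", prefix="ara."):
    """Unpack the traineddata file and convert its punctuation DAWG to text.

    Assumption: the component names below match what the unpack step
    writes for an LSTM model; adjust them for a legacy model.
    """
    # -u unpacks every component to files starting with `prefix`
    subprocess.run(["combine_tessdata", "-u", traineddata, prefix], check=True)

    out = Path(prefix + "punc.txt")  # hypothetical output file name
    # dawg2wordlist arguments: UNICHARSET DAWG WORDLIST
    subprocess.run(
        ["dawg2wordlist", prefix + "lstm-unicharset", prefix + "lstm-punc-dawg", str(out)],
        check=True,
    )
    return out.read_text(encoding="utf-8").splitlines()
```

If `patterns_with_tatweel(dump_punc_patterns())` comes back empty, that would be consistent with the extender not being among the punctuation patterns, so the cause is probably somewhere else.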
On Monday, November 20, 2023 at 3:44:52 PM UTC+1 elvi...@gmail.com wrote:

> Can you try to remove it from the list of punctuations?
>
> To do that, you need to extract the components of the traineddata file,
> edit the ara.punc file, and then recombine them.
>
> To extract the components: *combine_tessdata -d ara.traineddata*
>
> On 20 Nov 2023 at 4:39:29 PM, Sifdin Nahhas <sifd...@gmail.com> wrote:
>
>> Hey guys,
>> I have a problem where tesseract removes the extender letter in Arabic, "ـ",
>> because it recognizes it as an underline, as in the images below.
>> I think it is caused by some configuration variable, but I could not find
>> the responsible one.
>>
>> Appreciate the help.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to tesseract-oc...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/911e8ef4-68f3-4e9d-b40b-e7a715ab912cn%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2097f488-c2be-4be1-a6d1-3563795efbfbn%40googlegroups.com.