No need to change the "Tesseract executable" setting. You need an entry in the .font_properties file for the arialunicodems font.
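For illustration, here is a minimal sketch of what such an entry looks like. The Tesseract 3 training wiki describes each line as `<fontname> <italic> <bold> <fixed> <serif> <fraktur>`, with every flag either 0 or 1. The all-zero flags for arialunicodems are my assumption for a regular sans-serif face, and the helper names (`FontProperties`, `format_entry`, `parse_entry`) are hypothetical, not part of Tesseract or jTessBoxEditor.

```python
from typing import NamedTuple


class FontProperties(NamedTuple):
    """One line of a font_properties file (hypothetical helper type)."""
    name: str
    italic: bool = False
    bold: bool = False
    fixed: bool = False
    serif: bool = False
    fraktur: bool = False


def format_entry(fp: FontProperties) -> str:
    """Render '<fontname> <italic> <bold> <fixed> <serif> <fraktur>'."""
    flags = (fp.italic, fp.bold, fp.fixed, fp.serif, fp.fraktur)
    return " ".join([fp.name] + [str(int(flag)) for flag in flags])


def parse_entry(line: str) -> FontProperties:
    """Parse one entry, rejecting lines that do not have a name plus five 0/1 flags."""
    parts = line.split()
    if len(parts) != 6 or any(p not in ("0", "1") for p in parts[1:]):
        raise ValueError(f"malformed font_properties entry: {line!r}")
    return FontProperties(parts[0], *(p == "1" for p in parts[1:]))


# Assumption: Arial Unicode MS is treated as non-italic, non-bold,
# proportional, sans-serif, non-fraktur, so every flag is 0.
entry = format_entry(FontProperties("arialunicodems"))
print(entry)
```

The font name in the entry has to match the font name used in the box/tif file names, otherwise training will not find the properties for that font.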
I strongly suggest you re-read the training wiki before continuing:
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

On Thursday, November 20, 2014 8:07:35 AM UTC-6, iram akbar wrote:
>
> It seems it's a known issue with Serak. I have created the "ara" folder with
> files like the "vie" folder in jtessbox editor, as you can see in the attachment.
> After that I set the box file path in jtessbox editor for "Tesseract
> executable" and "Training data" for "ara", as attached. When I click the
> "Run" button I get the attached error. I don't know what goes wrong here.
> Question: am I giving the wrong file in the path in "Tesseract executable"
> and "Training data", i.e. the ara box file? Or what goes wrong?
> Note: I have put no data in the words_list, frequent_words, font_properties files.
>
> On 20 November 2014 17:32, ShreeDevi Kumar <shree...@gmail.com> wrote:
>
>> I have not used Serak, but the issues page there indicates problems with
>> RTL languages; see
>> https://code.google.com/p/serak-tesseract-trainer/issues/detail?id=6
>>
>> Why are you not using jtessbox editor's trainer or the command line
>> programs? I think the binaries are bundled with JTess...
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Thu, Nov 20, 2014 at 4:26 PM, iram akbar <irama...@gmail.com> wrote:
>>
>>> Hello shree,
>>>
>>> I am having an issue while training Arabic in Serak (for box file
>>> generation I am using jtessbox editor). I am doing some testing. I have
>>> assigned English letters to a single Arabic word and created the box
>>> file as attached (jtessbox file). After following the whole training
>>> process in Serak I got the OCR result as attached. You can see in the box
>>> file that there are 4 letters, "A,B,C,D", so I was expecting the OCR result
>>> to be ABCD, but the result is BDBBAABBBBA as attached (serak result).
>>> Question: why am I getting that result? Is something wrong in making the box
>>> file in jtessbox editor, or in training in Serak?
>>>
>>> On Monday, 10 November 2014 15:30:21 UTC+5, shree wrote:
>>>>
>>>> Look under the jtessboxeditor/samples/vie folder
>>>> and create similar files for your language.
>>>>
>>>> ShreeDevi
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>
>>>> On Mon, Nov 10, 2014 at 1:10 PM, iram akbar <irama...@gmail.com> wrote:
>>>>
>>>>> Quan,
>>>>> I am able to generate some files with jtessbox editor, but I am having
>>>>> an issue: when I select "Train with existing box" or "Train from Scratch"
>>>>> under the Trainer tab I get the attached message.
>>>>> Question: how can I generate the Arabic.font_properties,
>>>>> Arabic.frequent_word_list and Arabic.words_list files using jtessbox
>>>>> editor?
>>>>>
>>>>> On Friday, 7 November 2014 19:42:37 UTC+5, Quan Nguyen wrote:
>>>>>>
>>>>>> Look in the samples folder for a working example. You can start out from
>>>>>> a UTF-8 text file about 2 pages long, generate TIFF/Box from it, and
>>>>>> prepare the other necessary input files for training. You can train
>>>>>> entirely in jTessBoxEditor.
>>>>>>
>>>>>> On Thursday, November 6, 2014 6:19:53 AM UTC-6, iram akbar wrote:
>>>>>>>
>>>>>>> Thank you for your help, but my issue still exists. If I need to
>>>>>>> generate the TIFF of an image text I am unable to, as it only asks
>>>>>>> to load a text file, not an image file. Second, if I have lots of
>>>>>>> documents I need to copy and paste first, then generate the TIFF.
>>>>>>> Can anyone else help me with this?
>>>>>>> Question: how can I input the Arabic text image in jtessbox editor
>>>>>>> to generate a TIFF (as attached)?
>>>>>>>
>>>>>>> On Thursday, 6 November 2014 16:38:25 UTC+5, shree wrote:
>>>>>>>>
>>>>>>>> Click on the 'generate' box. With some Devanagari fonts I have
>>>>>>>> found that the text does not display but the tiff/box are generated.
>>>>>>>> Maybe it is the same for the Arabic font you are using. Give it a try.
>>>>>>>>
>>>>>>>> You can also try to copy and paste the text; sometimes that works.
>>>>>>>>
>>>>>>>> ShreeDevi
>>>>>>>> ____________________________________________________________
>>>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/d7396d3d-c4d1-4fcc-a58d-6cc02927989c%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.