Dear all,

I am trying to run a mass training with tesstrain.sh (I have also applied the patch
<https://code.google.com/p/tesseract-ocr/source/diff?spec=svn93f7899a9e9afa5411eca6b4ec4831d0b49236f5&name=93f7899a9e9a&r=93f7899a9e9afa5411eca6b4ec4831d0b49236f5&format=side&path=/training/tesstrain.sh>).
I am still not able to clear my hurdles.
This is the command I used:

./tesstrain.sh \
        --bin_dir /usr/local/bin/ \
        --fonts_dir /usr/share/fonts/ \
        --lang tam \
        --langdata_dir /home/tesseract/training/langdata \
        --output_dir /home/tesseract/tam_train/output/ \
        --training_text /home/sibi/Desktop/outputscrambled.txt \
        --wordlist /home/sibi/Desktop/word_list_lexicon.txt \
        --tessdata_dir /usr/local/share/tessdata \
        --fontlist "TAU_VASN"

and I got the following output:

tee: /tam/tesstrain.log: No such file or directory

=== Starting training for language 'tam'
tee: /tam/tesstrain.log: No such file or directory
Cleaning workspace directory /tam...
mkdir: cannot create directory ‘/tam’: Permission denied
tee: /tam/tesstrain.log: No such file or directory

=== Phase I: Generating training images ===
tee: /tam/tesstrain.log: No such file or directory
Rendering using TAU_VASN
tee: /tam/tesstrain.log: No such file or directory
[Thu Apr 16 20:01:01 IST 2015] /usr/local/bin//text2image --leading=32 
--fonts_dir=/usr/share/fonts/ --box_padding=0 --strip_unrenderable_words 
--char_spacing=0.0 --exposure=0 --font=TAU_VASN 
--outputbase=/tam/tam.TAU_VASN.exp0 
--text=/home/sibi/Desktop/outputscrambled.txt
tee: /tam/tesstrain.log: No such file or directory
Initializing fontconfig
Could not find font named TAU_VASN
FLAGS_find_fonts || 
FontUtils::IsAvailableFont(FLAGS_font.c_str()):Error:Assert failed:in file 
text2image.cpp, line 417
tee: /tam/tesstrain.log: No such file or directory
ERROR: Program text2image failed. Abort.


What exactly does "Could not find font named TAU_VASN" mean? I read the
tesstrain.sh introduction again, which says:
"# NOTE: The font names specified in --fontlist need to be recognizable by
Pango using fontconfig. An easy way to list the canonical names of all
fonts available on your system is to run text2image with
--list_available_fonts and the appropriate --fonts_dir path."

So I ran the following command:

 text2image --list_available_fonts --fonts_dir usr/share/fonts 
which gave only this output:

Initializing fontconfig

I cannot get any interpretable data from this. The font is present in
usr/share/fonts, so why is it not being recognised?
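A minimal sketch of one way to cross-check whether fontconfig itself can see the font, independently of text2image. It assumes the fontconfig command-line tool fc-list is installed; the helper name font_in_list is hypothetical, and TAU_VASN is simply the name passed to --fontlist above, which may not match the family name that fontconfig reports.

```shell
#!/bin/sh
# Hypothetical helper: reads a font listing on stdin and succeeds if the
# given name occurs in it (case-insensitive, fixed-string match).
font_in_list() {
    grep -qiF -- "$1"
}

# Only query fontconfig if fc-list is actually installed (assumption).
if command -v fc-list >/dev/null 2>&1; then
    # "fc-list : family" prints the family names fontconfig knows about.
    if fc-list : family | font_in_list "TAU_VASN"; then
        echo "fontconfig lists a family matching TAU_VASN"
    else
        echo "fontconfig does not list TAU_VASN; check the exact family name"
    fi
fi
```

If the name does not show up, the family name stored inside the font file may differ from the file name, and the name fc-list reports would be the one to try in --fontlist. The text2image command above was also given usr/share/fonts without a leading slash, which is resolved relative to the current directory, so it may be worth retrying with the absolute path /usr/share/fonts.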

Once I am able to clear this, I will start looking at the mistakes I made
in the other parameters and correct them. If community members can point
out mistakes in the parameters, that would be great.

-Sibi



