Please see http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract, which explains how to train. If the Chinese text is UTF-8, make sure to save the file as UTF-8 in Notepad; otherwise the characters will appear as ???. Cheers
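P.S. If it is unclear whether a text file really was saved as UTF-8, a small script can check it and re-save it. The following is only a minimal sketch, not part of the Tesseract tooling: the function names are made up for illustration, and the fallback encoding (gb18030, a common legacy encoding for simplified Chinese) is an assumption that may not match your file.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def resave_as_utf8(path, source_encoding="gb18030"):
    """Rewrite the file at `path` as UTF-8 if it is not valid UTF-8 already.

    `source_encoding` is an assumption about how the file was originally
    saved; change it if the file came from a different legacy encoding.
    Returns True if the file was rewritten, False if it was left alone.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    if is_valid_utf8(raw):
        return False  # already UTF-8 (plain ASCII also counts), nothing to do
    text = raw.decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

For example, calling resave_as_utf8("chi_sim.box") on a hypothetical training file named chi_sim.box would convert it in place only when it is not UTF-8 yet. Keep a backup first, since a wrong guess for source_encoding can produce incorrect characters.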
On Sun, Feb 7, 2010 at 7:13 PM, [email protected] <[email protected]> wrote:
> How did you generate (train) data for tesseract 3.0?
>
> On 22. Jan, 07:57 h., 74yrs old <[email protected]> wrote:
> > Chinese simplified datafiles are not uploaded in the svn.
> > I generated and tested the Chinese (simplified) data myself, even though I don't know
> > the Chinese language. I have forwarded
> > a copy of posts on the subject noted below
> > source codes compiled in VC++2008
> >
> > wherein I had forwarded the trained data files to Soon for his comments.
> > This may help you.
> > Cheers.
> >
> > On Fri, Jan 22, 2010 at 9:29 AM, andrei_c <[email protected]> wrote:
> > > I don't see Chinese Simplified data file in repository at
> > > http://tesseract-ocr.googlecode.com/svn/trunk/tessdata/. Is it
> > > checked in elsewhere?
> > >
> > > Andrei
> > >
> > > On Jan 21, 6:42 pm, 74yrs old <[email protected]> wrote:
> > > > Yes. Check out the svn repository from Google Code. Tesseract 3.0 can handle
> > > > simplified Chinese, which I had tested. You can try it as an experiment.
> > > >
> > > > On Fri, Jan 22, 2010 at 6:54 AM, Zhuguo Shi <[email protected]> wrote:
> > > > > Hi,
> > > > >
> > > > > I know this question might be too basic but I am really new to tesseract.
> > > > > Where can I find the latest 3.0 code? Just check out the SVN repository from
> > > > > Google code? And, can tesseract 3.0 handle Chinese properly now?
> > > > >
> > > > > --
> > > > > You received this message because you are subscribed to the Google Groups
> > > > > "tesseract-ocr" group.
> > > > > To post to this group, send email to [email protected].
> > > > > To unsubscribe from this group, send email to
> > > > > [email protected].
> > > > > For more options, visit this group at
> > > > > http://groups.google.com/group/tesseract-ocr?hl=en.

