For your information.

---------- Forwarded message ----------
From: Soon Hui Ngu <[email protected]>
Date: Thu, Aug 7, 2008 at 11:33 AM
Subject: Re: source codes compiled in VC++2008
To: 74yrs old <[email protected]>


Hi, sorry for the late reply.

I've seen your output. The Chinese translation accuracy is not as good as
for English: the word-to-word accuracy (the accuracy in identifying Chinese
characters) is roughly 90%. There are 458 Chinese words, and about 40 of
them are not properly identified.
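
As a rough check of the figures above, here is a minimal Python sketch. The variable names are illustrative only, and 40 is the approximate error count quoted in the message, so the result is an estimate rather than a measured value.

```python
# Figures quoted in the message; "about 40" is approximate.
total_words = 458
misidentified = 40

# Fraction of words that were identified correctly.
accuracy = (total_words - misidentified) / total_words

print(f"Word accuracy: {accuracy:.1%}")  # prints "Word accuracy: 91.3%"
```

With these numbers the accuracy comes out slightly above 90%, which is consistent with the estimate given.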

Besides that, some punctuation marks are misplaced. But overall, a person
literate in Chinese can still understand what the passage says.

On Thu, Aug 7, 2008 at 1:02 PM, 74yrs old <[email protected]> wrote:

> Soonhui,
> Anxiously awaiting your evaluation of the Chinese output text that was
> generated.
> Greetings,
> -sriranga(75yrsold)
>
>
> On Wed, Aug 6, 2008 at 4:51 PM, 74yrs old <[email protected]> wrote:
>
>> Kindly inform me of the percentage (or number) of mistakes in the
>> Chinese output text. From my experience, the output is generally 95 to 98%
>> correct.
>>
>>
>> On Wed, Aug 6, 2008 at 4:22 PM, 74yrs old <[email protected]> wrote:
>>
>>> Forwarded chinese-tessdata.zip.
>>>
>>>
>>> On Wed, Aug 6, 2008 at 4:19 PM, 74yrs old <[email protected]> wrote:
>>>
>>>> Soon,
>>>> Without installing any fonts, I succeeded in generating the bmp file
>>>> (attached herewith as a compressed tif) as well as the box file. Also
>>>> attached are the tesseract log report and the output text (all in the zip).
>>>>
>>>> I think the output appears to be perfect; of course there may be a few
>>>> mistakes. Kindly give feedback about its correctness.
>>>> -Greetings,
>>>>
>>>>
>>>>
>>>> On Wed, Aug 6, 2008 at 3:20 PM, 74yrs old <[email protected]> wrote:
>>>>
>>>>> Thanks. I shall check and give you feedback.
>>>>>
>>>>> 2008/8/6 Soon Hui Ngu <[email protected]>
>>>>>
>>>>>> Oh OK, sorry :)
>>>>>>
>>>>>> Here it is.
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 6, 2008 at 5:34 PM, 74yrs old <[email protected]> wrote:
>>>>>>
>>>>>>> It means it is not similar to English, which has independent vowels.
>>>>>>> As such, the complete set of characters has to be trained.
>>>>>>>
>>>>>>> The sample is required in *text form* (*Notepad - text*), NOT as an
>>>>>>> image (bmp) file.
>>>>>>> A sample text (.txt) file is required to generate an image based on
>>>>>>> the text file in the bbt tool.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 6, 2008 at 1:38 PM, Soon Hui Ngu <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, Mandarin has no dependent vowels. In fact, the whole concept of
>>>>>>>> vowels is alien to Mandarin.
>>>>>>>>
>>>>>>>> As for how to install Mandarin font, you may want to consult
>>>>>>>> http://www.yellowbridge.com/chinese/fonts.php for more information.
>>>>>>>>
>>>>>>>> I attach a sample text here.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 6, 2008 at 3:45 PM, 74yrs old <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> I would like to know whether Mandarin has dependent vowels.
>>>>>>>>> Will you forward a sample text to enable me to generate sample
>>>>>>>>> data files and forward them to you?
>>>>>>>>> Since I could not locate a Mandarin or Chinese font in XP, how do I
>>>>>>>>> install one in XP?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Aug 6, 2008 at 11:17 AM, Soon Hui Ngu <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I'm Chinese, and I write Mandarin. I would be interested in
>>>>>>>>>> training Tesseract to recognize Chinese words... not sure whether
>>>>>>>>>> other devs have done this or not...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 6, 2008 at 1:20 PM, 74yrs old <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> Thanks for the upload to the forum, which will benefit
>>>>>>>>>>> Tesseract users.
>>>>>>>>>>>
>>>>>>>>>>> I am interested to know which mother tongue you speak and write.
>>>>>>>>>>> I am thinking of experimenting with your local language in
>>>>>>>>>>> Tesseract, if possible, and giving you feedback.
>>>>>>>>>>> - Cheers
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 6, 2008 at 6:10 AM, Soon Hui Ngu <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, thanks for your compliment.
>>>>>>>>>>>>
>>>>>>>>>>>> Ya, I think Ocropus is a good idea. I will give it a try
>>>>>>>>>>>> sometime.
>>>>>>>>>>>>
>>>>>>>>>>>> As for which language I am going to train in Tesseract... well,
>>>>>>>>>>>> I haven't thought about this yet... will think about it later...
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 6, 2008 at 1:48 AM, 74yrs old <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Soon,
>>>>>>>>>>>>> *Congratulations* !!
>>>>>>>>>>>>> I successfully generated the exe files without any errors in
>>>>>>>>>>>>> VC++2008. All the exe files performed very well, without any
>>>>>>>>>>>>> trouble, in training Kannada script, which has dependent vowels.
>>>>>>>>>>>>> I am thankful to you for your modified source code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which language are you going to train in Tesseract?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since you are a good programmer, why not also compile the source
>>>>>>>>>>>>> code of Ocropus in VC++2008 for the benefit of users? I am willing
>>>>>>>>>>>>> to perform beta testing and give you feedback under your valuable
>>>>>>>>>>>>> guidance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> With Best of Luck,
>>>>>>>>>>>>> -sriranga(75yrsold)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 5, 2008 at 4:33 PM, 74yrs old <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> Thanks for the same. I shall test it and give you feedback.
>>>>>>>>>>>>>> With Regards,
>>>>>>>>>>>>>> -sriranga(75yrsold)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 5, 2008 at 2:12 PM, Soon Hui Ngu <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's my modified version. Do contact me if you have
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 5, 2008 at 3:57 PM, 74yrs old <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> Will you kindly forward the zipped source code of tesseract
>>>>>>>>>>>>>>>> 2.03, already modified by you for VC++2008, so that I can
>>>>>>>>>>>>>>>> beta test it and give you feedback? I have installed VC++2008.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would have done it myself by making the replacements you
>>>>>>>>>>>>>>>> suggested, but I find it difficult to do so due to my age
>>>>>>>>>>>>>>>> and vision problems.
>>>>>>>>>>>>>>>> As such, you need not take the trouble of correcting
>>>>>>>>>>>>>>>> anything. In other words, simply zip what you have already
>>>>>>>>>>>>>>>> modified and send it directly to me, so that I can have
>>>>>>>>>>>>>>>> hands-on experience.
>>>>>>>>>>>>>>>> With Best of Luck,
>>>>>>>>>>>>>>>> -sriranga(75yrsold)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> http://itscommonsensestupid.blogspot.com/
>>>>>>>>>>>>>>>


-- 
http://itscommonsensestupid.blogspot.com/

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/tesseract-ocr?hl=en.
