Today I downloaded the vietocr NET 1.7 32 zip from http://sourceforge.net/projects/vietocr/. Apart from Vietnamese, it also supports other Indic languages, including Kannada, as well as other languages of the world. The front-end GUI has a built-in post-processing program for DangAmbig, which is UTF-8. I think the problem of the simple program or script, as suggested by Martin Pierre, is now solved. Only testing remains to be done.

With Regards,
-sriranga(77yrsold)
On Thu, Apr 15, 2010 at 12:40 PM, 74yrs old <[email protected]> wrote:
> Pierre,
> Thanks for the clarification. Here is how the combination (C) is
> generated by merging (A) a consonant with (B) a dependent vowel:
>
>   (C) = (A) + (B)
>   ದೇ = ದ + ೇ
>   ಗೋ = ಗ + ೋ
>   ಸೌಂ = ಸೌ + ಂ
>
> Try:
>
>   ದೇ = ದ ೇ      ಗೋ = ಗ ೋ      ಸೌಂ = ಸೌ ಂ
>
> Here you can test how the (B) dependent vowel merges with the (A)
> consonant: press the backspace key to move the (B) above (say ೇ)
> towards (A) (say ದ). You will notice that (B) merges with (A)
> smoothly and becomes (C).
>
> In such cases, do (A) and (B) have to be trained as separate symbols,
> with a simple program then required to merge (B) with (A) to become
> (C)? I have been trying to get such a simple program from this forum
> for the past 2-3 years. Unfortunately, no one has been able to write a
> simple program or script for tesseract for Indic and other world
> languages that have consonants plus dependent vowels.
> I seek your valuable guidance.
> With Regards,
> -sriranga(77yrsold)
>
> On Wed, Apr 14, 2010 at 7:22 PM, MARTIN Pierre <[email protected]> wrote:
>
>> As a *last* point: "3) In the tif file, I observed that the same
>> paragraph is repeated six or seven times. I am interested to know
>> your logic for repeating paragraphs. Is it presumed that a *one line*
>> sentence is sufficient for training purposes, or should *more than
>> one line* of the same sentence be repeated in the tif for training
>> purposes?"
>>
>> Answered already, in the same mail. Again: each paragraph is written
>> using the same font, but with different anti-aliasing. You won't
>> notice it with the naked eye; I'm not even sure it's in the TIF file
>> I've provided.
>>
>> "Request for your valuable guidance on the above point, for my
>> knowledge. I won't trouble you anymore. Awaiting your valuable
>> guidance and the *screenshot you have marked in RED*."
>>
>> I've sent it, please check previous mails. It is attached again.
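
The merge described in the quoted message, where a dependent vowel (B) left standing apart is joined to its consonant (A) to form (C), could be sketched roughly as below. This is only a minimal illustration in Python, not the post-processor that VietOCR or tesseract actually uses. The function names `is_kannada_dependent_sign` and `merge_dependent_signs` are hypothetical, and it assumes the OCR output is a Unicode string in which the only thing separating (A) and (B) is spaces or tabs.

```python
import unicodedata


def is_kannada_dependent_sign(ch: str) -> bool:
    """Return True for combining marks in the Kannada block (U+0C80..U+0CFF).

    This covers dependent vowel signs as well as signs such as anusvara,
    all of which have a Unicode general category starting with "M".
    """
    return "\u0c80" <= ch <= "\u0cff" and unicodedata.category(ch).startswith("M")


def merge_dependent_signs(text: str) -> str:
    """Drop spaces/tabs that separate a Kannada dependent sign from the
    character before it, so that (A) + space + (B) becomes (A)(B)."""
    out = []
    for ch in text:
        if is_kannada_dependent_sign(ch):
            # Remove any blanks sitting between the base and this sign.
            while out and out[-1] in " \t":
                out.pop()
        out.append(ch)
    return "".join(out)


# Example with the three cases from the message, written as escapes:
# U+0CA6 (da) + U+0CC7 (vowel sign ee), U+0C97 (ga) + U+0CCB (vowel sign oo),
# U+0CB8 U+0CCC (sau) + U+0C82 (anusvara).
samples = ["\u0ca6 \u0cc7", "\u0c97 \u0ccb", "\u0cb8\u0ccc \u0c82"]
merged = [merge_dependent_signs(s) for s in samples]
```

No glyph-level work is needed here: once the code points are adjacent, the text renderer draws the combined form (C), which is the same effect as pressing backspace in the "Try" line.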

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.

