Thanks, TP,

As I mentioned in another thread regarding unicharambigs, the OCRed text
often does not match the ground truth, even when the box file was
generated from the same text and used for training. As far as I can tell,
it seems to be related to how the word is chopped.

Many times, I am able to figure out the change to be made in the box file and
the unicharambigs file to fix it. Sometimes, I am not sure whether the
characters need to be given as one unit or two units in the unicharambigs file,
e.g. I have both of the following in my unicharambigs file:

2    ग् ाी    1    गी    1
1    ग्ाी    1    गी    1

I was hoping that the debug values or ScrollView would help me find out what
values Tesseract is putting there.

Maybe a simpler solution is some utility which takes the Unicode text and
converts it into code points, so that I know what is being used there. E.g.
ग्ाी is U+0917 U+094D U+093E U+0940; I am assuming that the other case would
have a 'space' character (U+0020) between the two units.
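
Something along these lines might do as that utility. It is only a minimal
sketch, assuming Python 3 is available; the function names `codepoints` and
`describe` are made up for illustration and have nothing to do with Tesseract
itself. It lists each code point in a string together with its Unicode name,
so a stray space or an extra vowel sign becomes visible.

```python
import unicodedata


def codepoints(text):
    """Return one (U+XXXX, Unicode name) pair per code point in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def describe(text):
    """Return the code points of text as a single space-separated string."""
    return " ".join(code for code, _ in codepoints(text))


# Hypothetical sample strings, written as escapes so the content is unambiguous.
samples = [
    "\u0917\u094D \u093E\u0940",  # two units separated by a space
    "\u0917\u0940",               # the single replacement unit
]

for sample in samples:
    print(describe(sample))
    for code, name in codepoints(sample):
        print(f"  {code}  {name}")
```

Pasting a line from the unicharambigs file into `describe()` shows whether a
separator is really there: a space appears as U+0020 and a tab as U+0009.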

Anyway, that was the reason for wanting to trace the program in VS2008. If
you know of some instructions or a tutorial for doing that and can point me
to it, that would be great.

Thanks,

Shree

Shree Devi Kumar
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com


On Tue, May 7, 2013 at 11:18 PM, TP <wing...@gmail.com> wrote:

>
> On Tue, May 7, 2013 at 6:11 AM, sdk <shreesh...@gmail.com> wrote:
>
>> My question is, can that setup be used to trace the program flow or see
>> how the processing is being done.
>
>
>
> Yes, but why do you ask? Are you having problems?
>
> You might have to also compile leptonica, if you want to step into its
> functions.
>
> The main differences are that the Express edition:
>
> 1) Doesn't have Solution folders so some of the Leptonica Solution
> organizational enhancements are missing.
>
> 2) Can't use Addins (really only an issue when building Leptonica's extra
> sample /prog programs).
>
> 3) Doesn't include the Resource Editor (but you can still manually edit
> .rc files directly).
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to tesseract-ocr@googlegroups.com
> To unsubscribe from this group, send email to
> tesseract-ocr+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
