Re: VS2008 Express Edition - how to use this to see debug values?

2013-05-07 Thread Shree Devi Kumar
Thanks, TP. As I mentioned in another thread regarding unicharambigs, many times the OCRed text does not match the ground truth, even when the BOX file was generated using the same text and trained with it. As far as I can figure out, it seems to be related to chopping of the word. Many times, I am a

Re: Diff between unicharambigs and DangAmbigs

2013-05-07 Thread Shree Devi Kumar
Matt, I am also facing similar issues with unicharambigs. I have found it helpful to look in the unicharset file. Sometimes the values being recognized are not what was put in the box file, and looking at the unicharset helps identify them. In fact there are times when I have had to define the ch

Re: Diff between unicharambigs and DangAmbigs

2013-05-07 Thread matthew christy
Thanks for the information, Nick. I tried my experiment and used the unicharambigs file to turn all my ligatures into modern character equivalents. It did not substantially improve the dictionary lookup results. I'll have to try increasing my confidence in the dictionary using the parameters tha

Re: VS2008 Express Edition - how to use this to see debug values?

2013-05-07 Thread TP
On Tue, May 7, 2013 at 6:11 AM, sdk wrote: > My question is, can that setup be used to trace the program flow or see > how the processing is being done. Yes, but why do you ask? Are you having problems? You might also have to compile leptonica if you want to step into its functions. The mai

VS2008 Express Edition - how to use this to see debug values?

2013-05-07 Thread sdk
Hi, I installed VS 2008 Express Edition and set up the Tesseract 3.02 project in it based on the instructions given in http://tesseract-ocr.googlecode.com/svn/trunk/vs2008/doc/setup.html The instructions were very clear and easy to follow, and I have built Tesseract using them. My question

Re: Diff between unicharambigs and DangAmbigs

2013-05-07 Thread Nick White
Hi Matt, > I'm also not sure how these two files are different, or if maybe DangAmbigs is > from an earlier version of Tesseract or something. I'm using 3.02. Yes, that guess was correct. unicharambigs used to be called DangAmbigs before Tesseract 3. That is mentioned at: http://code.google.com/

Scrollview

2013-05-07 Thread sdk
Hi, I am trying to get ScrollView to show up on Windows 7. I am getting the error: Starting java -Xms512m -Xmx1024m -Djava.library.path=C:\Program Files (x86)\Tesseract-OCR\java -cp C:\Program Files (x86)\Tesseract-OCR\java/ScrollView.jar;C:\Program Files (x86)\Tesseract-OCR\java/piccolo-1.2.jar;C:\

Re: Ugly behavior when recognizing – advice requirement

2013-05-07 Thread Dmitri Silaev
Andres, Your code seems to be correct. I personally use a few more lines right after the call to GetIterator(): it->Begin(); if(it->IsAtFinalElement(RIL_BLOCK, RIL_SYMBOL)) return; if(!it->IsAtBeginningOf(RIL_SYMBOL)) return; But this shouldn't bother you if you rely on