[tesseract-ocr] totally different recognition rate when I rolled back to a previous version of tesseract.

2014-07-28 Thread Jing JC
I installed tesseract 3.03 and then uninstalled it. Since I compile from source and install tesseract to my own path, I removed the tesseract header files, libs, etc. cleanly. Then I reinstalled tesseract 3.02, but the results are not the same anymore and, unfortunately, a lot worse.

Re: [tesseract-ocr] hey, does ScrollView in tesseract 3.03 aimed for graphic view?

2014-07-23 Thread Jing JC
e.google.com/p/tesseract-ocr/wiki/ViewerDebugging > [2] https://docs.google.com/file/d/0B7l10Bj_LprhbUlIUFlCdGtDYkE/edit > TutorialSlides.tar.gz e.g. 7LayoutAnalysis.pdf > > Zdenko > > > On Wed, Jul 23, 2014 at 12:46 AM, Jing JC > wrote: > >> I am using a serve

Re: [tesseract-ocr] I compiled and installed tesseract from the source on CentOS. I kept both 3.01 and 3.02 versions. I use environment path stored in bash file to point to the version in use.

2014-07-22 Thread Jing JC
ract is shell wrapper script, and it will > take care that correct shared library is used (without installation...). > > Zdenko > > > On Wed, Jul 16, 2014 at 12:31 AM, Jing JC > wrote: > >> I compiled and installed tesseract from the source on CentOS. I kept bo

[tesseract-ocr] hey, does ScrollView in tesseract 3.03 aimed for graphic view?

2014-07-22 Thread Jing JC
I am using a server without graphics. 1. Do I just disable graphics during configure (./configure --disable-graphics)? What else will that affect? 2. If my training text is just one window (not multipage), do I not need it at all? 3. What else will be affected? Thank you -- You receive

Re: [tesseract-ocr] what does "width= right -left => no silly +1/-1" mean in this tutorial?

2014-07-21 Thread Jing JC
right - left - 1 > > Am Donnerstag, 17. Juli 2014 18:30:56 UTC+2 schrieb Nick White: >> >> On Wed, Jul 16, 2014 at 11:17:00PM -0700, Jing JC wrote: >> > I am going through Ray Smith's tutorial, and don't get it? >> >> He means that as the co-ordinate syst

Re: [tesseract-ocr] what does "width= right -left => no silly +1/-1" mean in this tutorial?

2014-07-21 Thread Jing JC
Thank you. On Thursday, 17 July 2014 09:30:56 UTC-7, Nick White wrote: > > On Wed, Jul 16, 2014 at 11:17:00PM -0700, Jing JC wrote: > > I am going through Ray Smith's tutorial, and don't get it? > > He means that as the co-ordinate system uses bottom left as the
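
A minimal sketch of the point being discussed, assuming (as the truncated reply suggests, but does not fully show here) that box coordinates label the edges between pixels with the origin at the bottom left. Under that assumption a box's width is simply right - left. The `Box` class and `inclusive_width` helper are hypothetical names for illustration and are not part of tesseract's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box; coordinates label pixel edges, origin bottom left."""
    left: int
    bottom: int
    right: int
    top: int

    @property
    def width(self) -> int:
        # Edge-based coordinates: no +1/-1 correction is needed.
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.top - self.bottom


def inclusive_width(first_pixel: int, last_pixel: int) -> int:
    """Width when coordinates name the first and last pixel instead (needs +1)."""
    return last_pixel - first_pixel + 1


# A box covering pixel columns 10, 11, 12 and 13.
box = Box(left=10, bottom=5, right=14, top=12)
print(box.width, inclusive_width(10, 13))
```

Both conventions describe the same four pixel columns; the edge-based one just avoids the off-by-one bookkeeping.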

[tesseract-ocr] installing tesseract 3.03 for text2image function

2014-07-18 Thread Jing JC
Hey Nick, I saw you mentioned "text2image doesn't use scrollview; you compiled it wrong. Try make clean and then make training again." in one of your earlier answers. The following installation instructions are from Ray's tutorial. They include scrollview: is that fine? Or do I ignore

[tesseract-ocr] Re: error when shape clustering

2014-07-17 Thread Jing JC
matrx60x40 again, that's the only difference, and it is not working again. Why is that? On Thursday, 17 July 2014 11:57:04 UTC-7, Jing JC wrote: > > [root@centos57 AdaleMono]# shapeclustering -F font_properties -U > unicharset eng.matrx60x40.exp0.tr > Reading eng.matrx60x40.exp0.tr ...

Re: [tesseract-ocr] JTessbox Modifying the boxes

2014-07-17 Thread Jing JC
Yep. It happened with the bounding boxes I generated myself. It has not happened to the .box files during the training step yet. On Thursday, 17 July 2014 10:17:39 UTC-7, Nick White wrote: > > On Thu, Jul 17, 2014 at 12:14:43AM -0700, Jing JC wrote: > > Ray's tutorial said

[tesseract-ocr] Re: JTessbox Modifying the boxes

2014-07-17 Thread Jing JC
ng > samples, if that is possible. > > I think the presentation you are mentioning refers to the recognition, not > the training step. During recognition, Tesseract gathers information about > bounding boxes around symbols, words, lines and blocks. Those boxes can of

[tesseract-ocr] error when shape clustering

2014-07-17 Thread Jing JC
[root@centos57 AdaleMono]# shapeclustering -F font_properties -U unicharset eng.matrx60x40.exp0.tr
Reading eng.matrx60x40.exp0.tr ...
Error: Unable to open eng.matrx60x40.exp0.tr!
signal_termination_handler:Error:Signal_termination_handler called:Code 3000
Segmentation fault
(snaplocaldev)[root@c

[tesseract-ocr] what does "width= right -left => no silly +1/-1" mean in this tutorial?

2014-07-17 Thread Jing JC
I am going through Ray Smith's tutorial and don't get it. Can anyone shed some light on it? Thank you.

[tesseract-ocr] how is does tesseract make decision when classifying something?

2014-07-17 Thread Jing JC
It seems that eng.cube.freq-words is not the only thing at work; the result depends on other factors too.

[tesseract-ocr] JTessbox Modifying the boxes

2014-07-17 Thread Jing JC
Ray's tutorial said the bounding boxes overlap. So when I modify the boxes inside JTessbox, do I keep the overlapping boxes, or make the boxes non touchi

[tesseract-ocr] does tesseract has cache thing?

2014-07-17 Thread Jing JC
I am exhausted from trying to figure out how user-words and user-patterns work. I ran over 10 different experiments, and the result never matched the word I put in user-words. What are the possible reasons? Thanks again in advance.

[tesseract-ocr] anyone sheds light on their experiments/experiences with tesseract 3.03. just gonna use text2image function in 3.03, does it still worth to upgrade?

2014-07-17 Thread Jing JC
I do not need a zillion fonts or images, though; I just want to train some numbers. I read through all the posts and didn't see many cons to upgrading to 3.03. Any hints? Thanks in advance.

[tesseract-ocr] I compiled and installed tesseract from the source on CentOS. I kept both 3.01 and 3.02 versions. I use environment path stored in bash file to point to the version in use.

2014-07-15 Thread Jing JC
I compiled and installed tesseract from source on CentOS. I kept both the 3.01 and 3.02 versions; they are in separate folders. So far there have been no problems. However, in the FAQ I read a post suggesting keeping one version only. Do I need to remove the older version in this case? Thank you. "Tesseract

Re: [tesseract-ocr] questions when reading unicharset manual: https://tesseract-ocr.googlecode.com/svn-history/r683/trunk/doc/unicharset.5.html

2014-07-15 Thread Jing JC
41:37 UTC-7, Nick White wrote: > > Hi, > > On Tue, Jul 15, 2014 at 10:04:24AM -0700, Jing JC wrote: > > yep yep. > > > > Thanks a lot Nick. > > > > I tried to cancel my post last night. > > but seems I can not get access to it after posted b

Re: [tesseract-ocr] questions when reading unicharset manual: https://tesseract-ocr.googlecode.com/svn-history/r683/trunk/doc/unicharset.5.html

2014-07-15 Thread Jing JC
exadecimal and you get 10. b has isalpha and islower set, so it > is 00011. > > Does that make sense to you? > > Nick > > On Mon, Jul 14, 2014 at 09:54:40PM -0700, Jing JC wrote: > > The example given are: > > > > ; 10 Common 46 > > b 3 Latin 59

[tesseract-ocr] questions when reading unicharset manual: https://tesseract-ocr.googlecode.com/svn-history/r683/trunk/doc/unicharset.5.html

2014-07-15 Thread Jing JC
The examples given are:
; 10 Common 46
b 3 Latin 59
W 5 Latin 40
7 8 Common 66
= 0 Common 93
";" is a punctuation character. Its properties are thus represented by the binary number 10000 (10 in hexadecimal). "b" is an alphabetic character and a lower case character. Its properties are thus r
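
A minimal sketch of how the hexadecimal properties field in these examples can be decoded. The bit order (isalpha as the lowest bit, then islower, isupper, isdigit, ispunctuation) is an assumption inferred from the values quoted in this thread ("b" is 00011, ";" is 10 in hexadecimal); `PROPERTY_BITS` and `decode_properties` are hypothetical names for illustration, not tesseract functions.

```python
from typing import List

# Assumed bit order, lowest bit first, inferred from the examples in the thread.
PROPERTY_BITS = ("isalpha", "islower", "isupper", "isdigit", "ispunctuation")


def decode_properties(hex_mask: str) -> List[str]:
    """Return the names of the property bits set in a hexadecimal mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in enumerate(PROPERTY_BITS) if mask & (1 << bit)]


# The five example entries: (character, properties field).
examples = [(";", "10"), ("b", "3"), ("W", "5"), ("7", "8"), ("=", "0")]
for char, mask in examples:
    print(char, format(int(mask, 16), "05b"), decode_properties(mask))
```

Printing the mask as a five-digit binary number alongside the decoded names makes it easier to check each entry against the manual's description.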

Re: [tesseract-ocr] is tesseract 3.03's source tar available? need to compile on CentOS 5.6

2014-07-14 Thread Jing JC
I uploaded receipt images, and Google Drive didn't do a good job on them. I am planning to compile the 3.03 version on the server and compare the results again then. On Sunday, 13 July 2014 07:50:37 UTC-7, peiman F. wrote: > > unforgettably bad Different... > for english with 3.02 the best perform

[tesseract-ocr] Re: is tesseract 3.03's source tar available? need to compile on CentOS 5.6

2014-07-14 Thread Jing JC
lso need to update the Leptonica > <http://www.leptonica.com/> image processing library. > I compiled it on Ubuntu 12.04 using the tesseract compiling instructions > <https://code.google.com/p/tesseract-ocr/wiki/Compiling>. > > Good luck, > > Chris > >

[tesseract-ocr] is tesseract 3.03's source tar available? need to compile on CentOS 5.6

2014-07-12 Thread Jing JC
Google's tesseract download page lists only versions up to 3.02. I need to compile tesseract on CentOS 5.6. Where is the download link for tesseract 3.03, or is it not available yet? Thank you.

[tesseract-ocr] Re: How to apply user patterns

2014-07-11 Thread Jing JC
I have the same question. Any answers? I tried to make tesseract match the words in my own customized user-words, but it returned the same result. I cannot see any effect from user-words and user-patterns. On Tuesday, 3 June 2014 03:54:24 UTC-7, Christopher Smeenk wrote: > > I would am a

[tesseract-ocr] Tesseract bazaar option: how to make tesseract match the words in user-words first?

2014-07-11 Thread Jing JC
The image I am going to pass to tesseract is CREDIT.png, using the command: tesseract CREDIT.png output bazaar. I put cred1t into user-words. I changed the letter 'I' to '1' on purpose to see whether tesseract can match th