[tesseract-ocr] Limit on number of whitelist characters

2014-09-09 Thread Reuben L.
Hi all experts, I would like to clarify if there is a limit to the number of whitelisted characters when using the *tessedit_char_whitelist* parameter in the config file. In my case, I noticed that once the number of whitelisted characters exceeds ~1300, an error read_params_file: parameter

[tesseract-ocr] compile error under ubuntu 14.04

2014-09-09 Thread Shree Devi Kumar
Srirangaji tried to build the current git source on Ubuntu 14.04 and is getting an error. I downloaded the same version and am able to compile it cleanly on Windows 8 under MSYS2. Here is the difference in the make.log where the errors occur: On Ubuntu 14.04 mv -f

[tesseract-ocr] How to remove small fonts in Images

2014-09-09 Thread Dineshkumar
What steps will reproduce the problem? 1. Use Tesseract OCR on any platform. 2. Use an image that has bullet numbering in a small font. 3. The output contains the numbering. What is the expected output? What do you see instead? Expected an output without numbering, i.e., How to remove the letters

[tesseract-ocr] Tesseract recognizes the characters irrespective of the lines

2014-09-09 Thread Dineshkumar
What steps will reproduce the problem? 1. Run the Tesseract OCR in Java for the attached image 2. Save the OCR result in a text file 3. Check the order of the output text file with the attached image. What is the expected output? What do you see instead? Expected output -- Expected the result

Re: [tesseract-ocr] Version 3.03 windows compilation

2014-09-09 Thread Paul
You need to install the corresponding files into the directories tessdata/configs/ and tessdata/. Paul On Monday, 8 September 2014 17:46:34 UTC+2, Cristovão Oliveira wrote: Hi, I was able to compile successfully on Windows (revision 1123). Now I am testing version 3.03, specifically pdf

[tesseract-ocr] Re: compile error under ubuntu 14.04

2014-09-09 Thread shree
Also filed as an issue with additional information and log files https://code.google.com/p/tesseract-ocr/issues/detail?id=1307start=100 -- You received this message because you are subscribed to the Google Groups tesseract-ocr group. To unsubscribe from this group and stop receiving emails

[tesseract-ocr] Re: compile error under ubuntu 14.04

2014-09-09 Thread Jeff Breidenbach
This error comes from Leptonica 1.70. Tesseract now requires Leptonica 1.71. Leptonica 1.71 can be installed manually (but not so easily) and will ship with Ubuntu for their 14.10 release scheduled for October 23 of this year.

Re: [tesseract-ocr] Re: compile error under ubuntu 14.04

2014-09-09 Thread Shree Devi Kumar
Thanks, Jeff. Zdenko also indicated in a private email that Sriranga Ji may have another (older) version of Leptonica installed. He does have Leptonica 1.71, which he compiled recently and which is also reported by the Tesseract version he compiled in May. dell14-04@dell1404-OptiPlex-330:~$ tesseract -v

Re: [tesseract-ocr] Is any one working on a deep neural net implementation?

2014-09-09 Thread Barrie Treloar
Are you making any progress with this? On Monday, June 9, 2014 2:52:49 AM UTC+9:30, Debayan wrote: Ok I got it https://tesseract-ocr.googlecode.com/files/boxtiff-2.01.eng.tar.gz On 6 June 2014 15:17, Debayan Banerjee deba...@gmail.com wrote: Nick, Can you point me to a