Re: what am i missing? tesseract runs but no output

2011-02-18 Thread Quan Nguyen
I ran test_page.png through VietOCR 3.1 with Screenshot Mode enabled and got acceptable results back. Since it's a Java program, it certainly can run on OS X, provided that you build the Tess engine. And if Ghostscript is installed, VietOCR can read PDF too. On Feb 18, 10:54 am, Bob Kuo wrote: >

Re: what am i missing? tesseract runs but no output

2011-02-18 Thread Bob Kuo
Thanks everyone! I tried it again, got a slightly different section from the original PDF and saved it as a PNG with 200 DPI. Then I ran convert with the following options: convert -density 200 -units PixelsPerInch -type Grayscale +compress test2.png test_input2.tif I had to put in the -density
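For anyone scripting this step, here is a minimal sketch that wraps the same ImageMagick call. The flags are exactly those quoted in the message; the helper names `build_convert_cmd` and `run_convert` are made up for illustration, and it assumes `convert` is on the PATH.

```python
import shutil
import subprocess


def build_convert_cmd(src, dst, dpi=200):
    """Return the argv list for the ImageMagick call quoted in the message."""
    return [
        "convert",
        "-density", str(dpi),       # resolution to record in the output
        "-units", "PixelsPerInch",  # interpret -density as DPI
        "-type", "Grayscale",       # drop colour information
        "+compress",                # write an uncompressed TIFF
        src,
        dst,
    ]


def run_convert(src, dst, dpi=200):
    """Run the conversion; raises if ImageMagick is missing or the call fails."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' not found on PATH")
    subprocess.run(build_convert_cmd(src, dst, dpi), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_convert() to actually execute it.
    print(" ".join(build_convert_cmd("test2.png", "test_input2.tif")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.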

Re: Wrappers for tessearct3.01?

2011-02-18 Thread devTess
Hi, I am getting many requests for help on how to make a simple wrapper and on where to find information about the many parameter terms used in the API. I have asked them to address the questions here. The developers need feedback to refine the software. For clarification, I am struggling to understand mysel

Re: what am i missing? tesseract runs but no output

2011-02-18 Thread Sriranga(78yrsold)
I checked in FreeOCR (which has tess 3.01 alpha) and found the output to be in order, with a few minor mistakes. With the help of IrfanView, I increased the resolution from 72 dpi to 300 dpi, saved the image as an uncompressed TIF file and tested it. What zdenko says is correct. -sriranga(78yrs) On Fri, Feb 18, 2011 at 9:27 PM, zdenko podobny

Re: what am i missing? tesseract runs but no output

2011-02-18 Thread zdenko podobny
Hi, Just a quick reply: I tried it on Windows XP with tesseract 3.00 and it produced a bad result (nothing useful). IrfanView's information dialog showed that the image has a resolution of 72x72 DPI -> too low... So I resampled it (with the Lanczos algorithm) from 100% to 300% size, set DPI to 300 and decrease
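The arithmetic behind that resampling step can be sketched as follows. The 600x200 pixel size is a hypothetical example (the real dimensions of the test image are not given in the thread), and the helper names are made up for illustration.

```python
def resampled_size(width_px, height_px, scale_percent):
    """Pixel dimensions after resampling to scale_percent of the original."""
    factor = scale_percent / 100.0
    return round(width_px * factor), round(height_px * factor)


def physical_size_inches(width_px, height_px, dpi):
    """Printed size implied by a pixel size and a DPI tag."""
    return width_px / dpi, height_px / dpi


if __name__ == "__main__":
    # Hypothetical 600x200 px crop tagged as 72 DPI.
    w, h = resampled_size(600, 200, 300)      # resample from 100% to 300%
    print(w, h)                               # 1800 600
    print(physical_size_inches(w, h, 300))    # (6.0, 2.0)
```

The point of the sketch is that resampling changes the number of pixels per character, while the DPI value is only a tag that tells the reader how to interpret those pixels.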

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
great... -- Regards, Saurabh Gandhi On Fri, Feb 18, 2011 at 5:16 PM, Jose wrote: > you know Saurabh, that was EXACTLY what I was looking for! I couldn't be more > thankful to you! that line of code changed my life :D > > thank you again :) > -- You received this message because you are subsc

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Yes, that's right. -- Regards, Saurabh Gandhi On Fri, Feb 18, 2011 at 4:57 PM, Jose wrote: > ok I'll try that! I have to modify this in tesseractmain.cpp, right? > (I'm using command line execution) > > I replace this line : api.SetPageSegMode(tesseract::PSM_AUTO); > for api.SetPageSegMode

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Did you try PSM_SINGLE_COLUMN? I think that is what you need. Could you try this and let us know how it behaves, please? PSM_SINGLE_COLUMN, ///< Assume a single column of text of variable sizes. -- Regards, Saurabh Gandhi On Fri, Feb 18, 2011 at 4:29 PM, Jose wrote: > Is there no other work

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
Hello Jose, Setting the mode to PSM_SINGLE_BLOCK or PSM_SINGLE_LINE will not force horizontal reading. These modes will just assume that your input image itself is segmented and consists of just a single line. So, if you want horizontal reading you will have to segment your image and provide it to
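The message is cut off before it says how to segment the image, so the following is only one possible approach, not necessarily what Saurabh had in mind: a horizontal projection profile that finds bands of rows containing ink. It assumes the image is already binarized into a list of rows of 0/1 values (1 = ink); `split_into_lines` and `crop_rows` are hypothetical helper names.

```python
def split_into_lines(image, min_ink=1):
    """Find text-line bands in a binarized image.

    image: list of rows, each a list of 0/1 values (1 = ink).
    Returns a list of (top, bottom) row ranges, bottom exclusive.
    """
    bands = []
    start = None
    for y, row in enumerate(image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # first inked row of a new band
        elif not has_ink and start is not None:
            bands.append((start, y))       # blank row closes the band
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(image)))
    return bands


def crop_rows(image, band):
    """Return the rows of one band as a separate single-line image."""
    top, bottom = band
    return image[top:bottom]


if __name__ == "__main__":
    page = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
        [0, 0, 0],
        [1, 0, 1],
        [0, 0, 0],
    ]
    print(split_into_lines(page))          # [(1, 3), (5, 6)]
```

Each cropped band could then be handed to the engine as its own image with a single-line page segmentation mode. Skewed or touching lines would need something more robust than this sketch.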

Re: Customising Tesseract for character recognition

2011-02-18 Thread Saurabh Gandhi
You can simply use this in your program just after init to set whitelist / blacklist: api.Init(argv[0], lang, &(argv[arg]), argc-arg, false); api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ.0123456789 "); -- Regards, Saurabh Gandhi
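For command-line use without recompiling, the same `tessedit_char_whitelist` variable can, as far as I know, also be set from a config file named after the output base on the `tesseract` command line. The sketch below only writes such a file and builds the argument list. The file name `whitelist_cfg` and the helper names are hypothetical, and the space character from the original whitelist is left out because it is unclear how a trailing space in a config line is parsed.

```python
import os
import tempfile

# Characters from the message, minus the trailing space (see note above).
WHITELIST = "ABCDEFGHIJKLMNOPQRSTUVWXYZ.0123456789"


def write_whitelist_config(path, chars=WHITELIST):
    """Write a one-line config file in 'variable value' form."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("tessedit_char_whitelist " + chars + "\n")


def build_tesseract_cmd(image, outputbase, lang, config_path):
    """argv for: tesseract image outputbase -l lang configfile"""
    return ["tesseract", image, outputbase, "-l", lang, config_path]


if __name__ == "__main__":
    cfg = os.path.join(tempfile.mkdtemp(), "whitelist_cfg")
    write_whitelist_config(cfg)
    # Only prints the command; nothing is executed here.
    print(" ".join(build_tesseract_cmd("test_input2.tif", "out", "eng", cfg)))
```

Whether a given build picks up the config file this way should be checked against its own usage message before relying on it.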

Re: Customising Tesseract for character recognition

2011-02-18 Thread Sriranga(78yrsold)
"Customise the tesseract engine to recognize only the characters A-Z, 0-9, . (dot) and (space) by setting the character white-list" Kindly tell me the name of the folder in which the whitelist and the blacklist are located. I want to use the same for Kannada scripts. -sriranga(78yrs) On Fr