I ran test_page.png through VietOCR 3.1 with Screenshot Mode enabled
and got acceptable results back. Since it's a Java program, it
certainly can run on OS X, provided that you build the Tess engine.
And if Ghostscript is installed, VietOCR can read PDF too.
On Feb 18, 10:54 am, Bob Kuo wrote:
>
Thanks everyone! I tried it again, got a slightly different section
from the original PDF and saved it as a PNG with 200 DPI. Then I ran
convert with the following options:
convert -density 200 -units PixelsPerInch -type Grayscale +compress
test2.png test_input2.tif
I had to put in the -density
Hi, I am getting many requests for help on how to make a simple
wrapper and where to find information on the many parameter terms used in the
API. I have asked people to post those questions here. The developers
need feedback to refine the software.
For clarification, I am struggling to understand mysel
I checked it in FreeOCR (which has tess 3.01 alpha) and found the output to be
in order, with a few minor mistakes.
With the help of IrfanView, I increased the resolution from 72 dpi to 300 dpi,
saved it as an uncompressed TIF file, and tested that.
What zdenko says is correct.
-sriranga(78yrs)
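A rough sketch of the arithmetic behind the 72 dpi to 300 dpi step, for anyone resampling in their own code rather than in IrfanView. It assumes the goal is to keep the physical size of the page constant, so the pixel dimensions grow by the ratio of the two resolutions. The function names scaleFactor and scaledSize are made up for illustration and are not part of Tesseract or IrfanView.

```cpp
// Hypothetical helpers (not from any library): compute how many pixels an
// image needs after resampling from one DPI to another at the same
// physical size.

// Ratio by which each pixel dimension must grow, e.g. 300 / 72 is about 4.17.
double scaleFactor(int srcDpi, int dstDpi) {
    return static_cast<double>(dstDpi) / static_cast<double>(srcDpi);
}

// New pixel count for one dimension, rounded to the nearest integer.
// Uses integer arithmetic so the result is exact when the ratio divides evenly.
int scaledSize(int pixels, int srcDpi, int dstDpi) {
    long long scaled = static_cast<long long>(pixels) * dstDpi;
    return static_cast<int>((scaled + srcDpi / 2) / srcDpi);
}
```

For example, scaledSize(720, 72, 300) gives the width a 720 pixel wide, 72 dpi scan would need at 300 dpi. The actual resampling (Lanczos or otherwise) still has to be done by an image tool or library.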
On Fri, Feb 18, 2011 at 9:27 PM, zdenko podobny
Hi,
Just a quick reply:
I tried it on Windows XP with tesseract 3.00 and it produced a bad result
(nothing useful).
IrfanView's information dialog showed that the image has a resolution of
72x72 DPI, which is too low.
So I resampled it (with Lanczos algorithm) from 100% to 300% size, set DPI
to 300 and decrease
great...
--
Regards,
Saurabh Gandhi
On Fri, Feb 18, 2011 at 5:16 PM, Jose wrote:
> you know Saurabh, that was EXACTLY what I was looking for! I couldn't be more
> thankful to you! that line of code changed my life :D
>
> thank you again :)
>
Yes, that's right.
--
Regards,
Saurabh Gandhi
On Fri, Feb 18, 2011 at 4:57 PM, Jose wrote:
> ok I'll try that! I have to modify this on the tesseractmain.cpp right?
> (I'm using command line execution)
>
> I replace this line : api.SetPageSegMode(tesseract::PSM_AUTO);
> for api.SetPageSegMode
Did you try PSM_SINGLE_COLUMN? I think that is what you need. Could you try
this and let us know how it behaves, please?
PSM_SINGLE_COLUMN, ///< Assume a single column of text of variable sizes.
--
Regards,
Saurabh Gandhi
On Fri, Feb 18, 2011 at 4:29 PM, Jose wrote:
> Is there no other work
Hello Jose,
Setting the mode to PSM_SINGLE_BLOCK or PSM_SINGLE_LINE will not force
horizontal reading. These modes will just assume that your input image
itself is segmented and consists of just a single line. So, if you want
horizontal reading you will have to segment your image and provide it to
You can simply use this in your program just after init to set the whitelist /
blacklist:
api.Init(argv[0], lang, &(argv[arg]), argc-arg, false);
api.SetVariable("tessedit_char_whitelist",
                "ABCDEFGHIJKLMNOPQRSTUVWXYZ.0123456789 ");
--
Regards,
Saurabh Gandhi
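As a small illustration of the whitelist string above, here is a sketch that builds the same set of characters (A-Z, dot, 0-9, space) in code instead of typing it out. The helper name buildWhitelist is hypothetical, and the commented-out call assumes an already initialised tesseract::TessBaseAPI object named api, as in the snippet above.

```cpp
#include <string>

// Hypothetical helper: assemble the whitelist "A-Z . 0-9 space" in the same
// order as the literal string used in the thread.
std::string buildWhitelist() {
    std::string chars;
    for (char c = 'A'; c <= 'Z'; ++c) {
        chars += c;          // upper-case letters
    }
    chars += '.';            // dot
    for (char c = '0'; c <= '9'; ++c) {
        chars += c;          // digits
    }
    chars += ' ';            // space
    return chars;
}

// Assumed usage, right after api.Init(...):
//   api.SetVariable("tessedit_char_whitelist", buildWhitelist().c_str());
```

Building the string this way mainly helps when the allowed set changes often; for a fixed set, the literal string is just as good.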
"Customise the tesseract engine to recognize only the characters from
A-Z, 0-9, . (dot), (space) by setting the character white-list."
Kindly tell me the name of the folder in which the whitelist and the
blacklist are located. I want to use the same for Kannada scripts.
-sriranga(78yrs)
On Fr