You might want to try preprocessing with a threshold filter (Otsu threshold) to
harden the edges?
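For illustration only, here is a minimal pure-Python sketch of how an Otsu threshold can be computed and applied. The function names and the list-of-rows image layout are assumptions made for this example; in practice a library routine such as OpenCV's cv2.threshold with the THRESH_OTSU flag would normally be used instead.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for an iterable of 8-bit grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue  # one class is empty, so there is no split to score
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(image, threshold):
    """Map every pixel above the threshold to white (255), the rest to black (0)."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


image = [[10, 12, 200], [11, 210, 205]]
t = otsu_threshold(p for row in image for p in row)
hardened = binarize(image, t)
```

The threshold is chosen to maximise the separation between the dark and light pixel groups, which is why it tends to sharpen fuzzy text edges.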
Sent from my iPhone
> On 6 Apr 2017, at 10:16, Javier Abascal wrote:
>
> Hi everyone! :)
>
> I am having trouble correctly identifying the text in the images
The whole point of a captcha is to evade automated reading. That's why letters
are very close together and letters are heavily rotated off a consistent
baseline. OCR is designed for normal text input so you need to do clever
preprocessing here first.
Sent from my iPhone
> On 2 Jan 2017, at
Not sure it can but I wondered whether the scope of your legal regulation would
allow:
1. Encrypt the user words file or store it in your source code
2. In your wrapper program, just before tesseract api init, decrypt the file to
tessdata
3. Init tesseract pointed to this file
4. Perform ocr
5.
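The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: the `decrypt` callable is hypothetical (supply your own cipher), the `eng.user-words` file name is an assumption about the tessdata layout, and deleting the decrypted copy afterwards is this sketch's own choice, since the original list is cut off at step 5.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def decrypted_user_words(encrypted, decrypt, filename="eng.user-words"):
    """Yield the path of a temporary decrypted word list, then delete it.

    `decrypt` is a hypothetical callable taking and returning bytes.
    """
    workdir = tempfile.mkdtemp(prefix="tessdata-")
    path = os.path.join(workdir, filename)
    try:
        with open(path, "wb") as fh:
            fh.write(decrypt(encrypted))
        yield path  # initialise Tesseract against this file and run OCR here
    finally:
        # Assumption of this sketch: remove the plaintext copy when done.
        shutil.rmtree(workdir, ignore_errors=True)
```

The plaintext word list still exists on disk while OCR runs, so whether this satisfies a given regulation is a question for whoever owns that requirement.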
eract to
> recognize?
>
>> On Thursday, November 10, 2016 at 1:03:43 PM UTC-8, Allistair C wrote:
>> What is it you are trying to achieve exactly?
>>
>>> On 10 November 2016 at 18:02, JF <jimfa...@gmail.com> wrote:
>>> I'm using Tesseract (3.04.01 with lepton
Do you have a sample image?
Sent from my iPhone
> On 19 Aug 2016, at 20:33, Lucas Alexandre wrote:
>
>
> Hello,
>
> I am a new member of this mailing list. I am creating a small project to read
> electronic screens through OCR. In other words, we set up some
Depends what part of the input image you are interested in?
Sent from my iPhone
> On 27 Jul 2016, at 16:28, Dorin Bujor wrote:
>
>
> input.jpg
>
>
>
>
>
>
> out.txt:
>
>
> Ansamhhll River"s Towers- mnel.na:he@f|deliacasa.m - Fideliacasa Mail -
> Goagle chrome
>
No idea what the best is, but a Google search lists a number of providers of
such tools:
Google for 'bank statement ocr'
You should see results like statement reader and smartex, for instance.
Cheers
Sent from my iPhone
> On 20 Jul 2016, at 03:58, Dave Burleigh wrote:
>
Have you tried resizing your image to be larger? Try 2x larger - it can
sometimes help. Is this happening to all Ms or just one?
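As a rough illustration of what "resizing larger" does, here is a small bilinear upscaling sketch in plain Python. The function name and the list-of-rows grayscale layout are assumptions for this example; an image library's resize routine would normally be used instead.

```python
def upscale_bilinear(image, factor):
    """Upscale a grayscale image (rows of 0-255 ints) by an integer factor."""
    src_h, src_w = len(image), len(image[0])
    out = []
    for y in range(src_h * factor):
        # Map the destination pixel centre back into source coordinates.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(src_w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four neighbouring source pixels.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


doubled = upscale_bilinear([[0, 255], [0, 255]], 2)
```

Upscaling adds no new detail, but it gives each character more pixels, which is often what the recogniser needs for small text.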
Sent from my iPhone
> On 14 Jul 2016, at 03:44, Raphael Budd wrote:
>
> So I added really strong pre processing that chops up the schedule,
Preprocess with OpenCV before passing the image to Tesseract.
Sent from my iPhone
> On 6 Jul 2016, at 13:46, Mitesh Kalal wrote:
>
> I just started working with Tesseract. I am working on thresholding. How do I
> give input and get an output image with the Otsu thresholding method?
>
iyar
>
>> On Thursday, 16 June 2016 03:41:36 UTC+5:30, Allistair C wrote:
>> Hi,
>>
>> Your question is a little difficult to understand - it sounds like you are
>> saying on the one hand you have no OCR or image processing background, know
>> Java, and want to modi
You have not included the full stack trace, so you have not shown the error you
are getting, only the root call loading leptonica (did you include that lib?).
Try sending the full stack trace.
Sent from my iPhone
> On 7 Apr 2016, at 21:39, Can wrote:
>
> Hi everyone. I have
I think your whole document needs enough surrounding margin - I found the empty
page issue when my text was too close to the page edges. In your first image
you have this but not your second.
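A minimal sketch of adding such a margin, assuming a grayscale image stored as a list of rows with 255 as white (the function name and layout are illustrative, not from any library):

```python
def add_margin(image, margin, fill=255):
    """Surround a grayscale image (list of rows) with a border of `fill` pixels."""
    width = len(image[0]) + 2 * margin
    padded = [[fill] * width for _ in range(margin)]          # top border
    for row in image:
        padded.append([fill] * margin + list(row) + [fill] * margin)
    padded.extend([fill] * width for _ in range(margin))      # bottom border
    return padded


page = add_margin([[0]], 2)
```

How wide the margin needs to be is something to find by experiment on your own pages.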
Sent from my iPhone
> On 26 Oct 2015, at 18:30, Daniel Kraft wrote:
>
> Hi all!
Can you describe the problem in more detail? What are your results looking like? What is the
target text you are trying to recognise?
> On 30 Sep 2015, at 16:27, George Tsai wrote:
>
>
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr"
Use opencv pattern matching
Sent from my iPhone
On 22 May 2015, at 02:35, SRguy sanderatla...@gmail.com wrote:
Might Tesseract be trained to recognize emoticons, such as the new iPhone
ones?
Thanks.
Try resampling your image up to 5x larger and try again.
Sent from my iPhone
On 2 May 2015, at 00:01, Martín Ochoa 8amar...@gmail.com wrote:
Hi,
I'm developing an app that will have to read text from image in order to do
some things that have nothing to do with my question. So I have that
What Tom said.
However, let's assume all your variables are constant - resolution has to
be just what you have, file format has to be TIF etc. then you can use a
divide and conquer distributed computing pattern. That is, grab a machine
that holds a queue of work and then make that queue farm
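A single-machine sketch of the same farm-out idea, using a thread pool as the work queue. The `ocr_one` callable is hypothetical (for example, a function that shells out to tesseract for one file); spreading work across several machines would need a real message queue instead.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_batch(paths, ocr_one, workers=4):
    """Run `ocr_one` over every path in parallel and return {path: text}.

    `ocr_one` is a hypothetical callable that performs OCR on one file.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with paths.
        texts = pool.map(ocr_one, paths)
        return dict(zip(paths, texts))


results = ocr_batch(["a.tif", "b.tif"], lambda p: p.upper(), workers=2)
```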
Of course, it's up to you which image or part thereof you send to tesseract.
You just need to use your vb image processing libraries to create a new image
from a rectangular region of the source image.
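The idea is the same in any language. Here it is sketched in Python rather than VB, assuming an image stored as a list of pixel rows; the function name and argument order are illustrative.

```python
def crop(image, left, top, width, height):
    """Return a new image holding only the given rectangle of the source."""
    if left < 0 or top < 0 or width <= 0 or height <= 0:
        raise ValueError("rectangle must have a positive size and origin >= 0")
    if top + height > len(image) or left + width > len(image[0]):
        raise ValueError("rectangle extends outside the image")
    return [row[left:left + width] for row in image[top:top + height]]


source = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
region = crop(source, left=1, top=1, width=2, height=2)
```

The cropped region is then what gets passed to Tesseract in place of the full image.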
Sent from my iPhone
On 25 Mar 2015, at 22:07, Faissal Bouetire bouet...@gmail.com wrote:
I think you must be pulling our leg. Either that or you are still mistakenly
sending a jcpenney logo into OCR.
Sent from my iPhone
On 17 Feb 2015, at 07:20, pgpur...@gmail.com wrote:
Hi ,
I have tried to detect logo text from Kohl's logo attached herewith, but it
returns JCPenney. Can
I would personally use opencv rather than IM. It has more sophisticated
routines to build on.
http://stackoverflow.com/questions/16746473/opencv-find-bounding-box-of-largest-blob-in-binary-image
Sent from my iPhone
On 8 Feb 2015, at 00:02, Josh Wolcott jswolc...@gmail.com wrote:
You know
be an out of box solution to command line OCR.
The project was going swimmingly until I actually got to this. My patience
is beginning to wane =(
On Sunday, February 8, 2015 at 4:23:25 AM UTC-5, Allistair C wrote:
I would personally use opencv rather than IM. It has more sophisticated
, February 8, 2015 at 7:11:28 AM UTC-5, Allistair C wrote:
Could you upload a scanned card at the resolution and angle that you tried
without success?
Sent from my iPhone
On 8 Feb 2015, at 12:05, Josh Wolcott jswo...@gmail.com wrote:
I will look in to opencv. Thank you.
I spent many
blank or total random. I have to
identify the blob somehow, and I cannot get opencv to download from any
mirror... what the heck. This project keeps getting better.
On Sunday, February 8, 2015 at 7:45:59 AM UTC-5, Allistair C wrote:
If you butt them up against each other horizontally
Don't waste your time on splicing and rotating. Focus on a reliable scan setup
for cropping. Tesseract already handles a degree of rotation correction; your
issue is all the noise, so focus on that.
Sent from my iPhone
On 8 Feb 2015, at 19:19, Josh Wolcott jswolc...@gmail.com wrote:
I've
One option is to try a different PSM mode - 6 may work well.
Or you have a card which is great because it means you have repeatable areas of
text. Processing the card into cropped areas is possible if your scanning is
controlled. Look at what http://card.io do to see an example of getting a good
, Allistair C wrote:
At what point will you use Google to answer these simple questions? OpenCV
has already been mentioned many times.
Sent from my iPhone
On 22 Jan 2015, at 18:39, newbie spens.ma...@gmail.com wrote:
Any idea of what free source is available for binarizing in Java
At what point will you use Google to answer these simple questions? OpenCV has
already been mentioned many times.
Sent from my iPhone
On 22 Jan 2015, at 18:39, newbie spens.mallang...@gmail.com wrote:
Any idea of what free source is available for binarizing in Java?
Thanks
On
These are usually because libpng/libtiff etc. are not present; did you confirm
that leptonica installed those dependencies?
Sent from my iPhone
On 21 Jan 2015, at 05:56, Purohith Nayak purohith...@gmail.com wrote:
Hi,
I installed leptonica then tesseract and everything went well, but
://www.dropbox.com/s/w2r2kp5is96oh2t/faked.jpg?dl=0
Cheers
On Monday, 12 January 2015 15:34:12 UTC, Allistair C wrote:
Even totally cleaned up of the surrounding frame and gradiented backdrop
on the screen, Tesseract does not recognise the large numbers for me. That
may mean you need to acquire
Sorry wrong clean
image: https://www.dropbox.com/s/s7nzdqapr75yr23/clean.jpg?dl=0
On Monday, 12 January 2015 15:40:55 UTC, Allistair C wrote:
Just to back that up some more ...
Clean: did not work at all
https://www.dropbox.com/s/jz4e8mm9onga9md/code.png?dl=0
Clean with some paintwork
asap
2015-01-08 1:24 GMT+02:00 Gokcer Gunes ggunes...@gmail.com:
Ah no, it's not noise. There is no noise in the original image; it is just
a result of the crop in Paint.
2015-01-08 1:21 GMT+02:00 Allistair C allist...@gmail.com:
Ah, I see - interesting :) The 2nd example isn't quite the
same - it seems
the issue is?
On 7 January 2015 at 22:48, Gokcer Gunes ggun...@gmail.com
wrote:
I uploaded them as pictures.
2015-01-08 0:29 GMT+02:00 Gokcer Gunes ggun...@gmail.com:
Yeah, the result pictures are in the message. Can't you see them?
2015-01-07 23:56 GMT+02:00 Allistair C
Your question is not self-evident, what are you trying to ask? Can you
present your OCR results for each test you are conducting?
On Monday, 5 January 2015 18:16:12 UTC, Gokcer Gunes wrote:
https://lh3.googleusercontent.com/-QtKcTsT8fGY/VKrU2Bz4zZI/AD4/nCbru06vKac/s1600/testry2.png
The ... is formally called an ellipsis and I can find nothing useful
Googling except that somebody has tried using OpenCV object/feature
detection to try and look for this. The only possible way I can imagine
getting Tesseract to recognise an ellipsis is to train it where 3 full
stops appear
You've tried unicharambigs, right? (See the bottom of this page:
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3)
On Thursday, 20 November 2014 12:53:43 UTC, Mark Beylis wrote:
Hello
I am making use of Tesseract OCR to perform number plate recognition on
vehicles
I am making use of
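// Initialise the engine; the data path must contain the tessdata directory.
baseApi.init(filesDir.getPath() + "/tesseract/", LANG);
// Treat the image as a single uniform block of text.
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_BLOCK);
baseApi.setImage(bmp);
OCRResult result = new OCRResult(baseApi.getUTF8Text(),
baseApi.meanConfidence());
// Release native resources.
baseApi.end();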
Note OCRResult is my own object for holding values.
I wonder if there is anything consistent about the invoice design?
For instance I notice that your invoice has Honda logos on the top left
and top right essentially providing 2 anchors from which you could
extrapolate resolution and location/orientation of the table of data.
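As a sketch of the two-anchor idea: given where the two logos sit in a reference layout and where they were found in a scan, scale and rotation follow from the vector between them. All names and coordinates here are hypothetical, and locating the logos in the scan is assumed to be done separately.

```python
import math


def estimate_transform(template_a, template_b, found_a, found_b):
    """Return (scale, rotation_degrees) mapping the template onto the scan."""
    tdx, tdy = template_b[0] - template_a[0], template_b[1] - template_a[1]
    fdx, fdy = found_b[0] - found_a[0], found_b[1] - found_a[1]
    scale = math.hypot(fdx, fdy) / math.hypot(tdx, tdy)
    rotation = math.degrees(math.atan2(fdy, fdx) - math.atan2(tdy, tdx))
    return scale, rotation


def template_to_scan(point, template_a, found_a, scale, rotation):
    """Map a template coordinate (e.g. a table corner) into scan coordinates."""
    theta = math.radians(rotation)
    dx, dy = point[0] - template_a[0], point[1] - template_a[1]
    x = found_a[0] + scale * (dx * math.cos(theta) - dy * math.sin(theta))
    y = found_a[1] + scale * (dx * math.sin(theta) + dy * math.cos(theta))
    return x, y


# Hypothetical logo positions: reference layout vs. a scan at twice the size.
scale, rotation = estimate_transform((0, 0), (100, 0), (10, 20), (210, 20))
table_corner = template_to_scan((50, 30), (0, 0), (10, 20), scale, rotation)
```

With the table's position known in the reference layout, the mapped coordinates give a crop rectangle for the table in each new scan.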
You could also
I think the table lines are not helping.
I up-sized your image to 1000px wide, then ran it through Tesseract with PSM=6
and got mostly rubbish.
Then I removed the table lines manually in Photoshop, up-sized the
image to 1000px wide again, and ran it through Tesseract with PSM=6:
RFZBHMEDBSR
R 134a/
Do you have higher resolution images to work with - that's one issue going
on here as the edges of your text are very fuzzy and at that resolution
it's pretty hard for Tesseract. You can also play with Thresholding and
Opening (Erosion/Dilation) to thicken some of your lines up (using e.g.
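For reference, a small pure-Python sketch of these morphological operations on a binary image, where 1 is text and 0 is background. The 3x3 neighbourhood and the function names are assumptions for illustration. Dilation is the step that thickens strokes, while opening (erosion followed by dilation) removes small specks.

```python
def _morph(image, pick):
    """Apply `pick` (min or max) over each pixel's 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(y - 1, 0), min(y + 2, h))
                      for i in range(max(x - 1, 0), min(x + 2, w))]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(image):
    """Grow foreground (1) regions by one pixel in every direction."""
    return _morph(image, max)


def erode(image):
    """Shrink foreground (1) regions by one pixel in every direction."""
    return _morph(image, min)


def opening(image):
    """Erode then dilate: removes specks smaller than the 3x3 window."""
    return dilate(erode(image))


speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
thickened = dilate(speck)
cleaned = opening(speck)
```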
38 matches