a
solution around it. It's reliable and well tailored for a class of problems
such as yours. It is now used as a workhorse by a few of our clients.
Look at www.CustomOCR.com. ScreenMine is not announced there yet, but there
you can find a contact form to reach us.
Regards,
Dmitri Silaev
On Fri, May 4
hem about it here:
> https://github.com/DanBloomberg/leptonica/issues/251
>
> Thank you so much!
>
> On Monday, October 23, 2017 at 11:58:22 AM UTC-4, Dmitri Silaev wrote:
>>
>> Image handling in Tesseract is done with Leptonica. I have little
>> knowledg
r 21, 2017 at 5:12:01 PM UTC-4, Dmitri Silaev wrote:
>>
>> Without delving deeper, I can suggest that you probably need to
>> investigate your image EXIF orientation value. Most image handling
>> libraries respect it. I suppose your image viewer also supports this
>> pa
m any kind of forum is just an
> ordinary work (think of Stack Overflow: is it unfair to find an idea or a
> piece of code there?). Developing a full solution is a different
> thing, and that is what I will try to do.
>
> thanks for your time.
>
> On Wednesday, October 18,
Wow, we are being taken advantage of. Smart move Paolo but not fair. Heck,
I almost started writing the answer.
On Tue, Oct 17, 2017 at 7:00 PM, Tom Morris wrote:
> I don't suppose this has anything to do with the Top Coder Mud Logger OCR
> contest, does it?
>
Великий <alex.velic...@gmail.com>
wrote:
> Thank you, Dmitri.
>
> Is there a way to optimize speed of recognition? Like disabling OCR to
> seek for some patterns?
>
> On Wednesday, October 11, 2017 at 7:15:01 PM UTC+3, Dmitri Silaev wrote:
>>
>> See my previous answe
r each word, I would
> have already solved the problem.
>
>
> On Saturday, October 14, 2017 at 10:29:29 PM UTC+2, Dmitri Silaev wrote:
>>
>> What are you unhappy with: detection rate or recognition accuracy? All in
>> all, there's a ton of reasons why Tess can work poorly he
OK, I think it's a bunch of fallacies on your part. But let's start from the
beginning. Send the exact image you are passing to Tess, the version of Tess,
your config file, and the command line.
-Dmitri
On Sun, Oct 15, 2017 at 11:42 AM, Dan9er wrote:
> Bump
>
> On Friday,
are going to look for,
- their bounding boxes within your sample image.
Once I have it, I might be able to help.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, Oct 13, 2017 at 9:05 AM, Paolo Giannoccaro <pa.giannocc...@gmail.com
> wrote:
> Hi,
> I need to detect a fixed
, Александр Великий <alex.velic...@gmail.com>
wrote:
> Should simply converting image to grayscale do the trick in such cases
> (bright colored background) or something else may be needed?
>
> On Wednesday, October 11, 2017 at 12:37:12 AM UTC+3, Dmitri Silaev wrote:
>>
> Thank you very much.
>
> Indeed, the images that you provided were successfully parsed by
> Tesseract.
> Could you suggest tools that could programmatically change colors of
> image to get one like yours?
>
> On Wednesday, October 11, 2017 at 12:37:12 AM UTC+3, Dmitri Si
al bar (a cursor?) at the end of this
word. Get rid of the bar, and the results will be just perfect. See
"src_sat-100_nobar.jpg". Perhaps you can make use of ImageMagick's
morphology to remove thin bars and the like.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Oct 10, 2
na ExtraIOMV Diesel 2017-09-3019:38
> 4.930 RON 5.220 RON
>
>
> 2)
> convert in.png -threshold 38.038% out.png
>
> Dataffimp Motorina Standard Motorina Extra/OMV Diesel 2017-09-3019:38
> 4.930 RON 5.220 RON
>
>
> For threshold >= 38,039 'Data/Timp' is correc
d be
happy to help.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, Sep 29, 2017 at 12:12 PM, Ben Schipper <schip...@londonhydro.com>
wrote:
> I am attempting to read a fairly large 6 digit number from an image using
> Tesseract 3.02 on a windows 7 machine.
>
> I have been
s often may confuse
foreground and background pixels - usually foreground is black.
Example command line: tesseract debug_i.png debug_i.png -psm 7
Tested with Tess executable built as of 20150203.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Nov 19, 2015 at 8:04 PM, Sean Leffler <s..
Shishir,
Do not hijack this thread. Go create a separate one with your own question.
-Dmitri
On Sat, Oct 3, 2015 at 10:19 AM, Shishir Singhal wrote:
> sir i am doing a project based on hand written character recognition based
> on google tesseract but i the
ymbols you are after.
The rest is trivial - count tactile symbols and get the denomination of
your bill.
Of course, you'd add more sophistication to cope with real world images but
the backbone of the algorithm looks to me like this. All work is done in
grayscale.
HTH
Best regards,
Dmi
This preprocessing would be needed anyway, even if Tesseract worked for
your "characters". Tell me what you have done so far in this direction
so I can share more details about the above method, if you wish.
-Dmitri
Hi Dmitri Silaev.
Thanks for reply. They are bills, sorry for mis
Hi Juan Pablo,
The problem seems interesting. However, I'm not sure you can use Tesseract
for that. Could you show one or more example tickets?
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Sep 22, 2015 at 2:17 AM, Juan Pablo Aveggio <jpaveg...@gmail.com>
wrote:
> Hello
>
I know it's tempting to use Tesseract as a free off-the-shelf tool, but it
comes at the cost of lower accuracy. What I suggested gives an accuracy close
to 100%.
The choice is yours.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Sep 21, 2015 at 10:26 PM, Keith Reilly <krei...@retroreport
at the
bottom then it's usually alright.
And finally, I suppose Tesseract already has a pretty decent collection of
trained fonts to work with most meter types.
Regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Jul 21, 2015 at 9:40 AM, Marc Bruins marciebru...@gmail.com wrote:
Hello all,
I
23:39:33 UTC+3, Dmitri Silaev wrote:
As the first mandatory step you need to do perspective correction, e.g.
using paper sheet boundaries (is it a lottery ticket?)
Then depending on how it goes further with Tesseract you may need either
to:
- Train for this particular font
- Blur
vertically by a factor of 1.5 to match closer to standard
trained fonts
Each step in turn is a multi-step process. PM me if you're interested.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Jun 29, 2015 at 10:21 PM, Cenk KIZILDAĞ kizildagc...@gmail.com
wrote:
Hi,
I would like
- Cattoni, Coianiz
- Document Structure Analysis Algorithms - 2003 - Mao, Rosenfeld, Kanungo
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Jun 1, 2015 at 7:43 PM, S Kirkwood smkirkwood4...@gmail.com wrote:
Thank you for the response Dmitri.
It is reassuring to know that this can
. Be
inventive. Decent accuracy can be achieved. You should accept, though, an
accuracy rate of less than 100%.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, May 29, 2015 at 10:57 PM, S Kirkwood smkirkwood4...@gmail.com
wrote:
Hi, I am working on a project that requires OCR. I have not used
You won't get any improvement just by changing a few params. More complex
processing is required. Let me know if you're interested in more details.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, May 28, 2015 at 8:50 AM, supriya Das supriya.i...@gmail.com wrote:
Hello Everybody
I don't know of any such params. Even if they existed, I'm pretty sure that
would be quite an unreliable solution. In my opinion, just stick with the
solution you found yourself: split into fragments.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, May 27, 2015 at 6:00 PM, Brad brad.s
by programming but might be done by
means of ImageMagick/shell scripts also.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, May 28, 2015 at 2:47 PM, supriya Das supriya.i...@gmail.com wrote:
Hello Dmitri Siaev,
Thanks for your response. Please tell me the complex processing logic.
Thanks
Show the source image. Show what you have done to get the binarized version.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, May 21, 2015 at 1:55 AM, hj hsje...@gmail.com wrote:
see attached image. have tried various things, including this config:
tessedit_char_whitelist 0123456789
source image and run Tess in the single char PSM. I
think it should be easy as long as the location of every character is fairly
stable across your source images. ImageMagick/shell scripts would suffice.
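A minimal sketch of the per-character approach, assuming the character boxes are known in advance. The box coordinates, file names and helper names are hypothetical. The code only assembles ImageMagick and Tesseract 3.x command lines (psm 10 is the single-character mode) and does not run them.

```python
def crop_geometry(x, y, width, height):
    """Format a box as an ImageMagick geometry string, e.g. 20x30+5+7."""
    return f"{width}x{height}+{x}+{y}"


def per_character_commands(image_path, boxes):
    """Build one crop command and one OCR command per character box.

    boxes is a list of (x, y, width, height) tuples in pixels; the values
    used below are made up for illustration.
    """
    commands = []
    for index, (x, y, width, height) in enumerate(boxes):
        crop_path = f"char_{index}.png"
        # Cut the character out; +repage resets the virtual canvas offset.
        commands.append(["convert", image_path, "-crop",
                         crop_geometry(x, y, width, height), "+repage", crop_path])
        # Recognize the crop as a single character (Tesseract 3.x syntax).
        commands.append(["tesseract", crop_path, f"char_{index}", "-psm", "10"])
    return commands


if __name__ == "__main__":
    for argv in per_character_commands("source.png", [(10, 5, 20, 30), (35, 5, 20, 30)]):
        print(" ".join(argv))
```

Newer Tesseract releases spell the switch `--psm`, so the flag may need adjusting for your version.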
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, May 20, 2015 at 12:52 PM, Yoann Nicod th3
20, 2015 at 12:29:08 PM UTC+2, Dmitri Silaev wrote:
One no-brainer method to try would be turning off all dictionaries
and using your own custom user-patterns file. Since you mentioned your
application, I suppose you can program. So you can take a look at the
comment preceding
on the internet
- look for them. They seem to address fonts similar to yours, but in the
end you'd probably need to train yourself.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, May 14, 2015 at 8:17 PM, James Okken jokke...@gmail.com wrote:
Dmitri,
thanks very much for your response. any
for you,
though; it depends on source image specifics. Attach several samples.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, May 13, 2015 at 8:31 PM, James Okken jokke...@gmail.com wrote:
hi everyone.
can tesseract pull the numbers off this thermostat picture attached? I've
tried
Great contribution! Thanks!
-Dmitri
On Wed, May 13, 2015 at 4:41 PM, Ryan Baumann rfbaum...@gmail.com wrote:
I wrote up my experiments with OpenCL-enabled Tesseract here:
http://ryanfb.github.io/etc/2015/03/18/experimenting_with_opencl_for_tesseract.html
On Friday, May 8, 2015 at 3:58:42
result.
tesseract inet009_rs_cr_ts.jpg inet009_rs_cr_ts.jpg -l fra
(inet009_rs_cr_ts.jpg.txt)
Simply cropping out the lower word leads to normal recognition.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, May 2, 2015 at 2:01 AM, Martín Ochoa 8amar...@gmail.com wrote:
Hi,
I'm
)
- Run Tess - perfect
tesseract.exe inet010_ntransp_ts.png inet010_ntransp_ts.png
(inet010_ntransp_ts.png.txt)
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, May 4, 2015 at 1:29 PM, franck dev dev.franck...@gmail.com wrote:
Hi,
I tried with imagemagick:
-colorspace Gray
-negate
. If your other subtitle images have a similar
structure, this method should work regardless of char color.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, May 3, 2015 at 10:31 PM, franck dev dev.franck...@gmail.com wrote:
Hi, I have tried to do ocr on subtitles picture but depending
show
how if you're interested. For some clues on that see my post in this
thread:
https://groups.google.com/forum/#!msg/tesseract-ocr/STHaLGYsiCo/pCT2kxMgwI8J
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Apr 27, 2015 at 9:34 PM, Alexander Pico xanderp...@gmail.com
wrote:
I am trying
- go ahead. No math or other specific
knowledge required.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, Apr 24, 2015 at 1:00 AM, Leah Siddall
leah.sidd...@elementaltechnologies.com wrote:
*mind blown* this is a much better approach!! especially how quickly i
found something like
the point.
You'd do better to invest your time in collecting the score digit
coordinates for each game than in struggling with quirky OCR results.
Well, unless you're eager to.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Apr 23, 2015 at 10:51 PM, Leah Siddall
leah.sidd
,
Dmitri Silaev
www.CustomOCR.com
On Thu, Apr 23, 2015 at 9:05 AM, Leah Siddall
leah.sidd...@elementaltechnologies.com wrote:
Hi all!
I am not having luck with tesseract and the fonts used in NES games like
Super Mario Bros. 3. ( i've attached an example screenshot ).
My goal is scrape
regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Apr 21, 2015 at 6:17 AM, John James ashoutforh...@gmail.com wrote:
Hi All
I am looking for a parameter that sets the minimum acceptable rectangle
size that tesseract will interpret as a character.
For example every character in the image has
It seems you're confusing certainty and confidence here. Please pay
close attention to what you're writing or rephrase your question. The
formula itself allows no values out of the [0, 100] range.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Apr 8, 2015 at 8:37 AM, Gunasekaran Velu
used FastStone Image Viewer's Blur with a parameter of 14.
If you want to use ImageMagick, I don't know exactly how this parameter relates
to its Gaussian blur sigma; you'll have to experiment.
Then a standard command line for Tesseract works well. At least no more 8
vs. 3 errors.
Best regards,
Dmitri Silaev
regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Feb 23, 2015 at 5:35 PM, James Owers james.f.ow...@gmail.com
wrote:
Have cross posted this to StackOverflow:
http://stackoverflow.com/questions/28676158/compiling-tesseract-debugger-to-visualise-region-classification
On Wednesday, 18 February
Do this:
- Use a higher resolution. You can get much better results by upscaling 3x
- Use better image quality and format (lossless TIFF, PNG)
- Get rid of the vertical text at the left
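The first two points might be sketched like this, as a Python helper that assembles an ImageMagick command line. The file names and the helper name are hypothetical. `-resize 300%` is ImageMagick's percentage scaling syntax, and a `.png` target gives lossless output. The code only builds the command and leaves running it to you.

```python
import os


def upscale_command(source_path, factor=3):
    """Build an ImageMagick command that upscales an image and saves it as PNG."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    base, _extension = os.path.splitext(source_path)
    target_path = f"{base}_x{factor}.png"
    return ["convert", source_path, "-resize", f"{factor * 100}%", target_path]


if __name__ == "__main__":
    print(" ".join(upscale_command("scan.jpg")))
```

Note that converting an already compressed JPEG to PNG does not bring back lost detail, so the lossless format helps most when it is used at capture time.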
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, Feb 14, 2015 at 8:29 AM, Gunasekaran Velu mail2vg
with color filtering, line detection and other steps which can
increase accuracy.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 8, 2015 at 12:23 PM, Allistair C allist...@gmail.com wrote:
I would personally use opencv rather than IM. It has more sophisticated
routines to build
Sorry, that should be *drafting tape*
On Sun, Feb 8, 2015 at 8:59 PM, Dmitri Silaev daemons2...@gmail.com wrote:
Well, the computer approach still has a lot of potential, hehe ))
Check this: http://www.fmwconcepts.com/imagemagick/unrotate/index.php
By using this script, you can drop your
own pitfalls. At least you can give it a try.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 8, 2015 at 6:47 PM, Josh Wolcott jswolc...@gmail.com wrote:
I agree. That seems like a very workable solution long term. I will work
on cropping more carefully and look in to a tray
Wow, a negative tray printed by a 3D printer! Cool idea, I like it! Should
make all things simple.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 8, 2015 at 5:43 PM, Allistair allist...@gmail.com wrote:
I agree, this cannot be too difficult to scan them in a repeatable,
oriented
result. And show us your fixed cropping results. I
suppose those should be 3 images per card - the two one liners and the long
description.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 8, 2015 at 3:48 PM, Josh Wolcott jswolc...@gmail.com wrote:
Trust me I tried. Seemed like a simple
, leave others as is.
Through the Cygwin terminal, the script runs like a charm.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 8, 2015 at 10:19 PM, Josh Wolcott jswolc...@gmail.com wrote:
I've seen some of Fred's stuff and he does some impressive work. However,
I have to run
with ImageMagick, feed them to Tesseract one by one et voila!
The text is clear enough to be processed by Tesseract without any further
preprocessing.
OneNote just has a better text detection routine, so that it gets less
confused by graphics.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sat
to place the cards
evenly?
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, Feb 7, 2015 at 10:57 PM, Josh Wolcott jswolc...@gmail.com wrote:
My issue with cropping is that due to the variances in where the images
are I end up with a large variance in the images. I'll attach two examples
Works perfectly out of the box with the latest repository version, even
without restricting it to digits (i.e. a whitelist). What version do you use?
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Feb 2, 2015 at 9:01 PM, Simon Hill simonhill...@gmail.com wrote:
Sorry if this has been asked before
using another OCR engine
that allows separating and tuning the text detection stage. One or more
versions of ABBYY software can do this.
Best regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, Jan 9, 2015 at 11:44 PM, J. Heald j.he...@ucl.ac.uk wrote:
Sorry if last post was TL;DR
But the basic
on. For some types of images
it can probably work. Are you sure you can't remember anything?
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Nov 6, 2013 at 7:31 PM, Andreas Romeyke
art1pi...@googlemail.com wrote:
Hello Dmitri,
On Thursday, October 31, 2013 at 9:01:44 AM UTC+1, Dmitri
environment.
To configure your system's hardware, you'll need a clean machine (or
several of different types) and quite a few experiments to understand CPU
and memory consumption for your types of images.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Oct 31, 2013 at 10:36 AM, Niral Prajapati
Excellent post, Nick! The more I read, the more I felt I had to ask
these questions myself, but didn't yet. I'm afraid, though, many of
them would remain unanswered.
Because after several years of monitoring and asking in this forum I
got used to the feeling that principal developers make only
Nick,
In image processing, ROI usually means region of interest.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Apr 3, 2013 at 11:23 PM, Nick White nick.wh...@durham.ac.uk wrote:
Hi Dmitri,
Can you explain what ROI is in this context please? I'm not familiar
with the term.
Thanks
trying to tweak diverse parameters with no stable
effect, making progress with some images and failing with others.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Apr 3, 2013 at 12:30 AM, Art Solano amscloudn...@gmail.com wrote:
We are looking to use Tesseract for processing travel document
Use page segmentation mode 5, 6 or 7 (the -psm command line switch).
Tesseract's automatic layout analysis fails for this image so you have to
specify the layout manually.
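As a rough illustration, a Python helper that assembles the command line for each of the three modes. The file names and the helper name are hypothetical. The `-psm` spelling matches the Tesseract 3.x releases discussed here; later releases use `--psm`.

```python
import shlex

# The three page segmentation modes named above (Tesseract 3.x numbering).
PSM_MODES = {
    5: "single uniform block of vertically aligned text",
    6: "single uniform block of text",
    7: "single text line",
}


def build_tesseract_command(image_path, output_base, psm):
    """Return the argv list for a run with a manually chosen layout mode."""
    if psm not in PSM_MODES:
        raise ValueError(f"psm {psm} is not covered by this sketch")
    return ["tesseract", image_path, output_base, "-psm", str(psm)]


if __name__ == "__main__":
    for mode in sorted(PSM_MODES):
        argv = build_tesseract_command("page.png", "page_out", mode)
        print(shlex.join(argv), "#", PSM_MODES[mode])
```

Which of the three works best depends on the image, so trying them one after another is usually the quickest check.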
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Feb 20, 2013 at 6:42 PM, Andrea Fontana trik...@gmail.com wrote
You cannot do this with the stock Tesseract. A specifically designed image
processing pipeline needs to be implemented to extract text for subsequent
recognition by Tesseract.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Feb 19, 2013 at 12:05 PM, Romeo Jihara rjih...@gmail.com wrote
to 10% error rate.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Feb 19, 2013 at 10:19 PM, Carlos Antunes cf.antu...@gmail.com wrote:
Hello all,
While generating the TR for a TIF/BOX pair using a large text, there are
some errors when the box cannot be made and hence some
,
Dmitri Silaev
www.CustomOCR.com
On Fri, Jan 11, 2013 at 11:44 AM, wowgreat...@gmail.com wrote:
sometime it takes a long time to run OCR
so how to get the progress during OCR?
thanks !
I'm using Tesseract3.02win7 b4bit VS2010
--
You received this message because you
input for Tesseract. Otherwise, damaged versions of the same characters would
differ greatly, so you'd need to train Tess for every such version. This in
turn would certainly lead to an accuracy drop, and you'd waste much time
struggling with all kinds of OCR issues.
Warm regards,
Dmitri Silaev
to avoid any quirky configurations (shadows, extreme flares, etc.)
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Dec 17, 2012 at 8:02 AM, Neo Song neo.f...@gmail.com wrote:
Dear Dmitri,
There is one thing that confuses me heavily. For a Coaxial light
source, I can get solid stroke
connected. But be prepared that no
perfect character contours can be obtained, like with any other edge
detection procedure.
HTH and good luck!
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Dec 13, 2012 at 1:35 PM, Neo Song neo.f...@gmail.com wrote:
Hi gadv,
I have used SWT
this holds true if your other images do not differ much from
what you've shown here in the forum.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Nov 29, 2012 at 1:07 PM, sascha4j sascha.j...@gmx.net wrote:
thank you for your answer i will take a look at your example and the
leptonica library
Same here. Please share.
Thanks,
Dmitri
On Fri, Nov 16, 2012 at 7:48 PM, Sven Pedersen sven.peder...@gmail.com wrote:
I'm curious to hear about it -- I used to work in the document processing
industry. Please send the info to me.
Thanks,
Sven
On Fri, Nov 16, 2012 at 7:33 AM, José Luis Rey
.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Oct 18, 2012 at 10:43 PM, Andres andrej...@gmail.com wrote:
Thank you very much Dmitri. I'll try it a little more with your hints and if
I arrive to some conclusion I'll let the list know.
A pair of extra questions:
- Do you know
, with no memory hogging (as an answer to other forum
thread), using select parts of it, though.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Oct 17, 2012 at 5:37 PM, Attila Somogyi bmfneum...@gmail.com wrote:
Hello!
My application processes images in a very small interval, about 1 sec
luck. Don't forget about real samples. Please post all correspondence
to the forum.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Oct 15, 2012 at 11:31 PM, Andres andrej...@gmail.com wrote:
Hello fellows,
Sometimes:
‘6’ is recognized as ‘8’, ‘3’ as ‘9’, and some other similar
You can check out the Wiki article on DAWGs to see that the reverse
conversion to word lists generally is not unique.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Jul 29, 2008 at 3:49 PM, Donatas G. dgvirt...@gmail.com wrote:
is it possible to extract/decompile a dawg file? I would
Check if those TIFFs are uncompressed. Surprisingly, sometimes it
fails with uncompressed TIFFs.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Jul 2, 2012 at 7:02 PM, js jigij...@gmail.com wrote:
downloaded code and compiled. upon running tesseract.exe with following
command
C
Actually, your second argument looks strange. It should be the basename
of an output file. You've indicated
C:\Users\username\Desktop\Test_images\test.tif. This can work, but is
this intended? The language argument can be omitted, in which case it
defaults to eng.
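To make the argument order concrete, here is a small Python sketch. The helper name and file names are hypothetical. It assumes that Tesseract appends ".txt" to the output basename for plain-text output, which is why passing "test.tif" as the second argument yields "test.tif.txt".

```python
def tesseract_invocation(image_path, output_base, lang=None):
    """Return (argv, expected_text_file) for a basic Tesseract run.

    The second positional argument is a basename; the ".txt" extension
    is added by Tesseract itself. Leaving lang as None omits -l, so the
    default language (eng) is used.
    """
    argv = ["tesseract", image_path, output_base]
    if lang is not None:
        argv += ["-l", lang]
    return argv, output_base + ".txt"


if __name__ == "__main__":
    argv, text_file = tesseract_invocation("test.tif", "test")
    print(" ".join(argv), "->", text_file)
```

Reusing the image name as the basename is harmless, it just produces a file name with a double extension.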
--
Dmitri
On Mon, Jul 2, 2012 at 7:02 PM, js
words within the document
during recognition. Then it can use them at the second (adaptive)
pass. This helps only when particular words occur
repeatedly in the document.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Mon, Jun 25, 2012 at 4:27 AM, TDG threedaygo...@gmail.com
block_edges() has nothing to do with edge detection. Tesseract does
not use edge detection at all. It first binarizes the entire image, then
extracts connected components (CCs). block_edges() is called to extract CCs'
outlines from a binarized image.
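The binarize-then-extract idea can be illustrated with a generic Python sketch. This is not Tesseract's code: the fixed threshold, the 8-connectivity and the function names are assumptions chosen for the example, and the image is a plain list of rows.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 1 for dark foreground, 0 for background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def connected_components(binary):
    """Return the 8-connected foreground components as sorted (row, col) lists."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for row in range(height):
        for col in range(width):
            if binary[row][col] != 1 or seen[row][col]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(row, col)])
            seen[row][col] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(sorted(pixels))
    return components


if __name__ == "__main__":
    gray = [[0, 255, 255, 0],
            [0, 255, 255, 255],
            [255, 255, 255, 0]]
    for component in connected_components(binarize(gray)):
        print(component)
```

Outline extraction would then trace the boundary of each component, which is a separate step from any gradient-based edge detector.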
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, Jun 23
character having very distinctive shape compared to digits and
preferably of the same width with digits.
You've asked lots of questions but this is what I'd start working with.
HTH
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Jun 20, 2012 at 9:00 AM, TDG threedaygo...@gmail.com wrote
that's related
to naming and notions, though.
What you have shown in your image is not what is produced by
extract_edges() or block_edges(). Those build completely different
structures, similar to what is commonly known as crack-coded CC boundaries.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
,
Dmitri Silaev
www.CustomOCR.com
On Thursday, June 21, 2012 7:06:24 PM UTC+4, islam ibrahim wrote:
Hello
I have a question regarding the font size that Tesseract supports. Is
there a specific size or is it just working whatever font size or even type
used?
Thanks in advance
--
You received
This means that the tord_display_ratings parameter no longer exists in
the current version of Tesseract. You are probably using outdated config files
(inter or matdemo). Try deleting the corresponding line from these files.
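If you would rather script the cleanup, a minimal Python sketch follows. It assumes the usual config layout of one "name value" pair per line, and the helper name is hypothetical. It works on the file contents as a string, so reading and writing the actual config files is left to you.

```python
def drop_parameter(config_text, name):
    """Return config_text without any line that sets the given parameter."""
    kept = []
    for line in config_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            continue  # skip the obsolete setting
        kept.append(line)
    trailing_newline = "\n" if config_text.endswith("\n") else ""
    return "\n".join(kept) + trailing_newline


if __name__ == "__main__":
    sample = "some_other_param 1\ntord_display_ratings 0\n"
    print(drop_parameter(sample, "tord_display_ratings"), end="")
```

The parameter names in the sample, other than tord_display_ratings, are placeholders.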
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thursday, June 21, 2012 1:51:52
baselines and
outlines
- Keep clicking words in the main image view to see their baselines
For details please refer to
http://rdaemons.blogspot.com/2012/06/tesseract-ocr-interactive-debugging.html
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thursday, June 21, 2012 10:28:13 AM UTC
the speckle as a part of character's shape,
and therefore it would be trained incorrectly.
So the best would be to clean up the image before passing it to
Tesseract. You can use ImageMagick or whatever tool you like.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Mar 7, 2012 at 9:11 PM
My bad, I had missed that feature. tessedit_page_number does indeed
let you specify a TIFF page. I can only add a bit of clarification:
the page number is zero-based. The value of -1 (default) instructs
Tesseract to process all TIFF pages.
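Because the zero-based numbering is easy to get wrong, here is a small Python sketch that turns a human (1-based) page number into the config line. The helper name is hypothetical, and the output assumes the usual "name value" config file layout.

```python
def page_number_setting(human_page=None):
    """Return the config line selecting a TIFF page.

    human_page counts from 1, as a person would; None means all pages,
    which corresponds to the default value of -1.
    """
    if human_page is None:
        value = -1
    elif human_page < 1:
        raise ValueError("human_page counts from 1")
    else:
        value = human_page - 1  # Tesseract counts pages from zero
    return f"tessedit_page_number {value}"


if __name__ == "__main__":
    print(page_number_setting(3))
    print(page_number_setting())
```

The resulting line would go into a config file that is passed to Tesseract on the command line.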
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Mar
.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Mar 8, 2012 at 8:32 PM, Paul pafow...@googlemail.com wrote:
Thank you gents that will work for me, I will give it a try. Is there
somewhere I can find some documentation on things like config-page.txt
etc. I have Googled it but am not finding
No, at this time it is not possible to do this via the command line. However,
it can easily be achieved by means of programming.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Mar 7, 2012 at 6:39 PM, Paul pafow...@googlemail.com wrote:
Hi,
Is there any way to instruct tesseract via the command
to resort to
a dictionary or context.
HTH
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Mar 4, 2012 at 10:02 PM, Falke hawk...@flight.us wrote:
My subject looks deceptively like a stupid question -- but it really
isn't:
Supposing you need to recognize a bunch of existing scanned documents
Sure, you can just post a feature request in the Issues section of the
project's web page.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Mar 7, 2012 at 9:42 PM, Paul pafow...@googlemail.com wrote:
Thanks for the info. Do I assume then that it would be a fairly
trivial task
and thresholder.cpp is
new and better documented, so there should be no problem
understanding it after a while.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Thu, Feb 23, 2012 at 12:35 AM, avasilev alxvasi...@gmail.com wrote:
First of all, I beg for excuse if this post appears twice, because I
Jason doesn't seem to be a developer so I think these are no options
for him. Otherwise the choice is limitless including 3rd party image
processing libraries and of course self-made custom algorithms.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Sun, Feb 19, 2012 at 11:41 AM, TP wing
regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, Feb 18, 2012 at 11:43 PM, Jason Funk jasonlf...@gmail.com wrote:
I am testing tesseract against some other commercial products and the
commercial products seem to blow tesseract out of the water in terms
of quality and accuracy. Is this because
Check this thread
https://groups.google.com/forum/?fromgroups#!topic/tesseract-ocr/TY_RIHOOyNM
Read about the psm switch and custom segmentation. These can likely help you.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Feb 14, 2012 at 11:42 AM, ReneFR rspr...@veloeco.fr wrote
Did you try the psm switch (look for it in the forum)? Your own
segmentation? Both combined?
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Feb 14, 2012 at 1:55 AM, John Williams jdwilliams1...@gmail.com wrote:
If I duplicate the column 9 times, so that there's ten columns
using Tesseract's classifier exclusively
with highest possible efficiency.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Tue, Nov 29, 2011 at 1:04 AM, daniel danieloberh...@googlemail.com wrote:
Ok, so I thought more on this. What I will end up with is segments of
possible various colors
, particularly his Two Geometric Algorithms
for Layout Analysis (2002), maybe also his Layout Analysis based on
Text Line Segment Hypotheses (2003.) So you can even implement these
approaches yourself using these articles.
HTH
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Fri, Nov 18, 2011 at 9:26
can happen when an image containing non-text
information is fed to Tesseract; in this case all kinds of errors can
arise.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Nov 16, 2011 at 3:03 AM, walter23 walte...@gmail.com wrote:
I'm getting a message where the inclusion of complex
where more than two colors are
involved. I would have to map the discovered segments to two colors,
which may even be impossible. And with contours even more so, as the
contours may not be closed...
On 12 Nov., 18:26, Dmitri Silaev daemons2...@gmail.com wrote:
If you're able to use OpenCV
If you're able to use OpenCV then, given a list of contours or blobs,
you should be able to reconstruct a binary image. This is a general
thought. To get more practical advice, send us your sample image(s).
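To illustrate the general thought without depending on OpenCV, here is a pure-Python sketch. It assumes each blob is available as a list of (row, col) pixel coordinates, which is a simplification; the function name is hypothetical, and real contours would need filling before this step.

```python
def blobs_to_binary(blobs, height, width):
    """Paint every blob pixel as 1 on a zero-filled height x width grid."""
    image = [[0] * width for _ in range(height)]
    for blob in blobs:
        for row, col in blob:
            # Ignore coordinates that fall outside the requested canvas.
            if 0 <= row < height and 0 <= col < width:
                image[row][col] = 1
    return image


if __name__ == "__main__":
    blobs = [[(0, 0), (1, 0)], [(2, 3)]]
    for line in blobs_to_binary(blobs, 3, 4):
        print("".join(str(pixel) for pixel in line))
```

Whether 1 should end up black or white when the grid is saved depends on what the OCR engine expects as foreground.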
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Sat, Nov 12, 2011 at 4:37 PM, daniel
start your search,
for example, from An Adaptive Binarization Technique for Low Quality
Historical Documents, by three Greek scientists, 2004, and similar
articles.
Warm regards,
Dmitri Silaev
www.CustomOCR.com
On Wed, Nov 9, 2011 at 2:49 PM, Esteban Bordón ebor...@gmail.com wrote:
Hi,
I send 2